Latest Documentation
This is the latest documentation for the Cloud Posse Reference Architecture. To determine which version you're currently using, please see Version Identification.

Decide on Seeding Strategy for Staging Environments

Problem

Longer-lived staging environments need a dataset that closely resembles production. If this dataset becomes stale, releases are no longer tested effectively before they reach production. Restoring snapshots from production is not recommended.

Considerations

  • The dataset should contain anonymized users with invalid (undeliverable) email addresses

  • The database must not contain any CHD (cardholder data), PHI (protected health information), or PII (personally identifiable information)

  • The scale of data should be close to the production database

  • Snapshots from production are dangerous if they are not anonymized or scrubbed (imagine the risk of sending emails to every customer from your staging environment)

  • Fixtures are not recommended (the scale of fixture data usually does not represent production)

  • We recommend including the DBA in these conversations.

  • QA teams want stable data so that they can run through their test scenarios
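As an illustration of the first two considerations, a refresh job could check the scrubbed dataset for deliverable email addresses before it is loaded into staging. The following is a minimal sketch, not a Cloud Posse tool: the function name `find_deliverable_emails`, the row-as-dict shape, and the convention that every scrubbed address lives under the reserved `.invalid` top-level domain are all assumptions made for this example.

```python
import re

# Loose pattern for anything that looks like an email address.
# Assumption for this sketch: it only needs to catch obvious addresses, not validate them.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+)")


def find_deliverable_emails(rows):
    """Return (row_index, column, address) for every email-like value
    whose domain is not under the reserved .invalid top-level domain."""
    findings = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # only text columns can hide an address
            for match in EMAIL_RE.finditer(value):
                domain = match.group(1).rstrip(".").lower()
                if not domain.endswith(".invalid"):
                    findings.append((index, column, match.group(0)))
    return findings


if __name__ == "__main__":
    # Hypothetical sample rows; a real job would stream rows from the staging copy.
    sample = [
        {"id": 1, "email": "user-1a2b@example.invalid"},
        {"id": 2, "email": "jane@corp.com", "note": "call bob@x.org"},
    ]
    for finding in find_deliverable_emails(sample):
        print(finding)
```

A check like this scans free-text columns as well, since real addresses often leak through notes and comment fields rather than the obvious `email` column. It does not detect other kinds of CHD, PHI, or PII; those need their own checks agreed with the DBA.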

Recommendations

Caution: Cloud Posse does not have a turnkey solution for seeding staging environments.

  • Use an ETL pipeline that scrubs the production data and refreshes the staging database weekly or monthly (e.g., AWS Glue or a scheduled GitHub Actions job)
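The scrub step of such a pipeline might look roughly like the sketch below. It is an illustration under stated assumptions, not a reference implementation: the column names (`email`, `full_name`, `phone`), the helper names `pseudonym` and `scrub_user`, and the choice of a keyed hash are all hypothetical. The keyed hash is used so that the same production user maps to the same anonymized user on every refresh, which is one way to give QA teams stable data, and the replacement addresses use the `.invalid` top-level domain, which is reserved and never resolves.

```python
import hashlib
import hmac


def pseudonym(value: str, secret: bytes) -> str:
    """Derive a stable token from a production value using a keyed hash.
    The same value and secret always give the same token."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub_user(row: dict, secret: bytes) -> dict:
    """Return a copy of a user row with identifying fields replaced.
    Column names here are assumptions; adapt them to the real schema."""
    token = pseudonym(row["email"].strip().lower(), secret)
    scrubbed = dict(row)  # leave the input row untouched
    scrubbed["email"] = f"user-{token}@example.invalid"  # undeliverable address
    scrubbed["full_name"] = f"User {token[:8]}"
    scrubbed["phone"] = None  # dropped rather than transformed
    return scrubbed


if __name__ == "__main__":
    # Hypothetical secret; in practice it would come from a secrets manager
    # and never be stored alongside the staging data.
    secret = b"rotate-me"
    production_row = {
        "id": 42,
        "email": "Jane.Doe@corp.com",
        "full_name": "Jane Doe",
        "phone": "+1 555 0100",
    }
    print(scrub_user(production_row, secret))
```

Whether keyed pseudonyms are sufficient for a given compliance regime is a question for the DBA and security team. Free-text columns, attachments, and audit logs need separate handling that this sketch does not cover.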