Hey everyone, I'm trying to figure out the best way to set up a database environment for my software team. I want the developers to be able to work with production data in a dev environment, with each developer having their own database instance. I've thought about creating a custom database image, but I keep reading that it's not recommended. I'm leaning towards implementing daily backups of the database and using Docker to restore these in each developer's environment, but I'm struggling with the setup since no one on the team is familiar with shell scripts. Does anyone have alternative strategies for doing this effectively?
4 Answers
I really advise against working directly on production data. It's risky! You might want to consider setting up a separate test server that uses a copy of the production database instead. That way, it minimizes the risk of data theft and ensures that developers aren’t messing with actual production data. Plus, have you thought about using mocked or anonymized data? It could be a better option to avoid these concerns altogether.
Local Docker and restoring your production database sounds like the way to go. Just point to a backup file in your Docker Compose and run it during 'docker-compose up.' If your production database is too large, consider using a subset of its data for testing. Be cautious, though; sensitive data like emails and passwords shouldn’t be accessible in a test environment even if you think it’s safe.
Honestly, you probably don’t need a full local environment. Use a controlled test engine and run integration tests with a minimal dataset during your build phase. You could maybe implement in-memory databases or test containers to speed things up. If a common dev environment is needed later, then ensure you have a separate database for that. Consider running a pipeline to refresh your test data regularly to keep it up to date.
In my previous jobs, we always created separate databases for development where we imported production data. This kept a clear boundary between production and dev. Sometimes, we set up local databases with just the data needed for specific tasks, automating the import process when possible. Just make sure you’re complying with legal and company regulations regarding data handling.
That makes sense! Can you clarify what you mean by creating special databases? I’d love to hear more about your process.

For this project, we don’t really have issues with data privacy, so having access to some form of production data would actually be beneficial. We're trying to find a way to implement Docker since this could help us avoid conflicts between developers. I get that a local SQL server is an option, but we're looking for other solutions too.