Hey everyone! I get that it's often not recommended to use production data for development due to security and legal reasons, but sometimes it's necessary to ensure everything works correctly. I'm looking for the quickest way to replicate a CloudNativePG production database cluster into a development cluster so I can test with real data. Are there any tools or methods that can simplify this process?
2 Answers
What you want is a staging environment built specifically for this kind of testing. It can hold production data (though that's not the best idea), anonymized data restored from backups, or a large set of fake data. At previous companies, we had scripts for anonymizing data and strict policies on what could be shared. You really should have a strong justification for needing production data; consider whether there's something you can generate instead.
Honestly, you shouldn't need production data at all. It's better practice for developers to create mock data or use stubs that simulate client scenarios. If your team doesn't have a solid testing policy in place, it could lead to issues down the line. It's crucial to establish those guidelines sooner rather than later!
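To make the mock-data suggestion concrete, here is a minimal sketch of a seeded generator for fake rows. The table shape (`id`, `full_name`, `email`, `balance_cents`) and the function name `make_fake_customers` are hypothetical, chosen only for illustration; adapt them to your own schema.

```python
import random

# Hypothetical name pools; any small lists work for illustration.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton"]


def make_fake_customers(count: int, seed: int = 0) -> list[dict]:
    """Generate `count` fake customer rows; the same seed gives the same rows."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    rows = []
    for i in range(1, count + 1):
        rows.append({
            "id": i,
            "full_name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            # .invalid is a reserved TLD, so these addresses can never receive mail
            "email": f"user{i}@example.invalid",
            "balance_cents": rng.randint(0, 1_000_000),
        })
    return rows


if __name__ == "__main__":
    for row in make_fake_customers(3):
        print(row)
```

Seeding the generator keeps test runs reproducible, which helps when a failing test depends on a specific row.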
Could you share how to anonymize the data or any resources to help with that?
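One common approach is to replace sensitive values with keyed hashes, so the same input always maps to the same token and joins between tables still line up. Below is a rough sketch of that idea, not a complete anonymization policy. The column names (`email`, `full_name`), the key, and the helper names are all assumptions for illustration; which columns count as sensitive depends on your schema and legal requirements.

```python
import hashlib
import hmac

# Hypothetical key: in practice, load it from a secret store and never copy it to dev.
SECRET_KEY = b"replace-me"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for `value` using HMAC-SHA256, truncated to 16 hex chars."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize_row(row: dict) -> dict:
    """Return a copy of `row` with the (hypothetical) sensitive columns masked."""
    masked = dict(row)
    masked["email"] = pseudonymize(row["email"]) + "@example.invalid"
    masked["full_name"] = "user_" + pseudonymize(row["full_name"])[:8]
    return masked


if __name__ == "__main__":
    row = {"id": 42, "email": "alice@corp.example", "full_name": "Alice Smith"}
    print(anonymize_row(row))
```

You would run something like this over an export or a restored backup before loading it into the development cluster. Note that pseudonymizing direct identifiers alone may not be enough: free-text fields, timestamps, and rare combinations of values can still identify people.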