What’s the quickest way to create a Postgres Aurora instance with obfuscated production data?

Asked By TechSavvy123 On

I'm looking to speed up our process for creating a Postgres Aurora instance that uses obfuscated production data. Currently, we take full production snapshots, which include unnecessary data and empty space. The obfuscation job then restores these snapshots, scrubs sensitive information with SQL updates, and creates a new snapshot for use across dev and QA environments.
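The scrub step described above (restore, then SQL updates, then re-snapshot) might look roughly like this. This is a minimal sketch: the host, database, table, and column names are placeholders, not your real schema, and the point is just that running the updates in one transaction with `ON_ERROR_STOP` keeps a half-scrubbed database from ever being snapshotted.

```shell
#!/usr/bin/env bash
# Sketch of the obfuscation step: run masking UPDATEs in a single
# transaction on the restored cluster. All names below are placeholders.
set -euo pipefail

OBFUSCATION_HOST="${OBFUSCATION_HOST:-obfuscation-cluster.example.internal}"

scrub() {
  psql "host=$OBFUSCATION_HOST dbname=appdb" --set ON_ERROR_STOP=1 <<'SQL'
BEGIN;
-- Replace PII with deterministic fakes so joins across tables still line up.
UPDATE users
   SET email     = 'user' || id || '@example.com',
       full_name = 'User ' || id;
UPDATE payment_methods
   SET card_last4 = '0000';
COMMIT;
SQL
}

# Only touch a database when explicitly requested.
if [[ "${RUN_SCRUB:-0}" == "1" ]]; then
  scrub
fi
```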

Since it's a monolithic database, I'm weighing two ideas to speed things up: using `pg_dump` instead of the full snapshot procedure, or running `VACUUM FULL` to shrink the obfuscation cluster's storage before taking the final snapshot. For reference: a compressed `pg_dump` is around 15 GB, while RDS snapshots range from 200 to 500 GB, and restoring a snapshot onto a Graviton RDS instance takes at least an hour (it's quicker on Aurora Serverless v2).
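For the `pg_dump` idea, a sketch of the dump-and-restore path looks something like the following. Hostnames and the database name are assumptions; the relevant knobs are `-Fc` (custom format, compressed, which is where the ~15 GB figure comes from) and `-j` on the restore side, since rebuilding indexes in parallel is usually where most of the restore time goes.

```shell
#!/usr/bin/env bash
# Sketch: dump the already-scrubbed database in compressed custom format,
# then restore it in parallel into a dev/QA target. Names are placeholders.
set -euo pipefail

SOURCE_HOST="obfuscation-cluster.example.internal"
TARGET_HOST="dev-cluster.example.internal"
DB="appdb"
DUMP_FILE="appdb_$(date +%Y%m%d).dump"

dump_db() {
  # -Fc = custom format (compressed); -Z sets the compression level
  pg_dump -h "$SOURCE_HOST" -d "$DB" -Fc -Z 6 -f "$DUMP_FILE"
}

restore_db() {
  # --clean/--if-exists drops and recreates objects; -j runs restore
  # jobs in parallel, which matters most for index rebuilds
  pg_restore -h "$TARGET_HOST" -d "$DB" --clean --if-exists -j 4 "$DUMP_FILE"
}

if [[ "${RUN_TRANSFER:-0}" == "1" ]]; then
  dump_db
  restore_db
fi
```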

So my question is: is `pg_dump` the better route for a faster restore, or should I focus on optimizing the obfuscation process and shrinking the snapshot to a more manageable size, say 50 GB? To be clear, I'd rather not split the database into microservices unless there's no other option.

4 Answers

Answered By DataGuru84 On

You might want to try creating a read replica, promoting it, running your obfuscation process, and then creating a `pg_dump` snapshot. This could be a way to streamline the process.
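Assuming a plain RDS read replica (an Aurora replica is promoted with `promote-read-replica-db-cluster` instead), the promote step of this suggestion might be sketched like so; the replica identifier is a placeholder.

```shell
#!/usr/bin/env bash
# Sketch of the replica-promote flow: promote the replica, then wait
# for it to become writable before running the obfuscation SQL on it.
set -euo pipefail

REPLICA_ID="obfuscation-replica"

promote_and_wait() {
  aws rds promote-read-replica --db-instance-identifier "$REPLICA_ID"
  aws rds wait db-instance-available --db-instance-identifier "$REPLICA_ID"
}

if [[ "${RUN_PROMOTE:-0}" == "1" ]]; then
  promote_and_wait
  # ...then run the scrub SQL here and take the pg_dump from the
  # promoted instance, leaving production untouched throughout
fi
```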

Answered By CloudWizard99 On

If time is a factor, consider automating your process to run on a schedule. Restoring to a quicker RDS instance could help speed things up too, and once done, you can switch back to your needed configuration.

Answered By AutomationNinja On

I suggest sticking with native snapshots and automating the workflow using AWS Lambda and Step Functions. This way, you avoid manual interventions, and honestly, a few hours isn’t a bad trade-off for non-production environments.
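The snapshot-and-share step that a Lambda or Step Functions workflow would drive can be sketched with the AWS CLI as below. The cluster identifier and dev account ID are placeholders; the idea is that once the scrubbed snapshot exists, sharing it via the `restore` attribute lets the dev/QA account restore it directly.

```shell
#!/usr/bin/env bash
# Sketch: snapshot the scrubbed cluster, wait for it, and share it
# with the lower-environment account. Identifiers are placeholders.
set -euo pipefail

CLUSTER_ID="obfuscation-cluster"
SNAPSHOT_ID="obfuscated-$(date +%Y%m%d)"
DEV_ACCOUNT_ID="123456789012"

snapshot_and_share() {
  aws rds create-db-cluster-snapshot \
    --db-cluster-identifier "$CLUSTER_ID" \
    --db-cluster-snapshot-identifier "$SNAPSHOT_ID"
  aws rds wait db-cluster-snapshot-available \
    --db-cluster-snapshot-identifier "$SNAPSHOT_ID"
  # Make the scrubbed snapshot restorable from the dev/QA account
  aws rds modify-db-cluster-snapshot-attribute \
    --db-cluster-snapshot-identifier "$SNAPSHOT_ID" \
    --attribute-name restore \
    --values-to-add "$DEV_ACCOUNT_ID"
}

if [[ "${RUN_SNAPSHOT:-0}" == "1" ]]; then
  snapshot_and_share
fi
```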

Answered By DevOpsDynamo On

Have you considered using AWS Database Migration Service? It allows for obfuscation while feeding into your lower environment RDS endpoints, which could simplify your workflow.
