How to Set Up CDC from Aurora PostgreSQL to S3 for Compliance?

Asked By CuriousCoder92 On

I'm looking for a way to capture every INSERT, UPDATE, and DELETE from my Aurora PostgreSQL database directly into S3 in Parquet format. This is for compliance and historical analytics, essentially a Slowly Changing Dimension (SCD) Type 2 history for all tables. AWS Database Migration Service (DMS) with Change Data Capture (CDC) looks like a good fit, since it accepts wildcard patterns and so picks up tables automatically without per-table configuration. However, DMS is usually seen as a tool for one-time migrations, and I'm worried it might not suit long-term continuous operation. Is there a built-in AWS solution for this? I'd like to avoid writing custom code for each table, and to avoid atomicity issues in the services that interact with the database.
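For illustration, the wildcard capture mentioned above could be expressed as a DMS table-mapping document with a single selection rule, where `%` matches any schema or table name. This is a minimal sketch, not a setup taken from the thread: the helper name `build_wildcard_table_mappings` and the rule name are hypothetical, and the schema `public` in the usage note is an assumption.

```python
import json


def build_wildcard_table_mappings(schema_pattern="%", table_pattern="%"):
    """Build a DMS table-mapping document with one wildcard selection rule.

    '%' is the DMS wildcard, so the defaults include every table in
    every schema. The rule name is an arbitrary label.
    """
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all-tables",
                "object-locator": {
                    "schema-name": schema_pattern,
                    "table-name": table_pattern,
                },
                "rule-action": "include",
            }
        ]
    }


def to_table_mappings_json(mappings):
    """Serialize the mapping; DMS expects the document as a JSON string."""
    return json.dumps(mappings)
```

Calling `to_table_mappings_json(build_wildcard_table_mappings("public"))` yields a string that can be supplied as the table mappings of a replication task, so tables added to that schema later match the rule without further configuration.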

1 Answer

Answered By DBGuru77 On

We use DMS for ongoing replication from Aurora to S3 in Parquet format, and it has worked well with very few issues along the way. It's definitely designed with CDC in mind, not only one-time migrations.
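The answer does not include its configuration, so the following is only a sketch of what such a setup might look like with boto3, under stated assumptions. The helper names, the bucket folder `cdc`, the timestamp column name `dms_commit_ts`, and the choice of `full-load-and-cdc` are all hypothetical; the dictionary keys follow the `S3Settings` and `create_replication_task` parameters of the boto3 DMS client. The functions only build the request parameters, so nothing here calls AWS.

```python
import json


def build_s3_parquet_settings(bucket_name, role_arn, bucket_folder="cdc"):
    """Build S3Settings for a DMS target endpoint that writes Parquet."""
    return {
        "BucketName": bucket_name,
        "BucketFolder": bucket_folder,          # assumed prefix
        "ServiceAccessRoleArn": role_arn,       # role DMS assumes to write to S3
        "DataFormat": "parquet",
        "ParquetVersion": "parquet-2-0",
        "CompressionType": "gzip",
        "TimestampColumnName": "dms_commit_ts",  # assumed column name
        "IncludeOpForFullLoad": True,            # tag full-load rows as inserts
    }


def build_cdc_task_params(task_id, source_arn, target_arn, instance_arn,
                          table_mappings):
    """Build keyword arguments for create_replication_task."""
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Initial snapshot followed by ongoing change capture.
        "MigrationType": "full-load-and-cdc",
        # DMS expects the mapping document as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


# With credentials configured, the parameters could be used roughly like:
#   dms = boto3.client("dms")
#   dms.create_endpoint(EndpointIdentifier="s3-parquet-target",
#                       EndpointType="target", EngineName="s3",
#                       S3Settings=build_s3_parquet_settings(...))
#   dms.create_replication_task(**build_cdc_task_params(...))
```

On the source side, Aurora PostgreSQL needs logical replication enabled in the cluster parameter group (`rds.logical_replication = 1`) before DMS can read changes. In the CDC output, DMS adds an operation column marking each row as an insert, update, or delete, which is the raw material for building SCD Type 2 history downstream.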

DataDynamo16 -

That sounds just like what I need! Would you be able to share how you set it up? Also, do you partition the Parquet files by date? Is it reliant on Lambda like many AWS setups?
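On the date-partitioning question, the thread does not say what DBGuru77 used, but DMS S3 targets do expose date-based folder partitioning as endpoint settings, and DMS writes to S3 itself, so no Lambda function is required for that part. A minimal sketch follows; the helper `add_date_partitioning` is hypothetical, while the keys and allowed values follow the `S3Settings` parameters of the boto3 DMS client.

```python
ALLOWED_SEQUENCES = {"YYYYMMDD", "YYYYMMDDHH", "YYYYMM", "MMYYYYDD", "DDMMYYYY"}
ALLOWED_DELIMITERS = {"SLASH", "UNDERSCORE", "DASH", "NONE"}


def add_date_partitioning(s3_settings, sequence="YYYYMMDD", delimiter="SLASH"):
    """Return a copy of S3Settings with date partitioning switched on.

    DMS then groups output under folders derived from the transaction
    commit date, using the given sequence and delimiter.
    """
    if sequence not in ALLOWED_SEQUENCES:
        raise ValueError(f"unsupported DatePartitionSequence: {sequence}")
    if delimiter not in ALLOWED_DELIMITERS:
        raise ValueError(f"unsupported DatePartitionDelimiter: {delimiter}")

    updated = dict(s3_settings)  # leave the caller's dict untouched
    updated["DatePartitionEnabled"] = True
    updated["DatePartitionSequence"] = sequence
    updated["DatePartitionDelimiter"] = delimiter
    return updated
```

For example, `add_date_partitioning({"BucketName": "my-bucket", "DataFormat": "parquet"})` returns settings that ask DMS for year, month, and day folders separated by slashes.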
