PostgreSQL RDS snapshot to S3

A really nice new feature of RDS is the ability to export a snapshot of your data to S3, where it is stored in Parquet format. We have a use case where we want more insight into some of the data in our RDS database, and a nice, cheap, risk-free way of getting it is to dump the data to S3, use Glue to catalog it, Athena to query it and QuickSight to give some nice visualisations. I’ll show the steps we took to dump the data out using the AWS CLI.

Create the snapshot of the RDS instance

aws rds create-db-snapshot --db-instance-identifier testing --db-snapshot-identifier test-snapshot
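
The export step needs the ARN of the snapshot, and the snapshot has to reach the available state before it can be exported. Here is a minimal sketch of how that could be scripted. The helper names snapshot_arn and wait_for_snapshot are my own (hypothetical, not part of the AWS CLI), and the region and account ID are the same placeholders used in this post.

```shell
#!/usr/bin/env bash
# Sketch only: snapshot_arn and wait_for_snapshot are hypothetical helper names.

# Build the ARN of a manual DB snapshot from region, account ID and snapshot ID.
snapshot_arn() {
  local region="$1" account_id="$2" snapshot_id="$3"
  printf 'arn:aws:rds:%s:%s:snapshot:%s\n' "$region" "$account_id" "$snapshot_id"
}

# Block until the snapshot is available, using the AWS CLI's built-in waiter.
wait_for_snapshot() {
  local snapshot_id="$1"
  aws rds wait db-snapshot-available --db-snapshot-identifier "$snapshot_id"
}
```

Running wait_for_snapshot test-snapshot followed by snapshot_arn eu-west-1 123456789 test-snapshot would print arn:aws:rds:eu-west-1:123456789:snapshot:test-snapshot, which is the value passed to --source-arn.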

Export the snapshot to S3

aws rds start-export-task --export-task-identifier test-snapshot --source-arn arn:aws:rds:eu-west-1:123456789:snapshot:test-snapshot --s3-bucket-name testbucket --iam-role-arn arn:aws:iam::123456789:role/rds-export-s3-role --kms-key-id keyarn

{
  "ExportTaskIdentifier": "test-snapshot",
  "SourceArn": "arn:aws:rds:eu-west-1:123456789:snapshot:test-snapshot",
  "SnapshotTime": "2020-07-13T12:26:28.870000+00:00",
  "S3Bucket": "dataplatform-mcc-input-686794321847",
  "IamRoleArn": "arn:aws:iam::686794321847:role/cdl/dbateam/dba-rds-export-s3-role",
  "KmsKeyId": "arn:aws:kms:eu-west:686794321847:key/da631cd0-cfc1-42c5-b2f7-0f35509d850d",
  "Status": "STARTING",
  "PercentProgress": 0,
  "TotalExtractedDataInGB": 0
}
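
The task comes back with a status of STARTING, so the export runs in the background and you have to check on it. Below is a rough sketch of polling it with aws rds describe-export-tasks. The helper names export_status and wait_for_export and the 60 second default interval are my own assumptions, and I am assuming the terminal statuses are COMPLETE, FAILED and CANCELED.

```shell
#!/usr/bin/env bash
# Sketch only: export_status and wait_for_export are hypothetical helper names.

# Print the current status of an export task (e.g. STARTING, IN_PROGRESS).
export_status() {
  aws rds describe-export-tasks \
    --export-task-identifier "$1" \
    --query 'ExportTasks[0].Status' \
    --output text
}

# Poll until the task finishes: returns 0 on COMPLETE, 1 on FAILED or CANCELED.
wait_for_export() {
  local task_id="$1"
  local interval="${2:-60}"   # seconds between polls (assumed default)
  local status
  while true; do
    status="$(export_status "$task_id")"
    echo "export $task_id: $status"
    case "$status" in
      COMPLETE) return 0 ;;
      FAILED|CANCELED) return 1 ;;
    esac
    sleep "$interval"
  done
}
```

For the task above that would be wait_for_export test-snapshot, after which the Parquet files should be sitting in the bucket ready for Glue to crawl.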

For the Love of Data
