A really nice new RDS feature is the ability to export a snapshot of your data to S3 in Parquet format. We have a use case where we want to get insight into some of the data in our RDS database, and a nice, cheap, risk-free way of doing this is to dump the data to S3, use Glue to catalog it, Athena to query it and QuickSight to give some nice visualisations. I'll show the steps we took to dump the data out using the AWS CLI.
Create the snapshot of the RDS instance
aws rds create-db-snapshot --db-instance-identifier testing --db-snapshot-identifier test-snapshot
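The snapshot has to reach the available state before it can be exported, so it can help to block until it is ready. Here is a minimal sketch of that, using the CLI's built-in db-snapshot-available waiter. The function name snapshot_and_wait is hypothetical and only there for illustration; the identifiers are the ones used in this post.

```shell
# Hypothetical helper (not part of the AWS CLI): create a snapshot,
# then block until RDS reports it as available.
snapshot_and_wait() {
  instance_id="$1"
  snapshot_id="$2"

  # Kick off the snapshot; stop here if the request itself fails.
  aws rds create-db-snapshot \
    --db-instance-identifier "$instance_id" \
    --db-snapshot-identifier "$snapshot_id" || return 1

  # Built-in waiter: polls the snapshot until its status is "available".
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$snapshot_id"
}
```

With the names from this post, that would be called as snapshot_and_wait testing test-snapshot.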
Export the snapshot to S3
aws rds start-export-task \
  --export-task-identifier test-snapshot \
  --source-arn arn:aws:rds:eu-west-1:123456789:snapshot:test-snapshot \
  --s3-bucket-name testbucket \
  --iam-role-arn arn:aws:iam::123456789:role/rds-export-s3-role \
  --kms-key-id keyarn

{
    "ExportTaskIdentifier": "test-snapshot",
    "SourceArn": "arn:aws:rds:eu-west-1:123456789:snapshot:test-snapshot",
    "SnapshotTime": "2020-07-13T12:26:28.870000+00:00",
    "S3Bucket": "dataplatform-mcc-input-686794321847",
    "IamRoleArn": "arn:aws:iam::686794321847:role/cdl/dbateam/dba-rds-export-s3-role",
    "KmsKeyId": "arn:aws:kms:eu-west:686794321847:key/da631cd0-cfc1-42c5-b2f7-0f35509d850d",
    "Status": "STARTING",
    "PercentProgress": 0,
    "TotalExtractedDataInGB": 0
}
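The export runs asynchronously, so the STARTING status above only tells you the task was accepted. Below is a rough sketch of polling it with aws rds describe-export-tasks until it finishes. The function name wait_for_export and the POLL_SECONDS variable are hypothetical, and the terminal status names (COMPLETE, FAILED, CANCELED) are an assumption based on the uppercase style of the STARTING value in the output, so check them against what your CLI actually returns.

```shell
# Hypothetical helper: poll an export task until it reaches a terminal state.
# POLL_SECONDS is an assumed knob for this sketch, not an AWS setting.
wait_for_export() {
  task_id="$1"
  while true; do
    # Pull only the Status field of the matching task as plain text.
    status=$(aws rds describe-export-tasks \
      --export-task-identifier "$task_id" \
      --query 'ExportTasks[0].Status' \
      --output text) || return 1
    echo "export $task_id: $status"

    # Assumed terminal states; anything else means "keep waiting".
    case "$status" in
      COMPLETE) return 0 ;;
      FAILED|CANCELED) return 1 ;;
    esac
    sleep "${POLL_SECONDS:-60}"
  done
}
```

For the task started above, this would be wait_for_export test-snapshot, which returns zero once the data has landed in the bucket and non-zero if the export failed or was cancelled.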