I need to copy 92 million objects from bucket A to bucket B in the same AWS region. I know AWS can take up to 48 hours to generate an S3 Inventory report, so ... I'm wondering how long it takes to read a manifest of 92 million objects and copy them to another bucket. My objects have an average size of 512 KB.
One option is to use S3DistCp on Amazon EMR, which issues many copy commands in parallel from a Hadoop cluster.
This involves a fair bit of overhead, since you have to launch and pay for an Amazon EMR cluster, but once it is running it can copy the objects quite rapidly because the API copy requests are spread across the cluster's nodes.
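To put "quite rapidly" into rough numbers, here is a back-of-the-envelope sketch in Python. The object count and average size come from the question; the sustained copy rate of 1,000 requests per second is purely an assumed figure for illustration (not a measured or documented value), and 512 KB is taken to mean 512 × 1024 bytes.

```python
# Rough estimate only. ASSUMED_COPIES_PER_SECOND is a hypothetical figure;
# measure the rate you actually achieve and substitute it here.
OBJECT_COUNT = 92_000_000            # from the question
AVG_OBJECT_BYTES = 512 * 1024        # 512 KB, assuming 1 KB = 1024 bytes
ASSUMED_COPIES_PER_SECOND = 1_000    # hypothetical aggregate copy rate

total_bytes = OBJECT_COUNT * AVG_OBJECT_BYTES
total_tb = total_bytes / 1e12        # decimal terabytes

seconds = OBJECT_COUNT / ASSUMED_COPIES_PER_SECOND
hours = seconds / 3600

print(f"Total data: about {total_tb:.1f} TB")
print(f"Copy time at {ASSUMED_COPIES_PER_SECOND} copies/s: about {hours:.1f} hours")
```

Because the objects are small, the total time depends mostly on how many copy requests per second you can sustain rather than on raw bandwidth, so doubling the achieved request rate roughly halves the estimate.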
If you are going to initiate the copy yourself, you could do something similar: issue many copy requests in parallel rather than looping through the list sequentially.
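A minimal sketch of that do-it-yourself approach, assuming Python with boto3 and a thread pool. The helper names (`make_s3_copier`, `copy_in_parallel`, `batched`), the worker count, and the batch size are all hypothetical choices for illustration, and reading the keys out of the inventory manifest is left out. `copy_object` performs a server-side copy, so the object data does not pass through your machine.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def make_s3_copier(src_bucket: str, dst_bucket: str) -> Callable[[str], None]:
    """Return a function that server-side copies one key between buckets."""
    import boto3  # imported here so the rest of the sketch runs without boto3

    s3 = boto3.client("s3")  # a single client can be shared across threads

    def copy_one(key: str) -> None:
        s3.copy_object(
            Bucket=dst_bucket,
            Key=key,
            CopySource={"Bucket": src_bucket, "Key": key},
        )

    return copy_one


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` items so 92M futures never exist at once."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_in_parallel(
    keys: Iterable[str],
    copy_one: Callable[[str], None],
    max_workers: int = 64,      # assumed value; tune for your environment
    batch_size: int = 10_000,   # assumed value; bounds memory use
) -> List[Tuple[str, Exception]]:
    """Copy every key using a thread pool; return (key, error) for failures."""
    failed: List[Tuple[str, Exception]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for chunk in batched(keys, batch_size):
            futures = {pool.submit(copy_one, key): key for key in chunk}
            for future in as_completed(futures):
                try:
                    future.result()
                except Exception as exc:  # record the failure and keep going
                    failed.append((futures[future], exc))
    return failed
```

Usage would look something like `failed = copy_in_parallel(keys, make_s3_copier("bucket-a", "bucket-b"))`, where `keys` is any iterable of object keys and the bucket names are placeholders. Retrying the keys returned in `failed` is left to the caller.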