I would like to grab a file straight of the Internet and stick it into an S3 bucket to then copy it over to a PIG cluster. Due to the size of the file and my not so good internet connection downloading the file first onto my PC and then uploading it to Amazon might not be an option.
Is there any way I could go about grabbing a file of the internet and sticking it directly into S3?
Currently, there is no way of uploading objects directly to S3 Glacier using a Snowball Edge. Thus, you first have to upload your objects into S3 Standard, and then use S3 lifecycle policies to transition the files to S3 Glacier.
In the Amazon S3 console, choose the bucket where you want to upload an object, choose Upload, and then choose Add Files. In the file selection dialog box, find the file that you want to upload, choose it, choose Open, and then choose Start Upload. You can watch the progress of the upload in the Transfer pane.
the main purpose of this code block is to generate pdf file from html and upload it to s3 directly. the file will not be stored in your directory.
Download the data via curl
and pipe the contents straight to S3. The data is streamed directly to S3 and not stored locally, avoiding any memory issues.
curl "https://download-link-address/" | aws s3 cp - s3://aws-bucket/data-file
As suggested above, if download speed is too slow on your local computer, launch an EC2 instance, ssh
in and execute the above command there.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With