I want to write a Lambda function that retrieves an S3 object, downloads it to the function's /tmp folder, and then runs crypto.createHash(algorithm) on it. But I need a solution for objects over 500 MB, since that's the Lambda ephemeral storage limit. Is there any workaround for this? Also, if the object I am retrieving is in the Glacier storage class (per the bucket policy), how do I download it using Lambda? Would I need one Lambda for the retrieval and another for the download? Any help is appreciated, thanks!
Update: You can now provision up to 10GB of ephemeral storage with your Lambda functions.
You should be able to stream the contents of the S3 object directly into your hash algorithm without ever writing it to the /tmp folder. You shouldn't need to touch the local disk at all.
Regarding objects stored in Glacier: since a restore can take hours, yes, you would need two steps. One function invocation triggers the restore, and a second invocation computes the hash once the object has been restored.
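To sketch the first step: in the AWS SDK v3 this is the `RestoreObjectCommand`, whose input takes a `RestoreRequest` with the number of days to keep the restored copy and a retrieval tier. The helper below just builds those parameters; the bucket, key, and tier values shown are assumptions, not anything from the question.

```javascript
// Build the input for an S3 RestoreObjectCommand (AWS SDK for JavaScript v3).
// The caller's bucket/key and the chosen Days/Tier are placeholders here.
function buildRestoreParams(bucket, key, days = 1, tier = "Standard") {
  return {
    Bucket: bucket,
    Key: key,
    RestoreRequest: {
      Days: days, // how long the restored copy stays readable
      GlacierJobParameters: { Tier: tier }, // "Expedited" | "Standard" | "Bulk"
    },
  };
}

// Hypothetical usage in the first Lambda invocation:
//
// const { S3Client, RestoreObjectCommand } = require("@aws-sdk/client-s3");
// const s3 = new S3Client({});
// await s3.send(
//   new RestoreObjectCommand(buildRestoreParams("my-bucket", "archived.bin"))
// );
//
// The second invocation (e.g. triggered later, or on the
// s3:ObjectRestore:Completed event) can then GetObject and hash as usual.
```

Checking the object's `x-amz-restore` header via `HeadObject` is one way for the second invocation to confirm the restore has finished before downloading.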