I'm currently making use of a node.js plugin called s3-upload-stream to stream very large files to Amazon S3. It uses the multipart API and for the most part it works very well.
However, this module is showing its age and I've already had to make modifications to it (the author has deprecated it as well). Today I ran into another issue with Amazon, and I would really like to take the author's recommendation and start using the official aws-sdk to accomplish my uploads.
BUT.
The official SDK does not seem to support piping to s3.upload(). The nature of s3.upload is that you have to pass the readable stream as an argument (the Body parameter) to the upload call.
I have roughly 120+ user code modules that do various file processing, and they are agnostic to the final destination of their output. The engine hands them a pipeable writeable output stream, and they pipe to it. I cannot hand them an AWS.S3 object and ask them to call upload() on it without adding code to all the modules. The reason I used s3-upload-stream was because it supported piping.
Is there a way to make aws-sdk s3.upload() something I can pipe the stream to?
For context, s3-upload-stream exposes client.upload(destination), which creates an upload stream that uploads to the specified destination. The upload stream is returned immediately. The destination argument is an object in which you can specify the destination properties enumerated in the AWS S3 documentation.
Wrap the S3 upload() function with the node.js stream.PassThrough() stream.
Here's an example:
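const stream = require('stream');

// BUCKET and KEY are placeholders for your own bucket name and object key.
function uploadFromStream(s3) {
  // Data written to `pass` is read by s3.upload() as the object body.
  const pass = new stream.PassThrough();
  const params = { Bucket: BUCKET, Key: KEY, Body: pass };
  s3.upload(params, function (err, data) {
    console.log(err, data);
  });
  return pass;
}

inputStream.pipe(uploadFromStream(s3));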
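One gap in the snippet above is that a failed upload is only logged, so the module piping into the stream never finds out. Here is a minimal sketch of one way to forward the failure, assuming the v2 aws-sdk upload(params, callback) signature used above. The names createS3WriteStream and onDone are hypothetical, and the client is passed in as an argument so the helper can be exercised with a stub.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: returns a writable stream whose contents are uploaded
// through the given client. `s3` is assumed to expose the v2-style
// upload(params, callback) method; `onDone` is an optional completion callback.
function createS3WriteStream(s3, { Bucket, Key }, onDone = () => {}) {
  const pass = new PassThrough();
  s3.upload({ Bucket, Key, Body: pass }, (err, data) => {
    if (err) {
      // Surface the failure on the stream so upstream pipes stop writing.
      pass.destroy(err);
    }
    onDone(err, data);
  });
  return pass;
}
```

Because destroy(err) emits an 'error' event, whoever receives the stream should attach an 'error' listener (or use stream.pipeline), otherwise Node treats the error as unhandled.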
This answer is a bit late, but hopefully it helps someone else. You can return both the writeable stream and a promise, so you can get the response data when the upload finishes.
const AWS = require('aws-sdk');
const stream = require('stream');

const uploadStream = ({ Bucket, Key }) => {
  const s3 = new AWS.S3();
  const pass = new stream.PassThrough();
  return {
    writeStream: pass,
    promise: s3.upload({ Bucket, Key, Body: pass }).promise(),
  };
};
And you can use the function as follows:
const fs = require('fs');

const { writeStream, promise } = uploadStream({ Bucket: 'yourbucket', Key: 'yourfile.mp4' });
const readStream = fs.createReadStream('/path/to/yourfile.mp4');
const pipeline = readStream.pipe(writeStream);
Now you can either check the promise:
promise.then(() => {
  console.log('upload completed successfully');
}).catch((err) => {
  console.log('upload failed.', err.message);
});
Or use async/await (inside an async function):
try {
  await promise;
  console.log('upload completed successfully');
} catch (error) {
  console.log('upload failed.', error.message);
}
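Or, since stream.pipe() returns the destination stream.Writable (the writeStream variable above), allowing for a chain of pipes, we can also use its events. Note that these fire when the data has been handed to the PassThrough, not when S3 has confirmed the upload, so the promise remains the reliable completion signal:
pipeline.on('close', () => {
  console.log('all data written to the upload stream');
});
pipeline.on('error', (err) => {
  console.log('stream failed', err.message);
});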
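If you want a single thing to await, one option is to wait for both the local pipe and the upload promise. This is a rough sketch, assuming Node 15+ (for stream/promises) and the { writeStream, promise } pair returned by uploadStream above. The name pipeToUpload is hypothetical.

```javascript
const { pipeline } = require('stream/promises');

// Hypothetical helper: resolves with the upload result only after the source
// has been fully piped AND the upload promise has resolved.
async function pipeToUpload(readStream, { writeStream, promise }) {
  // If the upload fails, tear down the pipe so the source stops producing.
  const uploaded = promise.catch((err) => {
    writeStream.destroy(err);
    throw err;
  });
  const [, result] = await Promise.all([
    pipeline(readStream, writeStream),
    uploaded,
  ]);
  return result;
}
```

It could be called as `const data = await pipeToUpload(fs.createReadStream(path), uploadStream({ Bucket, Key }));`, where a rejection means either the read side or the upload failed.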
In the accepted answer, the function ends before the upload is complete, and thus, it's incorrect. The code below pipes correctly from a readable stream.
Upload reference
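// s3, bucket, key and getSomeReadableStream() are placeholders for your own setup.
async function uploadReadableStream(stream) {
  const params = { Bucket: bucket, Key: key, Body: stream };
  // The returned promise settles only once the whole upload has finished.
  return s3.upload(params).promise();
}

async function upload() {
  const readable = getSomeReadableStream();
  const results = await uploadReadableStream(readable);
  console.log('upload complete', results);
}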
You can also go a step further and output progress info using ManagedUpload as such:
const manager = s3.upload(params);
manager.on('httpUploadProgress', (progress) => {
  console.log('progress', progress); // { loaded: 4915, total: 192915, part: 1, key: 'foo.jpg' }
});
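When the Body is a piped stream rather than a file, the total field may be missing, since the SDK cannot always know the length up front (this is an assumption worth checking against your SDK version). A small sketch of a formatter that tolerates that case follows. The name formatProgress is hypothetical, and the field names are taken from the sample payload above.

```javascript
// Hypothetical helper: turns an httpUploadProgress payload into a log line.
// `total` is treated as optional, since it may be absent for streams of
// unknown length (an assumption, not confirmed by the thread).
function formatProgress({ loaded, total, part, key }) {
  if (typeof total === 'number' && total > 0) {
    const pct = ((loaded / total) * 100).toFixed(1);
    return `${key}: part ${part}, ${loaded} bytes of ${total} (${pct}%)`;
  }
  return `${key}: part ${part}, ${loaded} bytes sent`;
}
```

It would be wired in as `manager.on('httpUploadProgress', (p) => console.log(formatProgress(p)));`.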
ManagedUpload reference
A list of available events