 

Pipe a stream to s3.upload()

I'm currently making use of a node.js plugin called s3-upload-stream to stream very large files to Amazon S3. It uses the multipart API and for the most part it works very well.

However, this module is showing its age and I've already had to make modifications to it (the author has deprecated it as well). Today I ran into another issue with Amazon, and I would really like to take the author's recommendation and start using the official aws-sdk to accomplish my uploads.

BUT.

The official SDK does not seem to support piping to s3.upload(). The nature of s3.upload() is that you have to pass the readable stream as the Body parameter of the upload call.

I have roughly 120+ user code modules that do various file processing, and they are agnostic to the final destination of their output. The engine hands them a pipeable writable output stream, and they pipe to it. I cannot hand them an AWS.S3 object and ask them to call upload() on it without adding code to all the modules. The reason I used s3-upload-stream was because it supported piping.

Is there a way to make aws-sdk s3.upload() something I can pipe the stream to?
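For context, a rough sketch of the interface those modules see (the function and transform names here are made up for illustration):

// Hypothetical user module: it never knows what the destination is.
// The engine supplies the writable output stream, and the module just pipes to it.
module.exports = function processFile(inputStream, outputStream) {
  inputStream
    .pipe(someTransform())   // module-specific processing (placeholder)
    .pipe(outputStream);     // whatever writable the engine handed over
};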

asked May 20 '16 by womp



3 Answers

Wrap the S3 upload() function with the node.js stream.PassThrough() stream.

Here's an example:

const stream = require('stream');

inputStream
  .pipe(uploadFromStream(s3));

function uploadFromStream(s3) {
  // The PassThrough acts as the bridge: callers pipe into it,
  // and s3.upload() reads from it as the upload Body.
  const pass = new stream.PassThrough();

  const params = { Bucket: BUCKET, Key: KEY, Body: pass };
  s3.upload(params, function(err, data) {
    console.log(err, data);
  });

  return pass;
}
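If the caller also needs to know when the upload finishes, the same wrapper can hand back a completion promise alongside the PassThrough (the next answer develops this idea); a rough sketch with hypothetical names:

// Variation: return the PassThrough plus a promise that resolves when S3 is done.
function uploadFromStreamWithPromise(s3) {
  const pass = new stream.PassThrough();
  const promise = s3.upload({ Bucket: BUCKET, Key: KEY, Body: pass }).promise();
  return { stream: pass, promise };
}

const { stream: pass, promise } = uploadFromStreamWithPromise(s3);
inputStream.pipe(pass);
promise.then((data) => console.log('uploaded to', data.Location));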
answered by Casey Benko

A bit of a late answer; hopefully it helps someone else. You can return both the writable stream and the promise, so you can get the response data when the upload finishes.

const AWS = require('aws-sdk');
const stream = require('stream');

const uploadStream = ({ Bucket, Key }) => {
  const s3 = new AWS.S3();
  // Return the PassThrough to pipe into, plus a promise that resolves
  // with the S3 response once the upload finishes.
  const pass = new stream.PassThrough();
  return {
    writeStream: pass,
    promise: s3.upload({ Bucket, Key, Body: pass }).promise(),
  };
};

And you can use the function as follows:

const fs = require('fs');

const { writeStream, promise } = uploadStream({ Bucket: 'yourbucket', Key: 'yourfile.mp4' });
const readStream = fs.createReadStream('/path/to/yourfile.mp4');

const pipeline = readStream.pipe(writeStream);

Now you can either check the promise:

promise.then(() => {
  console.log('upload completed successfully');
}).catch((err) => {
  console.log('upload failed.', err.message);
});

Or use async/await:

// (inside an async function)
try {
  await promise;
  console.log('upload completed successfully');
} catch (error) {
  console.log('upload failed.', error.message);
}

Or, since stream.pipe() returns the destination stream (the writeStream variable above, a stream.Writable), which also allows chaining pipes, we can use its events:

pipeline.on('close', () => {
  console.log('upload successful');
});
pipeline.on('error', (err) => {
  console.log('upload failed', err.message);
});
answered by Ahmet Cetin


In the accepted answer, the function returns before the upload is complete, and thus it's incorrect. The code below pipes correctly from a readable stream.

Upload reference

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function uploadReadableStream(stream) {
  const params = { Bucket: bucket, Key: key, Body: stream };
  return s3.upload(params).promise();
}

async function upload() {
  const readable = getSomeReadableStream();
  const results = await uploadReadableStream(readable);
  console.log('upload complete', results);
}

You can also go a step further and output progress info using ManagedUpload, like so:

const manager = s3.upload(params);
manager.on('httpUploadProgress', (progress) => {
  console.log('progress', progress) // { loaded: 4915, total: 192915, part: 1, key: 'foo.jpg' }
});

ManagedUpload reference

A list of available events
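Putting the pieces from these answers together, a rough sketch (bucket and key are placeholders) that pipes through a PassThrough, reports progress, and awaits completion could look like this:

const AWS = require('aws-sdk');
const stream = require('stream');

const s3 = new AWS.S3();

async function uploadWithProgress(readable, Bucket, Key) {
  // Bridge the readable into a PassThrough that s3.upload() consumes as the Body.
  const pass = new stream.PassThrough();
  readable.pipe(pass);

  const managed = s3.upload({ Bucket, Key, Body: pass });
  managed.on('httpUploadProgress', (p) => console.log('progress', p.loaded, p.total));

  // Resolves with { Location, ETag, Bucket, Key } once the upload completes.
  return managed.promise();
}

// usage: await uploadWithProgress(fs.createReadStream('/path/to/file'), 'my-bucket', 'my-key');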

answered by Taku