How to save data streams in S3? aws-sdk-go example not working?

I am trying to persist a given stream of data to an S3-compatible storage. The size is not known before the stream ends and can vary from 5MB to ~500GB.

I tried several approaches but found nothing better than implementing the sharding myself. My best guess is to allocate a fixed-size buffer, fill it from my stream, and write each filled buffer to S3. Is there a better solution, ideally one where this is transparent to me and the whole stream is never held in memory?

The aws-sdk-go README has an example program that takes data from stdin and writes it to S3: https://github.com/aws/aws-sdk-go#using-the-go-sdk

When I pipe data into it with a shell pipe (|), I get the following error:

failed to upload object, SerializationError: failed to compute request body size caused by: seek /dev/stdin: illegal seek

Am I doing something wrong, or is the example not working as I expect it to?

I also tried minio-go, with PutObject() or client.PutObjectStreaming(). This works, but it consumes as much memory as the data to store.

Is there a better solution?
Is there a small example program that can pipe arbitrary data into S3?

asked Apr 24 '17 19:04

xxorde

1 Answer

You can use the SDK's Uploader to handle uploads of unknown size, but you'll need to make os.Stdin "unseekable" by wrapping it in a type that implements only io.Reader. This is because the Uploader, although it requires only an io.Reader as the input body, checks under the hood whether the body is also an io.Seeker, and if it is, it calls Seek on it. Since os.Stdin is an *os.File, which implements the Seeker interface, passing it directly would give you the same error you got from PutObjectWithContext.

The Uploader also uploads the data in chunks whose size you can configure (the PartSize field), and you can configure how many of those chunks are uploaded concurrently (the Concurrency field).

Here's a modified version of the linked example, stripped of the code that can remain unchanged.

package main

import (
    // ...
    "io"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

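// reader wraps an io.Reader and exposes only the Read method, so the
// Uploader's check for a Seeker fails and Seek is never called.
type reader struct {
    r io.Reader
}

// Read forwards to the wrapped reader.
func (r *reader) Read(p []byte) (int, error) {
    return r.r.Read(p)
}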

func main() {
    // ... parse flags

    sess := session.Must(session.NewSession())
    uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
        u.PartSize = 20 << 20 // 20MB
        // ... more configuration
    })

    // ... context stuff

    _, err := uploader.UploadWithContext(ctx, &s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
        Body:   &reader{os.Stdin},
    })

    // ... handle error
}
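One thing worth checking for the largest streams mentioned in the question: AWS S3 limits a multipart upload to 10,000 parts, so the part size caps the object size. With 20 MiB parts that cap is roughly 195 GiB, which is below 500 GB. The sketch below computes the smallest part size for a given upper bound on the stream size. `partSizeFor` is a hypothetical helper, and the two constants are AWS S3's documented limits, taken here as assumptions; an S3-compatible store may use different values.

```go
package main

import "fmt"

const (
	maxParts    = 10000   // assumed per-upload part limit (AWS S3's documented value)
	minPartSize = 5 << 20 // assumed minimum part size, 5 MiB
)

// partSizeFor returns the smallest part size, in bytes, that lets a stream of
// up to maxBytes fit within maxParts parts, never going below minPartSize.
func partSizeFor(maxBytes int64) int64 {
	size := (maxBytes + maxParts - 1) / maxParts // ceiling division
	if size < minPartSize {
		size = minPartSize
	}
	return size
}

func main() {
	fmt.Println(int64(20<<20) * maxParts) // 209715200000: largest object with 20 MiB parts
	fmt.Println(partSizeFor(500 << 30))   // 53687092: part size needed for 500 GiB
}
```

Under these assumptions, a PartSize of a little over 51 MiB would cover a 500 GiB stream. Larger parts also mean larger in-memory buffers per concurrent upload, so the value is a trade-off rather than something to maximize.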

Whether this is a better solution than minio-go, I do not know; you'll have to test that yourself.

answered Sep 19 '22 13:09

mkopriva

Tags: go, amazon-s3, aws-sdk-go
