Why does a file uploaded to S3 have content type application/octet-stream unless I name the file .html?

Even though I set the content type to text/html, it ends up as application/octet-stream on S3.

    ByteArrayInputStream contentsAsStream = new ByteArrayInputStream(contentAsBytes);
    ObjectMetadata md = new ObjectMetadata();
    md.setContentLength(contentAsBytes.length);
    md.setContentType("text/html");
    s3.putObject(new PutObjectRequest(ARTIST_BUCKET_NAME, artistId, contentsAsStream, md));

If, however, I name the file so that it ends with .html:

    s3.putObject(new PutObjectRequest(ARTIST_BUCKET_NAME, artistId + ".html", contentsAsStream, md));

then it works.

Is my md object just being ignored? How can I get around this programmatically? Over time I need to upload thousands of files, so I cannot just go into the S3 UI and fix the content type manually.

Asked by Paul Taylor, Jun 08 '15 10:06



1 Answer

You must be doing something else in your code. I just tried your code example using version 1.9.6 of the AWS SDK for Java, and the object gets the "text/html" content type.

Here's the exact (Groovy) code:

    class S3Test {
        static void main(String[] args) {

            def s3 = new AmazonS3Client()

            def random = new Random()
            def bucketName = "raniz-playground"
            def keyName = "content-type-test"

            byte[] contentAsBytes = new byte[1024]
            random.nextBytes(contentAsBytes)

            ByteArrayInputStream contentsAsStream = new ByteArrayInputStream(contentAsBytes);
            ObjectMetadata md = new ObjectMetadata();
            md.setContentLength(contentAsBytes.length);
            md.setContentType("text/html");
            s3.putObject(new PutObjectRequest(bucketName, keyName, contentsAsStream, md))

            def object = s3.getObject(bucketName, keyName)
            println(object.objectMetadata.contentType)
            object.close()
        }
    }

The program prints

text/html

And the S3 metadata says the same:

[Image: S3 properties view]

Here is the traffic sent over the network (courtesy of Apache HttpClient debug logging):

    >> PUT /content-type-test HTTP/1.1
    >> Host: raniz-playground.s3.amazonaws.com
    >> Authorization: AWS <nope>
    >> User-Agent: aws-sdk-java/1.9.6 Linux/3.2.0-84-generic Java_HotSpot(TM)_64-Bit_Server_VM/25.45-b02/1.8.0_45
    >> Date: Fri, 12 Jun 2015 02:11:16 GMT
    >> Content-Type: text/html
    >> Content-Length: 1024
    >> Connection: Keep-Alive
    >> Expect: 100-continue
    << HTTP/1.1 200 OK
    << x-amz-id-2: mOsmhYGkW+SxipF6S2+CnmiqOhwJ62WfWUkmZk4zU3rzkWCEH9P/bT1hUz27apmO
    << x-amz-request-id: 8706AE3BE8597644
    << Date: Fri, 12 Jun 2015 02:11:23 GMT
    << ETag: "6c53debeb28f1d12f7ad388b27c9036d"
    << Content-Length: 0
    << Server: AmazonS3

    >> GET /content-type-test HTTP/1.1
    >> Host: raniz-playground.s3.amazonaws.com
    >> Authorization: AWS <nope>
    >> User-Agent: aws-sdk-java/1.9.6 Linux/3.2.0-84-generic Java_HotSpot(TM)_64-Bit_Server_VM/25.45-b02/1.8.0_45
    >> Date: Fri, 12 Jun 2015 02:11:23 GMT
    >> Content-Type: application/x-www-form-urlencoded; charset=utf-8
    >> Connection: Keep-Alive
    << HTTP/1.1 200 OK
    << x-amz-id-2: 9U1CQ8yIYBKYyadKi4syaAsr+7BV76Q+5UAGj2w1zDiPC2qZN0NzUCQNv6pWGu7n
    << x-amz-request-id: 6777433366DB6436
    << Date: Fri, 12 Jun 2015 02:11:24 GMT
    << Last-Modified: Fri, 12 Jun 2015 02:11:23 GMT
    << ETag: "6c53debeb28f1d12f7ad388b27c9036d"
    << Accept-Ranges: bytes
    << Content-Type: text/html
    << Content-Length: 1024
    << Server: AmazonS3

This is also the behaviour the SDK source code shows: if you set the content type, the SDK won't override it.
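To make that precedence rule concrete, here is a minimal sketch in plain Java. It is not code from the AWS SDK: ContentTypeResolver and resolve are hypothetical names, and the fallback of guessing from the key name is an assumption chosen only because it would reproduce the symptom in the question (a key ending in .html getting text/html, any other key getting application/octet-stream). It uses only the JDK's URLConnection.guessContentTypeFromName.

```java
import java.net.URLConnection;

// Illustrative only: a hypothetical helper, not the AWS SDK's implementation.
final class ContentTypeResolver {

    // Generic type used when nothing better is known.
    static final String DEFAULT_TYPE = "application/octet-stream";

    /**
     * Returns the explicitly set content type if there is one.
     * Otherwise guesses from the key's extension (an assumed fallback),
     * and finally falls back to application/octet-stream.
     * The key must not be null.
     */
    static String resolve(String explicitType, String key) {
        if (explicitType != null && !explicitType.isEmpty()) {
            return explicitType; // an explicit value is never overridden
        }
        String guessed = URLConnection.guessContentTypeFromName(key);
        return guessed != null ? guessed : DEFAULT_TYPE;
    }

    public static void main(String[] args) {
        // Explicit type set, key without extension
        System.out.println(resolve("text/html", "artist123"));
        // No explicit type, key with .html extension
        System.out.println(resolve(null, "artist123.html"));
        // No explicit type, key without extension
        System.out.println(resolve(null, "artist123"));
    }
}
```

If an upload behaves like the third case even though setContentType was called, the likely explanation is that the metadata object carrying the content type is not the one that actually reaches putObject, which is worth checking before suspecting the SDK.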

Answered by Raniz, Sep 28 '22 03:09