
File metadata not kept in S3 after a CLI copy

I have two buckets (let's say A and B) in different regions. The archives in bucket A carry user metadata (x-amz-meta-mymeta). I copy them using the CLI:

aws s3 cp s3://A/${file}.tar.gz s3://B/

Depending on the file, the metadata is kept or lost. The file pikachu has the same metadata in both buckets after the copy, but the file pika-chu does not have the metadata in the target bucket B.

I have read the doc but could not find any other information than the one exposed in this SO answer.

The metadata is lost whether or not the destination file already exists.

Any hints on that?

Edit: metadata is lost even if the copy command is

aws s3 cp s3://A/${file}.tar.gz s3://B/${file}

Edit: For info, the file sizes differ: pikachu is a few MB, whereas pika-chu is about 50 MB.

Edit: Files are uploaded using aws s3 cp with no multipart information.

greg asked Nov 17 '17 09:11


1 Answer

The official documentation for the PUT Object - Copy REST API (REST Object Copy) describes the x-amz-metadata-directive request header, which controls whether metadata is copied from the source object or replaced.

For the CLI, add the option below. It only takes effect for objects smaller than 5 GB that are copied in a single request; a multipart copy does not preserve all headers:

--metadata-directive COPY

--metadata-directive (string) Specifies whether the metadata is copied from the source object or replaced with metadata provided when copying S3 objects. Note that if the object is copied over in parts, the source object's metadata will not be copied over, no matter the value for --metadata-directive, and instead the desired metadata values must be specified as parameters on the command line. Valid values are COPY and REPLACE. If this parameter is not specified, COPY will be used by default. If REPLACE is used, the copied object will only have the metadata values that were specified by the CLI command. Note that if you are using any of the following parameters: --content-type, --content-language, --content-encoding, --content-disposition, --cache-control, or --expires, you will need to specify --metadata-directive REPLACE for non-multipart copies if you want the copied objects to have the specified metadata values.
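When the copy is performed in parts, the documentation quoted above says the metadata has to be supplied on the command line. Here is a minimal sketch of that workaround, assuming a Bash shell. The helper name copy_with_metadata is hypothetical; head-object, --query, --metadata and --metadata-directive are existing AWS CLI options. It reads the user metadata of the source object and passes it back explicitly on the copy.

```shell
# Hypothetical helper (not part of the AWS CLI): copy one object and
# re-apply the source object's user metadata explicitly.
copy_with_metadata() {
  local src_bucket="$1" dst_bucket="$2" key="$3"
  local meta

  # head-object returns the user metadata as a JSON map
  # (keys appear without the x-amz-meta- prefix).
  meta="$(aws s3api head-object \
    --bucket "$src_bucket" --key "$key" \
    --query Metadata --output json)" || return 1

  # Pass the map back explicitly so it does not depend on
  # whether the CLI copies the object in parts.
  aws s3 cp "s3://${src_bucket}/${key}" "s3://${dst_bucket}/${key}" \
    --metadata-directive REPLACE \
    --metadata "$meta"
}
```

For example: copy_with_metadata A B pika-chu.tar.gz. With REPLACE, only the metadata given on the command line is kept, so other headers such as Content-Type may need to be passed as well.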

[EDIT after question]

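The AWS CLI automatically switches to multipart transfers for files above a size threshold (multipart_threshold, 8 MB by default). That matches what you see: the 50 MB file is copied in parts and loses its metadata, while the file of a few MB is copied in one request and keeps it. The threshold is configurable. The link below lists the S3 configuration options.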

AWS CLI S3 Configuration
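As a rough sketch of that configuration change, the multipart threshold can be raised so that archives of this size are copied in a single request. The helper name set_multipart_threshold and the 1GB default are assumptions for illustration; aws configure set and the s3.multipart_threshold setting are part of the AWS CLI. The value must stay below the 5 GB limit for single-request copies.

```shell
# Hypothetical helper: raise the size above which the CLI
# switches to multipart transfers (default profile).
set_multipart_threshold() {
  # 1GB is an illustrative default, not a recommended value.
  local threshold="${1:-1GB}"
  aws configure set default.s3.multipart_threshold "$threshold"
}
```

After running set_multipart_threshold 1GB, a copy of the 50 MB archive should no longer be split into parts, so --metadata-directive COPY applies to it.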

John Hanley answered Sep 18 '22 22:09