Is it possible to add a key to S3 with a UTF-8 encoded name like "åøæ.jpg"?
I'm getting the following error when uploading with boto:
<Error><Code>InvalidURI</Code><Message>Couldn't parse the specified URI.</Message>
@2083: This is a bit of an old question, but in case you haven't found a solution, and for everyone else who lands here looking for an answer like I did:
From the official documentation (http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html):
Although you can use any UTF-8 characters in an object key name, the following key naming best practices help ensure maximum compatibility with other applications. Each application may parse special characters differently. The following guidelines help you maximize compliance with DNS, web safe characters, XML parsers, and other APIs.
Safe Characters
The following character sets are generally safe for use in key names:
Alphanumeric characters [0-9a-zA-Z]
Special characters !, -, _, ., *, ', (, and )
The following are examples of valid object key names:
4my-organization
my.great_photos-2014/jan/myvacation.jpg
videos/2014/birthday/video1.wmv
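As a rough illustration of the guideline quoted above, here is a minimal Python sketch that checks a key against the safe character set and strips anything outside it. The function names `is_safe_key` and `to_safe_key` are made up for this example, and allowing "/" is an assumption based on the example keys, since the quoted list does not mention it explicitly.

```python
import re

# Safe characters from the quoted guideline: alphanumerics plus ! - _ . * ' ( )
# Assumption: "/" is also allowed, because the documentation's example keys use it.
_SAFE_CHARS = r"0-9a-zA-Z!\-_.*'()/"
_SAFE_KEY = re.compile("[" + _SAFE_CHARS + "]+")
_UNSAFE_CHAR = re.compile("[^" + _SAFE_CHARS + "]")


def is_safe_key(key: str) -> bool:
    """Return True if the key is non-empty and uses only the safe characters."""
    return _SAFE_KEY.fullmatch(key) is not None


def to_safe_key(name: str) -> str:
    """Drop every character that is outside the safe set."""
    return _UNSAFE_CHAR.sub("", name)
```

Note that simply stripping characters turns a name like "åøæ.jpg" into ".jpg", which is why the original filename has to be carried somewhere other than the key if you want to keep it.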
However, if what you really want, like me, is a filename that allows UTF-8 characters (note that this can differ from the key name), there is a way to do it!
As described in http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm and http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm (kudos to Ben Nadel), you can do that by setting the Content-Disposition header that S3 returns when the file is downloaded.
I did it in Java, so here is that code; I'm sure you'll be able to translate it to Python easily :)
        AmazonS3 s3 = S3Controller.getS3Client();
        // The original name is everything from the first "-" onwards (assumes fileName contains a "-")
        String originalName = fileName.substring(fileName.indexOf("-"));
        // Keep only safe characters in the key, as per http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html
        String key = originalName.replaceAll("[^a-zA-Z0-9._]", "");
        PutObjectRequest putObjectRequest = new PutObjectRequest(
                S3Controller.bucketNameForBucket(S3Controller.Bucket.EXPORT_BUCKET),
                key,
                file);
        // we can always regenerate these files, so we can use reduced redundancy storage
        putObjectRequest.setStorageClass(StorageClass.ReducedRedundancy);
        // Fall back to the sanitized key if encoding fails
        String urlEncodedUTF8Filename = key;
        try {
            //http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm
            //http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm
            //Issue#179
            // URLEncoder turns spaces into "+", but filename* expects percent-encoding, so use %20 instead
            urlEncodedUTF8Filename = URLEncoder.encode(originalName, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            LOG.warn("Could not URLEncode a filename. Original Filename: " + fileName, e);
        }
        ObjectMetadata metadata = new ObjectMetadata();
        // filename= is the ASCII fallback; filename*= carries the real UTF-8 name for clients that support it
        metadata.setContentDisposition("attachment; filename=\"" + key + "\"; filename*=UTF-8''" + urlEncodedUTF8Filename);
        putObjectRequest.setMetadata(metadata);
        s3.putObject(putObjectRequest);
It should help :)
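Since the question was about Python, here is a rough sketch of the header-building part only, using nothing but the standard library. It is not a line-by-line port of the Java above: the function name `content_disposition` is made up for this example, and the ASCII fallback replaces unsupported characters with "_" instead of dropping them, which is my own choice.

```python
import re
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment header with an ASCII fallback and a UTF-8 filename."""
    # ASCII fallback for clients that ignore filename*: anything outside a
    # conservative character set becomes "_" (an assumption of this sketch).
    fallback = re.sub(r"[^A-Za-z0-9._-]", "_", filename)
    # Percent-encode the UTF-8 bytes of the real name. quote() encodes a
    # space as %20, and safe="" makes it encode "/" as well.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    print(content_disposition("åøæ.jpg"))
```

If you are on boto3, I believe the resulting string can be passed as the `ContentDisposition` argument of `put_object` at upload time, or as `ResponseContentDisposition` in the parameters of a pre-signed `get_object` URL, but check the documentation for the client version you are using.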