Canvas tainted by CORS data and S3

My application is displaying images stored in AWS S3 (in a private bucket for security reasons).

To allow users to see the images from their browser I generate signed URLs like https://s3.eu-central-1.amazonaws.com/my.bucket/stuff/images/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...&X-Amz-Date=20170701T195504Z&X-Amz-Expires=900&X-Amz-Signature=bbe277...3358e8&X-Amz-SignedHeaders=host.
This is working perfectly with <img src="S3URL" />: the images are correctly displayed.
I can even directly view the images in another tab by copy/pasting their URL.

I'm also generating PDFs embedding these images, which first need to be transformed with a canvas: resized and watermarked.

But the library I use for resizing runs into trouble:

Failed to execute 'getImageData' on 'CanvasRenderingContext2D':
The canvas has been tainted by cross-origin data.

Indeed we are in a CORS context, but I've set everything up so that the images can be displayed to the user, and it works.
So I'm not sure I understand the reason for this error: is this another CORS security layer, with the browser fearing that I might alter the image for a malicious purpose?
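
For context, this is roughly the pattern in play (a minimal sketch; signedUrl stands for a URL like the one above):

// Drawing the image works, but the first pixel read throws the error:
const img = new Image();
// img.crossOrigin = 'anonymous';  // the CORS opt-in attempted below --
//                                 // must be set before src is assigned
img.onload = () => {
  const canvas = document.createElement('canvas');
  canvas.width = img.width;
  canvas.height = img.height;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(img, 0, 0);        // allowed: drawing merely taints the canvas
  ctx.getImageData(0, 0, 1, 1);    // SecurityError: the canvas is tainted
};
img.src = signedUrl;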

I've tried to set a permissive CORS configuration on the S3 bucket:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <AllowedMethod>POST</AllowedMethod>
        <AllowedMethod>PUT</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>
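
(For completeness, the same rules can also be applied programmatically -- a hedged sketch with the AWS SDK for JavaScript v2, using the bucket name from the URL above as a placeholder:)

const AWS = require('aws-sdk');
const s3 = new AWS.S3({ region: 'eu-central-1' });

s3.putBucketCors({
  Bucket: 'my.bucket',
  CORSConfiguration: {
    CORSRules: [{
      AllowedOrigins: ['*'],
      AllowedMethods: ['GET', 'POST', 'PUT'],
      AllowedHeaders: ['*'],
      MaxAgeSeconds: 3000,
    }],
  },
}, err => { if (err) console.error(err); });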

And I set img.crossOrigin = "" or img.crossOrigin = "Anonymous" on the client side, but then I get:

Access to Image at 'https://s3.eu-central-1.amazonaws.com/...'
from origin 'http://localhost:5000' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'http://localhost:5000' is therefore not allowed access.

Which AWS/S3-side and/or client-side configuration could be missing?

asked Jul 01 '17 by Pragmateek


2 Answers

One workaround here is to prevent the browser from caching the downloaded object. This seems to stem from arguably incorrect behavior on the part of S3 that interacts badly with the way Chrome handles cached objects. I recently answered a similar question on Server Fault, and you can find additional details there.

The problem seems to arise when you fetch an object from S3 from simple HTML (like an <img> tag) and then fetch the same object again in a cross-origin context.

Chrome caches the result of the first request, and then uses that cached response instead of making a new request the second time. When it examines the cached object, there's no Access-Control-Allow-Origin header, because it was cached from a request that wasn't subject to CORS rules... so when that first request was made, the browser didn't send an Origin header. Because of that, S3 didn't respond with an Access-Control-Allow-Origin header (or any CORS-related headers).

The root of the problem seems to be the HTTP Vary: response header, which governs caching.

A web server (S3 in this case) can use the Vary: response header to signal to the browser that the server is capable of producing more than one representation of the object being returned -- and if the browser would vary an attribute of the request, the response might differ. When the browser is considering using a cached object, it should check whether the object is valid in the new context, before concluding that the cached object is suited to the current need.

Indeed, when you send an Origin request header to S3, you get a response that includes Vary: Origin. This tells the browser that if the origin sent in the request had been a different value, the response might also have been different -- for example, because not all origins might be allowed.
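
Illustrated as two abbreviated request/response pairs (the headers follow the behavior just described):

First request -- plain <img> fetch, no Origin header sent:

GET /my.bucket/stuff/images/image.png?X-Amz-... HTTP/1.1
Host: s3.eu-central-1.amazonaws.com

HTTP/1.1 200 OK
Content-Type: image/png
(no Access-Control-Allow-Origin, and -- the rub -- no Vary: Origin)

Second request -- CORS fetch, img.crossOrigin set, Origin sent:

GET /my.bucket/stuff/images/image.png?X-Amz-... HTTP/1.1
Host: s3.eu-central-1.amazonaws.com
Origin: http://localhost:5000

HTTP/1.1 200 OK
Content-Type: image/png
Access-Control-Allow-Origin: *
Vary: Origin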

The first part of the underlying problem is that S3 -- arguably -- should always return Vary: Origin whenever CORS is configured on the bucket, even if the browser didn't send an origin header, because Vary can be specified against a header you didn't actually include in the request, to tell you that if you had included it, the response might have differed. But, it doesn't do that, when Origin isn't present.

The second part of the problem is that Chrome -- when it consults its internal cache -- sees that it already has a copy of the object. The response that seeded the cache did not include Vary, so Chrome assumes this object is also perfectly valid for the CORS request. Clearly, it isn't, since when Chrome tries to use the object, it finds that the cross-origin response headers are missing. Presumably, had Chrome received a Vary: Origin response from S3 on the original request, it would have realized that its provisional request headers for the second request included Origin:, so it would correctly go and fetch a different copy of the object. If it did that, the problem would go away -- as we have illustrated by setting Cache-Control: no-cache on the object, preventing Chrome from caching it. But, it doesn't.

So, we work around this by setting Cache-Control: no-cache on the object in S3, so that Chrome won't cache the first one, and will make the correct CORS request for the second one, instead of trying to use the cached copy, which will fail.
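
A hedged sketch of applying that header to an existing object in place (AWS SDK for JavaScript v2; the object is copied over itself with replaced metadata; bucket and key are placeholders):

const AWS = require('aws-sdk');
const s3 = new AWS.S3({ region: 'eu-central-1' });

s3.copyObject({
  Bucket: 'my.bucket',
  Key: 'stuff/images/image.png',
  CopySource: 'my.bucket/stuff/images/image.png',
  MetadataDirective: 'REPLACE',   // rewrite the metadata instead of copying it
  CacheControl: 'no-cache',       // the workaround described above
  ContentType: 'image/png',       // REPLACE resets headers, so restate this
}, err => { if (err) console.error(err); });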

Note that if you want to avoid updating your objects in S3 with the Cache-Control: no-cache response header, there are actually two other options that don't touch the objects at rest:

The S3 API respects a value passed in the query string of response-cache-control=no-cache. Adding this to the signed URL will direct S3 to add the header to the response, regardless of the Cache-Control metadata value stored with the object (or lack thereof). You can't simply append this to the query string -- you have to add it as part of the URL signing process. But once you add that to your code, your objects will be returned with Cache-Control: no-cache in the response headers.
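
A hedged sketch of that signing step with the AWS SDK for JavaScript v2, where ResponseCacheControl is the SDK parameter that becomes response-cache-control in the signed query string (bucket and key are placeholders):

const AWS = require('aws-sdk');
const s3 = new AWS.S3({ region: 'eu-central-1', signatureVersion: 'v4' });

const url = s3.getSignedUrl('getObject', {
  Bucket: 'my.bucket',
  Key: 'stuff/images/image.png',
  Expires: 900,                       // seconds, as in the URL in the question
  ResponseCacheControl: 'no-cache',   // signed into the query string
});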

Or, if you can generate these two signed URLs for the same object separately when rendering the page, simply change the expiration time on one of the signed URLs, relative to the other. Make it one minute longer, or something along those lines. Changing the expiration time from one to the other will force the two signed URLs to be different, and two different objects with two different query strings should be interpreted by Chrome as two separate objects, which should also eliminate the incorrect usage of the first cached object to serve the other request.
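
Sketch of that second option, reusing the s3 client from the previous sketch:

const urlForImgTag = s3.getSignedUrl('getObject', {
  Bucket: 'my.bucket', Key: 'stuff/images/image.png', Expires: 900,
});
const urlForCanvas = s3.getSignedUrl('getObject', {
  Bucket: 'my.bucket', Key: 'stuff/images/image.png', Expires: 960,  // one minute longer
});
// Different Expires => different X-Amz-Expires and X-Amz-Signature values =>
// different query strings, so Chrome caches them as two separate objects.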

answered Sep 21 '22 by Michael - sqlbot


I already allow any origin on S3 even though I am only fetching from exactly one origin, yet this problem continued, so I don't see how this can actually be a CORS problem. As others have said, it is mostly a browser bug. The problem appears if you fetch the image from an <img> tag and at any point later also fetch it from JavaScript: you'll hit the cache bug. The easiest solution for me was to put a query parameter at the end of the URL used by the JavaScript fetch.

https://something.s3.amazonaws.com/something.jpg?stopGivingMeHeadaches=true
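
A minimal sketch of that approach (it assumes the server tolerates the extra parameter; for SigV4 presigned URLs it would have to be included at signing time instead, as in the previous answer):

async function loadForCanvas(baseUrl) {
  // A distinct query string means a distinct cache entry from the <img> fetch.
  const sep = baseUrl.includes('?') ? '&' : '?';
  const resp = await fetch(baseUrl + sep + 'stopGivingMeHeadaches=true', { mode: 'cors' });
  return createImageBitmap(await resp.blob());
}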

answered Sep 22 '22 by user2565663