We've been experiencing very strange behaviour in our current infrastructure setup:
Using wget, you see this output on the affected machines for a file we uploaded:
--2014-07-31 16:33:38-- http://s3-eu-west-1.amazonaws.com/not_the_real_file_url
Resolving s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)... 178.236.6.160
Connecting to s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)|178.236.6.160|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2801149 (2.7M) [text/plain]
Saving to: `/dev/null'
0% [ ] 10,111 1.05K/s eta 68m 26s
It stays like this for 68 minutes! (The download does finish after that time, though.)
And this is the output for a random file hosted on Amazon S3 by somebody else:
--2014-07-31 16:39:21-- https://s3.amazonaws.com/Minecraft.Download/versions/14w31a/minecraft_server.14w31a.jar
Resolving s3.amazonaws.com (s3.amazonaws.com)... 72.21.211.199
Connecting to s3.amazonaws.com (s3.amazonaws.com)|72.21.211.199|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10342238 (9.9M) [application/octet-stream]
Saving to: `/dev/null'
32% [====================================> ] 3,370,945 747K/s eta 12s
Our current solution is to use our HAProxy as a transparent HTTP proxy.
That is, we have defined a frontend "cloud.example.com" and a backend that first replaces the request's Host header with "s3-eu-west-1.amazonaws.com" and then uses s3-eu-west-1.amazonaws.com:80 as its server. To Amazon, the request then appears to come from our proxy, and we can download the files we stored on S3 thousands of times again. :)
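To make the rewrite step concrete, here is a minimal Python sketch of what the backend does to each request. It is not our HAProxy configuration, only an illustration of the idea; the function name rewrite_for_s3 and the sample headers are made up for this example.

```python
from typing import Dict, Tuple

# The S3 endpoint named in the post; the proxy forwards everything to it.
S3_HOST = "s3-eu-west-1.amazonaws.com"


def rewrite_for_s3(path: str, headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return the upstream URL and headers a proxy would send to S3.

    Only the Host header is replaced; every other header is passed through.
    (Hypothetical helper, not part of HAProxy.)
    """
    # HTTP header names are case-insensitive, so drop any spelling of "Host".
    upstream_headers = {k: v for k, v in headers.items() if k.lower() != "host"}
    upstream_headers["Host"] = S3_HOST
    # The backend described above talks plain HTTP on port 80 to S3.
    return "http://" + S3_HOST + ":80" + path, upstream_headers
```

A request for /not_the_real_file_url that arrives with "Host: cloud.example.com" therefore leaves the proxy addressed to the S3 endpoint, with the path and all other headers unchanged.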
[2014-07-31 16:56:57 +0200] RUN[28] AVG: '0.9612743812142854' s, LAST_RUN: '0.711118431' s
--2014-07-31 16:56:57-- https://cloud.example.com/not_the_real_file_url
Resolving cloud.example.com (cloud.example.com)... 1.2.3.4
Connecting to cloud.example.com (cloud.example.com)|1.2.3.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2801149 (2.7M) [text/plain]
Saving to: `/dev/null'
100%[====================>] 2,801,149 2.47M/s in 1.1s
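The RUN/AVG/LAST_RUN line above comes from repeating the download and timing each run. The script that produced it is not shown here, so the following is only a rough Python sketch of that kind of measurement; the function names are hypothetical.

```python
import time
import urllib.request
from typing import Iterable, List


def running_averages(durations: Iterable[float]) -> List[float]:
    """Return the mean of the first n durations for n = 1, 2, ..."""
    averages = []
    total = 0.0
    for count, seconds in enumerate(durations, start=1):
        total += seconds
        averages.append(total / count)
    return averages


def timed_download(url: str, chunk_size: int = 64 * 1024) -> float:
    """Download url, discard the body, and return the elapsed seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as response:
        while response.read(chunk_size):
            pass  # throw the data away, like saving to /dev/null
    return time.monotonic() - start
```

Calling timed_download() in a loop against the proxied URL and passing the collected durations to running_averages() gives one average per run, comparable to the AVG value in the log line.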
If you want to speed up S3 uploads or downloads of smaller files, try CloudFront instead. Use the S3 accelerate endpoint only selectively, to avoid extra costs on data transfers you may not care to speed up.
Amazon S3 Transfer Acceleration: This feature accelerates Amazon S3 data transfers by making use of optimized network protocols and the AWS edge infrastructure. Improvements are typically in the range of 50% to 500% for cross-country transfer of larger objects, but can go even higher under certain conditions.
To maximize file transfer speeds to S3, we recommend that you install FileCatalyst server as close (geographically) as possible to the region and Availability Zone (AZ) where your S3 bucket resides. This ensures the HTTP communication between FileCatalyst server and S3 is as fast as possible.
Ok, solved it.
I'm still researching why this solved the issue, but here is what fixed it now:
As I described above, the behaviour occurs on an Ubuntu 12.04.5 KVM guest running on an Ubuntu 12.04.4 KVM host. Today I checked whether we use different kernels (linux-image-*) on the guests (which can still happen, since we're not provisioning them with Puppet yet).
On KVM-guests where we have the strange <5 KB/s S3 download behaviour, we're using:
On KVM-guests with >5 MB/s S3 download speed, we're using:
Hope this helps if you run into the same issue. I'll post more once I truly understand why this happens.
Of course, you should use a *-virtual kernel on a VM guest, I know. Why only the S3 download is slow still confuses me, though.
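Until the guests are provisioned consistently, a quick check of the kernel flavour on each one can help spot the odd ones out. This is a small Python sketch, assuming Ubuntu-style release strings that end in a flavour suffix; the helper names are hypothetical, and the version strings in the comments are illustrative, not the kernels from our machines.

```python
import platform


def kernel_flavour(release: str) -> str:
    """Return the trailing flavour of a release string, or "" if there is none."""
    # Ubuntu-style release strings end in "-<flavour>",
    # for example "-generic" or "-virtual" (illustrative values).
    head, sep, tail = release.rpartition("-")
    return tail if sep and tail.isalpha() else ""


def is_virtual_kernel(release: str) -> bool:
    """True if the release string carries the "virtual" flavour."""
    return kernel_flavour(release) == "virtual"


# platform.release() reports the running kernel, like `uname -r`.
current = platform.release()
print(current, "-> virtual kernel:", is_virtual_kernel(current))
```

Running this on every guest and comparing the results shows which machines are not on a *-virtual kernel.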