I'm using the AWS SDK for PHP 2.3.2 to try to pull down a large file (~4 GB) from S3 using their stream wrapper, which should let me use fopen()/fwrite() to write the file to disk without buffering it into memory.
Here is the reference:
http://docs.aws.amazon.com/aws-sdk-php-2/guide/latest/service-s3.html#downloading-data
Here is my code:
public function download()
{
    // Assumes `use Aws\S3\S3Client;` at the top of the file
    $client = S3Client::factory(array(
        'key'    => getenv('S3_KEY'),
        'secret' => getenv('S3_SECRET')
    ));
    $bucket = getenv('S3_BUCKET');
    $client->registerStreamWrapper();

    try {
        error_log("calling download");
        // Open a read-only stream to the object on S3
        if ($stream = fopen('s3://'.$bucket.'/tmp/'.$this->getOwner()->filename, 'r')) {
            // Open the local destination file for writing
            if (($fp = @fopen($this->getOwner()->path . '/' . $this->getOwner()->filename, 'w')) !== false) {
                // While the stream is still open, copy it 1024 bytes at a time
                while (!feof($stream)) {
                    fwrite($fp, fread($stream, 1024));
                }
                fclose($fp);
            }
            // Be sure to close the stream resource when you're done with it
            fclose($stream);
        }
    } catch (Exception $e) {
        // Log any failure from the S3 read or local write
        error_log($e->getMessage());
    }
}
The file downloads but I continually get error messages from Heroku:
2013-08-22T19:57:59.537740+00:00 heroku[run.9336]: Process running mem=515M(100.6%)
2013-08-22T19:57:59.537972+00:00 heroku[run.9336]: Error R14 (Memory quota exceeded)
This leads me to believe the download is still being buffered into memory somehow. I tried to profile it with https://github.com/arnaud-lb/php-memory-profiler, but got a segfault.
I also tried downloading the file using cURL with the CURLOPT_FILE option to write directly to disk, and I'm still running out of memory. The odd thing is that, according to top, my PHP process is using 223 MB of memory, so not even half of the allowed 512 MB.
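A minimal sketch of that CURLOPT_FILE approach (the URL and local path here are placeholders, not the real values):

$url  = 'https://test.s3.amazonaws.com/file.zip'; // placeholder public/signed URL
$dest = '/tmp/file.zip';                          // placeholder local path

$fp = fopen($dest, 'w');
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FILE, $fp);            // write the response body straight to $fp
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
curl_exec($ch);
curl_close($ch);
fclose($fp);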
Anybody have any ideas? I'm running this from the PHP 5.4.17 CLI to test.
Did you already try a 2X dyno? Those have 1 GB of memory.
You can also try downloading the file by executing a curl command from PHP. It's not the cleanest way, but it will be much faster, more reliable, and memory friendly.
exec("curl -O http://test.s3.amazonaws.com/file.zip", $output);
This example is for a public URL. If you don't want to make your S3 files public, you can always create a signed URL and use that in combination with the curl command, along the lines of the sketch below.
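Since the question already builds an S3Client, a signed URL can be generated with the SDK's getObjectUrl() and handed to curl. A rough sketch, with the object key, expiry, and local path as placeholders:

$client = S3Client::factory(array(
    'key'    => getenv('S3_KEY'),
    'secret' => getenv('S3_SECRET')
));

// Signed URL that stays valid for 10 minutes (placeholder key and expiry)
$signedUrl = $client->getObjectUrl(getenv('S3_BUCKET'), 'tmp/file.zip', '+10 minutes');

// -o writes the response straight to disk instead of buffering it in PHP
exec('curl -s -o /tmp/file.zip ' . escapeshellarg($signedUrl), $output, $status);

Because curl runs in its own process, the downloaded bytes never pass through PHP's memory.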