 

Heroku Memory Error with PHP and reading large file from S3

I'm using the AWS SDK for PHP 2.3.2 to try to pull down a large file (~4 GB) from S3 using their stream wrapper, which should let me use fopen()/fwrite() to write the file to disk without buffering it into memory.

Here is the reference:

http://docs.aws.amazon.com/aws-sdk-php-2/guide/latest/service-s3.html#downloading-data

Here is my code:

    // Requires the AWS SDK for PHP (v2): use Aws\S3\S3Client;
    public function download()
    {
        $client = S3Client::factory(array(
            'key'    => getenv('S3_KEY'),
            'secret' => getenv('S3_SECRET'),
        ));

        $bucket = getenv('S3_BUCKET');

        // Register the s3:// stream wrapper so fopen() can read from S3
        $client->registerStreamWrapper();

        try {
            error_log("calling download");
            // Open the S3 object in read-only mode
            if ($stream = fopen('s3://' . $bucket . '/tmp/' . $this->getOwner()->filename, 'r')) {
                // Open the local destination file for writing
                if (($fp = @fopen($this->getOwner()->path . '/' . $this->getOwner()->filename, 'w')) !== false) {
                    // Copy the stream to disk in 1024-byte chunks
                    while (!feof($stream)) {
                        fwrite($fp, fread($stream, 1024));
                    }
                    fclose($fp);
                }
                // Be sure to close the stream resource when you're done with it
                fclose($stream);
            }
        } catch (Exception $e) {
            error_log($e->getMessage());
        }
    }

The file downloads but I continually get error messages from Heroku:

2013-08-22T19:57:59.537740+00:00 heroku[run.9336]: Process running mem=515M(100.6%)
2013-08-22T19:57:59.537972+00:00 heroku[run.9336]: Error R14 (Memory quota exceeded)

This leads me to believe the download is still being buffered into memory somehow. I tried to profile it with https://github.com/arnaud-lb/php-memory-profiler, but that just gave me a segfault.

I also tried to download the file using cURL with the CURLOPT_FILE option to write directly to disk, and I'm still running out of memory. The odd thing is that, according to top, my PHP process is only using 223 MB of memory, not even half of the allowed 512 MB.
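For reference, this is roughly what that cURL attempt looks like (a minimal sketch; the URL and destination path below are placeholders):

    // Sketch of the CURLOPT_FILE approach: curl writes the response body
    // straight into the file handle instead of returning it to PHP.
    $url  = 'https://example-bucket.s3.amazonaws.com/tmp/bigfile.bin'; // placeholder
    $dest = '/tmp/bigfile.bin';                                        // placeholder

    $fp = fopen($dest, 'w');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write directly to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);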

Anybody have any ideas? I'm running this from the PHP 5.4.17 CLI to test.

asked Aug 22 '13 by bonez


1 Answer

Have you already tried a 2X dyno? Those have 1 GB of memory.

You can also try downloading the file by executing a curl command from PHP. It's not the cleanest way, but it will be much faster, more reliable, and memory friendly.

exec("curl -O http://test.s3.amazonaws.com/file.zip", $output);

This example is for a public URL. If you don't want to make your S3 files public, you can always create a signed URL and use that in combination with the curl command.
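For instance, with the v2 SDK a pre-signed URL can be generated and handed to curl along these lines (a minimal sketch; the bucket, key, and output filename are placeholders):

    // Requires the AWS SDK for PHP (v2): use Aws\S3\S3Client;
    $client = S3Client::factory(array(
        'key'    => getenv('S3_KEY'),
        'secret' => getenv('S3_SECRET'),
    ));

    // getObjectUrl() returns a signed URL when an expiration is supplied.
    $signedUrl = $client->getObjectUrl(getenv('S3_BUCKET'), 'tmp/file.zip', '+15 minutes');

    // Let the curl binary stream the response straight to disk.
    exec('curl -o file.zip ' . escapeshellarg($signedUrl), $output);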

answered by Wim Mostmans