Download multiple images from remote server with PHP (a LOT of images)

I am trying to download a lot of files from an external server (approx. 3700 images). The images range from 30 KB to 200 KB each.

When I use the copy() function on a single image, it works. When I use it in a loop, all I get are 30-byte files (empty image files).

I tried using copy, cURL, wget, and file_get_contents. Every time, I either get a lot of empty files or nothing at all.

Here is the code I tried:

wget:

// The URL is quoted so the shell does not treat "&" as a background operator
exec('wget "http://mediaserver.centris.ca/media.ashx?id=ADD4B9DD110633DDDB2C5A2D10&t=pi&f=I" -O SIA/8605283.jpg');

copy:

if(copy($donnees['PhotoURL'], $filetocheck)) {
  echo 'Photo '.$filetocheck.' updated<br/>';
}

cURL:

$ch = curl_init();
$source = $data['PhotoURL']; // quoted key; a bare PhotoURL is read as an undefined constant
curl_setopt($ch, CURLOPT_URL, $source);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec($ch); // separate variable so the $data row is not overwritten
curl_close($ch);

$destination = $newfile;
$file = fopen($destination, "w+");
fputs($file, $contents);
fclose($file);

Nothing seems to work properly. Unfortunately, I have no choice but to download all these files at once, and I need a way to make it work as soon as possible.

Thanks a lot, Antoine

Antoine Bouchard asked Mar 15 '13 15:03



1 Answer

Getting them one by one might be quite slow. Consider splitting them into packs of 20-50 images and downloading each pack concurrently with curl_multi (parallel transfers in a single PHP process, not real threads). Here's some code to get you started:

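// $targets holds the image URLs of the current pack; $tc is the pack size
$tc = count($targets);
$chs = array();
$cmh = curl_multi_init();
for ($t = 0; $t < $tc; $t++)
{
    $chs[$t] = curl_init();
    curl_setopt($chs[$t], CURLOPT_URL, $targets[$t]);
    curl_setopt($chs[$t], CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($cmh, $chs[$t]);
}

// Run all transfers; curl_multi_select() waits for activity instead of spinning the CPU
$running = null;
do {
    curl_multi_exec($cmh, $running);
    if ($running > 0) {
        curl_multi_select($cmh, 1.0);
    }
} while ($running > 0);

// Write each response to disk and release its handle
for ($t = 0; $t < $tc; $t++)
{
    $path_to_file = 'your logic for file path';
    file_put_contents($path_to_file, curl_multi_getcontent($chs[$t]));
    curl_multi_remove_handle($cmh, $chs[$t]);
    curl_close($chs[$t]);
}
curl_multi_close($cmh);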
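
Since the question reports 30-byte files, it may be worth checking each response before writing it to disk. The following is only a sketch under assumptions, not part of the original answer: the helper names is_plausible_image() and save_if_valid() are hypothetical, and the 1024-byte threshold is an assumed cutoff based on the images being described as 30 KB or larger. curl_getinfo() with CURLINFO_HTTP_CODE reads the status code of a finished handle.

```php
<?php
// Hypothetical helper: decide whether a finished transfer looks like a real image.
// $minBytes is an assumed threshold; the question says images are 30 KB or larger.
function is_plausible_image($httpCode, $body, $minBytes = 1024)
{
    if ($httpCode !== 200) {
        return false; // error page, redirect, or blocked request
    }
    return is_string($body) && strlen($body) >= $minBytes;
}

// Hypothetical helper: save the body of one finished curl handle,
// or log why it was skipped. Returns true when the file was written.
function save_if_valid($ch, $body, $path)
{
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if (!is_plausible_image($code, $body)) {
        error_log("Skipped $path: HTTP $code, " . strlen((string) $body) . " bytes");
        return false;
    }
    return file_put_contents($path, $body) !== false;
}
```

In the final loop, the bare file_put_contents() call could then become save_if_valid($chs[$t], curl_multi_getcontent($chs[$t]), $path_to_file), so that failed downloads show up in the error log instead of as tiny files.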

I recently used the curl_multi approach above to grab a few million images, since fetching them one by one would have taken up to a month.

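The number of images you grab at once should depend on their expected size and your memory limit, because with CURLOPT_RETURNTRANSFER every response in a pack is held in memory until it is written to disk.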
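
To make the pack-size trade-off concrete, here is a minimal sketch of the splitting step. The helpers split_into_packs() and estimated_pack_bytes() are hypothetical names, the example.com URLs are placeholders, and the 200 KB figure is the upper bound quoted in the question; the estimate assumes every response in a pack is buffered at the same time.

```php
<?php
// Hypothetical helper: split the full URL list into packs of at most $packSize.
function split_into_packs(array $urls, $packSize = 50)
{
    // array_chunk() re-indexes each pack from 0, which suits a for-loop over $t.
    return array_chunk($urls, max(1, (int) $packSize));
}

// Hypothetical helper: rough upper bound on the memory held by one pack,
// assuming every response is buffered until it is written to disk.
function estimated_pack_bytes($packSize, $maxImageBytes)
{
    return $packSize * $maxImageBytes;
}

// Placeholder URLs, for illustration only.
$urls = array();
for ($i = 0; $i < 3700; $i++) {
    $urls[] = 'http://example.com/image/' . $i;
}

$packs = split_into_packs($urls, 50);
echo count($packs) . " packs, about "
    . estimated_pack_bytes(50, 200 * 1024) . " bytes buffered per pack\n";
```

Each element of $packs can then serve as $targets for one run of the curl_multi loop, and the estimate can be compared against PHP's memory_limit before settling on a pack size.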

Ranty answered Nov 02 '22 22:11
