I'm using Guzzle (http://guzzlephp.org) to GET a large number of URLs (~300k). The URLs are retrieved from an Elasticsearch instance, and I would like to keep adding URLs to a Pool so the Pool stays rather small, instead of adding them all at once.
Is this possible? I looked at Pool.php but did not find a way to do this. Is there one?
Use a while loop together with a generator (yield). The generator only produces a request when the pool asks for one, so the queue stays small; share the URI arrays by reference so the fulfilled callback can keep feeding the generator:
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();

$uris = ['http://base_url'];
$visited_uris = []; // maybe a database instead of an array

// The generator yields a new request only when the pool asks for one.
// $uris is captured by reference so URIs pushed by the fulfilled
// callback below are picked up by the while loop.
$requests = function () use (&$uris) {
    while (count($uris) > 0) {
        yield new Request('GET', array_pop($uris));
    }
};

$pool = new Pool($client, $requests(), [
    'concurrency' => 5,
    'fulfilled' => function ($response, $index) use (&$uris, &$visited_uris) {
        $new_uri = get_new_uri(); // implement: extract the next URI from $response
        if (!in_array($new_uri, $visited_uris)) {
            array_push($uris, $new_uri); // only queue URIs we have not visited yet
        }
        array_push($visited_uris, $new_uri);
    },
]);

$promise = $pool->promise();
$promise->wait();
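The key detail is the by-reference use (&$uris, ...) captures: the generator and the fulfilled callback share the same queue, so URIs pushed after a response arrives are still seen by the while loop. Here is a minimal, Guzzle-free sketch of that pattern; the names ($queue, $drain) are illustrative only:

$queue = [1, 2, 3];

// The generator drains a queue captured by reference; items the
// consumer pushes mid-iteration are still yielded later.
$drain = function () use (&$queue) {
    while (count($queue) > 0) {
        yield array_pop($queue);
    }
};

foreach ($drain() as $item) {
    echo $item, "\n";
    if ($item === 2) {
        $queue[] = 99; // added mid-iteration, still picked up
    }
}
// Prints: 3 2 99 1

One caveat with the Pool version: if $uris happens to be empty at the moment the pool asks for the next request, the while loop ends and the pool winds down, even if responses still in flight would have added more URIs. If your URI producer can lag behind the pool, you may need a different feeding strategy, such as running the pool in batches.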