I'm using curl_multi functions to request multiple URLs and process them as they complete. As one connection completes all I really have is the cURL handle (and associated data) from curl_multi_info_read()
.
The URLs come from a job queue, and once processed I need to remove the job from the queue. I don't want to rely on the URL to identify the job (there shouldn't be duplicate URLs, but what if there is).
The solution I've worked up so far is to use the cURL handle as an array key pointing to the jobid. Form what I can tell, when treated as a string the handle is something like:
"Resource id #1"
That seams reasonably unique to me. The basic code is:
$ch = curl_init($job->getUrl());
$handles[$ch] = $job;
//then later
$done = curl_multi_info_read($master);
$handles[$done['handle']]->delete();
curl_multi_remove_handle($master, $done['handle']);
Is the cURL handle safe to use in this way?
Or is there a better way to map the cURL handles to the job that created them?
Store private data inside the cURL easy handle, e.g. some job ID:
curl_setopt($ch, CURLOPT_PRIVATE, $job->getId());
// then later
$id = curl_getinfo($done['handle'], CURLINFO_PRIVATE);
This "private data" feature is not (yet) documented in the PHP manual. It was introduced already in PHP 5.2.4. It allows you to store and retrieve a string of your choice inside the cURL handle. Use it for a key that uniquely identifies the job.
Edit: Feature is now documented in the PHP manual (search for CURLOPT_PRIVATE
within the page).
It will probably work thanks to some implicit type cast, but it doesn't feel right to me at all. I think it's begging for trouble somewhere down the line, with future versions that treat resources differently, different platforms...
I personally wouldn't do it, but use numeric indexes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With