
cookiejar in PHP Curl

Tags: php, curl, cookies

When we need to store and read cookies with PHP cURL for web scraping, many resources out there recommend handling the cookies with a file, using these options:

curl_setopt($ch, CURLOPT_COOKIEJAR, $CookieJarFilename);

curl_setopt($ch, CURLOPT_COOKIEFILE, $CookieJarFilename);

The bottom line is that they use a single file (usually a .txt file) as the cookie jar.

But in a real scenario, our website is not accessed by just one computer. Most likely many computers access it at the same time, along with bots such as Googlebot, Yahoo Slurp, etc.

So, with a single .txt file, isn't it obvious that every request will overwrite the same cookie jar and make a real mess of the cookies?

Or am I mistaken here?

What's the 'right' method for handling cookies?

asked Jan 03 '14 05:01 by bagz_man


1 Answer

If multiple people access your page and you need to make cURL requests with unique cookies for each of them, there are several ways to handle this.

1) If your user is authenticated and has a $_SESSION started on your end, you can use session_id() as the cookie file name.

2) If your user doesn't require a session (Googlebot, for example), you can build the cookie file name from a timestamp plus an extra random component. For example:

$cookieName = time()."_".substr(md5(microtime()),0,5).".txt";
// $cookieName would then hold something like:
// `1388788940_91ab4.txt`

But in this case, you cannot reuse the cookie file if the user comes back 5 minutes later (unless you store the cookie file name in a cookie on the user's side).
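Both options boil down to giving each visitor a private jar file. Here is a minimal sketch of that idea, not a definitive implementation. The helper names cookieJarPath() and fetchWithJar() are hypothetical, and $visitorId is assumed to be either session_id() or a token you stored in the visitor's own cookie. Hashing the ID is an extra precaution added here so that the value can never contribute path characters to the file name.

```php
<?php
// Hypothetical helper: builds a per-visitor cookie jar path.
// $visitorId is assumed to be session_id() or a token kept in the visitor's cookie.
function cookieJarPath(string $dir, string $visitorId): string
{
    // Hash the ID so the file name only ever contains hex characters.
    return rtrim($dir, '/') . '/cookie_' . hash('sha256', $visitorId) . '.txt';
}

// Hypothetical helper: fetches $url using a jar that belongs to one visitor.
// Returns the response body, or false if the request failed.
function fetchWithJar(string $url, string $jarPath)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $jarPath); // cookies are read from this file
    curl_setopt($ch, CURLOPT_COOKIEJAR, $jarPath);  // cookies are saved to this file
    $body = curl_exec($ch);
    // libcurl writes the jar when the handle is cleaned up, so release it here.
    unset($ch);
    return $body;
}
```

With a session already started, this could be called as fetchWithJar($url, cookieJarPath(sys_get_temp_dir(), session_id())), so that two visitors never share a jar.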

In either case, make sure you clean up these files periodically. Otherwise you'll end up with tons of cookie files in your directory.
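The periodic cleanup could look roughly like the sketch below, run from a cron job for example. It assumes the jars live in a dedicated directory and end in .txt. The function name purgeOldCookieJars() and the age threshold are hypothetical choices, not part of the original answer.

```php
<?php
// Hypothetical helper: deletes .txt jar files in $dir that have not been
// modified within the last $maxAgeSeconds. Returns how many files were removed.
function purgeOldCookieJars(string $dir, int $maxAgeSeconds): int
{
    $removed = 0;
    $cutoff = time() - $maxAgeSeconds;
    // glob() can return false on error, so fall back to an empty list.
    foreach (glob(rtrim($dir, '/') . '/*.txt') ?: [] as $file) {
        if (is_file($file) && filemtime($file) < $cutoff && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

For instance, purgeOldCookieJars('/path/to/jars', 3600) would remove jars untouched for an hour. Pick a threshold longer than the time you expect a visitor to need the same cookies.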

answered Oct 23 '22 23:10 by Sabuj Hassan