
How to parse a LinkedIn page

Could somebody help me? How should I fetch and parse this page with curl?

https://www.linkedin.com/in/williamhgates/

This is my code. Just run it and look at the result:

$url = "https://www.linkedin.com/in/williamhgates/";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: www.linkedin.com/in/williamhgates/'));
$output = curl_exec($ch);
curl_close($ch);
dd($output);
die;

I just want to save the whole page source to a file, but it shows:

Could not process this client request HTTP method request for URL
Hamed asked Feb 15 '17 09:02



1 Answer

LinkedIn blocks all crawlers except a few that send it traffic (Googlebot, Bingbot, etc.). Other user agents are explicitly blocked, so a plain curl request to a LinkedIn page will not work. Even if you do manage to crawl LinkedIn, it could create legal issues for you. It is better to leave LinkedIn alone.
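
Separately from the blocking, the Host header in the question is malformed: it should carry only the host name (www.linkedin.com), not the path, and curl already sets it from the URL. Below is a minimal sketch of a request without that header. It also reports the HTTP status code, so you can see how the server responded instead of only dumping the body. The helper names hostHeaderFor and fetchStatusAndBody are made up for illustration, and nothing here is expected to get past LinkedIn's blocking.

```php
<?php
// Hypothetical helper: the only value a Host header should carry is the
// host part of the URL, never the path.
function hostHeaderFor(string $url): string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        throw new InvalidArgumentException("URL has no host: $url");
    }
    return $host;
}

// Hypothetical helper: fetch a URL and return the status code alongside
// the body. No Host header is set by hand; curl derives it from the URL.
// Certificate verification is left at its default (enabled).
function fetchStatusAndBody(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $body = curl_exec($ch);                         // false on transport failure
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no response arrived
    $error = curl_error($ch);                       // empty string if no error
    return ['status' => $status, 'body' => $body, 'error' => $error];
}
```

For the URL in the question, hostHeaderFor returns "www.linkedin.com". Calling fetchStatusAndBody($url) gives an array whose 'status' entry is worth checking before writing 'body' to a file: a non-200 status means the server refused or redirected the request rather than returning the page source.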

Shiva answered Sep 21 '22 10:09
