
Is it possible to parse JSON with Goutte?

I'm crawling websites and have had no problem parsing HTML with Goutte so far. Now I need to retrieve JSON from a website, and because of the cookie management I don't want to do this with file_get_contents(), which doesn't work here.

I could do this with plain cURL, but in this case I want to use only Goutte and no other library.

So is there a way to fetch and parse plain text through Goutte, or do I really have to fall back on the good old methods?

/* Sample Code */
$client = new Client();
$crawler = $client->request('foo');
$crawler = $crawler->filter('bar'); // of course not working

Thank you.

mithataydogmus asked Nov 30 '22 03:11


1 Answer

After digging deep into the Goutte library I found a way, and I wanted to share it, because Goutte is a really powerful library but its documentation is hard to follow.

Parsing JSON via (Goutte > Guzzle)

Just request the page you need and store the JSON in an array.

$client = new Client(); // Goutte Client
$request = $client->getClient()->createRequest('GET', 'http://***.json');   
/* getClient() returns the underlying Guzzle client */

$response = $request->send(); // Send created request to server
$data = $response->json(); // Returns PHP Array

Parsing JSON with Cookies via (Goutte + Guzzle) - For authentication

First send a request to one of the site's pages (the main page works well) to get the cookies, then use those cookies for authentication.

$client = new Client(); // Goutte Client
$crawler = $client->request("GET", "http://foo.bar");
/* Send the request directly and get the whole response. It includes the cookies
from the server, which are automatically stored in the Goutte Client object */

$request = $client->getClient()->createRequest('GET', 'http://foo.bar/baz.json');
/* getClient() returns the underlying Guzzle client */

$cookies = $client->getRequest()->getCookies();
foreach ($cookies as $key => $value) {
   $request->addCookie($key, $value);
}

/* Get the cookies from the Goutte Client and add them to the Guzzle request */

$response = $request->send(); // Send created request to server
$data = $response->json(); // Returns PHP Array
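
A possible alternative, as a sketch only: the Goutte client is built on Symfony BrowserKit, which keeps its own cookie jar and sends those cookies on later requests, so it may be enough to request the JSON URL through the same client and decode the raw body yourself. The decodeJsonBody() helper below is hypothetical (it is not part of Goutte or Guzzle), and the commented usage assumes your Goutte version exposes BrowserKit's getResponse()->getContent(); check that against the version you have installed.

```php
<?php
// Hypothetical helper, not part of Goutte: decode a raw response body as JSON.
// Returns the decoded value (arrays for JSON objects and lists), or throws
// if the body is not valid JSON.
function decodeJsonBody($body)
{
    $data = json_decode($body, true);
    // json_decode() also returns null for the literal "null", so check the error code too.
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    return $data;
}

// Assumed usage with a Goutte client (URLs are the placeholders from above):
// $client = new Client();                               // Goutte Client
// $client->request('GET', 'http://foo.bar');            // first request collects the cookies
// $client->request('GET', 'http://foo.bar/baz.json');   // same client, same cookie jar
// $data = decodeJsonBody($client->getResponse()->getContent());
```

With this approach no cookies have to be copied by hand, but the JSON request still goes through the crawler machinery, so the Guzzle route above may be the better fit if you need Guzzle-specific request options.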

I hope it helps, because I spent almost 3 days understanding Goutte and its components.

mithataydogmus answered Dec 29 '22 07:12