
cURL - getting the HTML response body

Tags: php, curl

I'm sure this is fairly simple. I'm using the function below to retrieve a site's raw HTML in order to parse it. During my testing, I decided to run my code on stackoverflow.com.

Instead of showing the HTML response as text, Chrome renders the actual site rather than the HTML being assigned to its variable. What am I missing?

function get_site_html($site_url) 
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIESESSION, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
    curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($ch, CURLOPT_URL, $site_url);

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $response = curl_exec($ch);

    global $base_url; 
    $base_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    $http_response_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    curl_close ($ch);
    return $response;
}

The site's raw HTML should be assigned to $response and then returned.

elad.chen asked Feb 16 '23 05:02



1 Answer

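Your code works: $response does contain the raw HTML. If you echo it as-is, the browser receives that HTML and renders it as a page, which is why you see the site itself. Try echo htmlentities($response); instead, and you'll see the raw HTML of the site you're curling.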
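To see the effect of escaping without making a network request, here is a minimal sketch. The sample string is a hypothetical stand-in for whatever get_site_html() returns, and the `<pre>` wrapper and the explicit ENT_QUOTES / UTF-8 arguments are assumptions added for readability, not part of the original suggestion.

```php
<?php
// Hypothetical sample standing in for the string returned by get_site_html().
$response = '<html><body><h1>Hello & welcome</h1></body></html>';

// Echoing $response directly would send real markup to the browser,
// which would then render it as a page.
// htmlentities() converts characters such as < > & into HTML entities,
// so the browser displays the markup as plain text instead.
$escaped = htmlentities($response, ENT_QUOTES, 'UTF-8');

// <pre> only preserves whitespace and line breaks in the displayed source.
echo '<pre>' . $escaped . '</pre>';
```

The escaped copy is only for display. Any parsing should still be done on the unescaped $response.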

labue answered Feb 24 '23 14:02
