
Is there a way to tell curl to not use cache

Tags: php, curl

I am trying to find out the size of the file at a URL:

$url1 = 'www.google.com';
$curl1 = curl_init();
curl_setopt($curl1, CURLOPT_URL, $url1);
curl_setopt($curl1, CURLOPT_RETURNTRANSFER, TRUE); // return the body instead of printing it
curl_exec($curl1);
$file_size = curl_getinfo($curl1, CURLINFO_SIZE_DOWNLOAD); // number of bytes downloaded
$file_size_kb = $file_size / 1000;
echo $file_size_kb;

The output is 43331. I think it's too low because I have Google cached. Can this be true? I also tested Google on another site that calculates the size of a URL, and the result was twice as big.

Edgar asked Mar 19 '13 07:03



5 Answers

You can use CURLOPT_FRESH_CONNECT for this. From the curl_setopt documentation:

CURLOPT_FRESH_CONNECT: TRUE to force the use of a new connection instead of a cached one.

curl_setopt($curl1, CURLOPT_FRESH_CONNECT, TRUE);

According to RFC 7234, Hypertext Transfer Protocol (HTTP/1.1): Caching, section 5.2 (Cache-Control):

The "Cache-Control" header field is used to specify directives for caches along the request/response chain.

Section 5.2.1 (Request Cache-Control Directives) defines several directives to control the use of caches for a response. One of these is:

5.2.1.4. no-cache

The "no-cache" request directive indicates that a cache MUST NOT use a stored response to satisfy the request without successful validation on the origin server.

So setting an appropriate header with

curl_setopt($curl1, CURLOPT_HTTPHEADER, array("Cache-Control: no-cache"));

should ensure that a valid and up-to-date response is returned. As I understand it, this may still result in a cached response if validation on the server allows it.


However, must-revalidate (section 5.2.2.1) is a response Cache-Control directive, sent by a server together with its response to a request:

[...] The must-revalidate directive ought to be used by servers if and only if failure to validate a request on the representation could result in incorrect operation, such as a silently unexecuted financial transaction.
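
To tie the request directive back to the original size measurement, here is a minimal sketch that sends Cache-Control: no-cache and reports the number of bytes downloaded. The function names no_cache_options and fetch_size_bytes are hypothetical and not part of the question; only the cURL calls themselves come from the PHP cURL extension.

```php
<?php
// Hypothetical helper: cURL options for a request that asks caches along
// the chain to revalidate with the origin server before answering.
function no_cache_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_HTTPHEADER     => ['Cache-Control: no-cache'],
    ];
}

// Hypothetical helper: returns the number of bytes downloaded,
// or null if the handle could not be created or the transfer failed.
function fetch_size_bytes(string $url): ?float
{
    $ch = curl_init();
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, no_cache_options($url));
    if (curl_exec($ch) === false) {
        return null;
    }
    return (float) curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD);
}
```

Calling fetch_size_bytes('http://www.google.com/') and dividing the result by 1000 would give the size in kilobytes, as in the question. The header only affects caches between the client and the origin server; whether it changes the measured size depends on the servers involved.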

Olaf Dietsche answered Sep 25 '22 20:09


curl_setopt($curl1, CURLOPT_FRESH_CONNECT, 1); // force a new connection instead of reusing a cached one

CURLOPT_FRESH_CONNECT: TRUE to force the use of a new connection instead of a cached one.

You can also set a header:

$headers = array(
    "Cache-Control: no-cache",
);
curl_setopt($curl1, CURLOPT_HTTPHEADER, $headers);

This link may be helpful: http://www.php.net/manual/en/function.curl-setopt.php#96903

Subodh Ghulaxe answered Sep 25 '22 20:09


The best way to avoid caching is to append the current time, or any other random element, to the URL, like this:
$url .= '?ts=' . time();

So, for example, instead of
http://example.com/content.php
you would request
http://example.com/content.php?ts=1212434353
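
Appending '?ts=' directly produces a malformed URL when the address already has a query string or a fragment. A minimal sketch of a helper that handles both cases follows; the name add_cache_buster and its optional $timestamp parameter are hypothetical and exist only to make the example deterministic.

```php
<?php
// Hypothetical helper: appends a "ts" parameter to a URL so that each
// request uses a distinct address and caches cannot match an older entry.
function add_cache_buster(string $url, ?int $timestamp = null): string
{
    $timestamp = $timestamp ?? time();

    // Keep any "#fragment" at the very end of the URL.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }

    // Use "&" when a query string already exists, "?" otherwise.
    $separator = (strpos($url, '?') === false) ? '?' : '&';

    return $url . $separator . 'ts=' . $timestamp . $fragment;
}
```

For instance, add_cache_buster('http://example.com/content.php', 1212434353) yields the second URL shown above, while a URL that already ends in ?id=5 gets &ts=... appended instead.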

Mike Walkowiak answered Sep 25 '22 20:09


You can tell cURL to use fresh data by setting CURLOPT_FRESH_CONNECT to TRUE.

You can read more about the cURL functions here:

http://php.net/manual/en/function.curl-setopt.php

Dead Man answered Sep 22 '22 20:09


Use CURLOPT_FRESH_CONNECT: set it to TRUE to force the use of a new connection instead of a cached one.

Example:

<?php
    // Returns true if the URL answers with an HTTP status code below 400.
    function check_url($url) {
        $c = curl_init();
        curl_setopt($c, CURLOPT_URL, $url);
        curl_setopt($c, CURLOPT_HEADER, 1); // include the response headers in the output
        curl_setopt($c, CURLOPT_NOBODY, 1); // send a HEAD request: headers only, no body
        curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // return the response as a string from curl_exec() rather than echoing it
        curl_setopt($c, CURLOPT_FRESH_CONNECT, 1); // force a new connection instead of reusing a cached one
        if (curl_exec($c) === false) { return false; } // the transfer itself failed

        $httpcode = curl_getinfo($c, CURLINFO_HTTP_CODE);
        return ($httpcode < 400);
    }
?>

For more details about cURL, see http://php.net/manual/en/function.curl-setopt.php

Hope this helps.

Tony Stark answered Sep 24 '22 20:09