I am trying to find out the file size of a URL:
$url1 = 'www.google.com';
$curl1 = curl_init();
curl_setopt($curl1, CURLOPT_URL, $url1);
curl_setopt($curl1, CURLOPT_RETURNTRANSFER, TRUE);
curl_exec($curl1);
$file_size = curl_getinfo($curl1, CURLINFO_SIZE_DOWNLOAD );
$file_size_kb = $file_size / 1000;
echo $file_size_kb;
The output is 43331. I think it's too low because I have Google cached. Can this be true? I also tested Google on another site that calculates the size of a URL, and the result was twice as big.
When we use the cURL command, we must note that cURL is only an HTTP client, and it doesn't cache any request on the client-side. Therefore, any caching while using this command happens on the server-side. To bypass the server-side cache, we can try some tweaks on the HTTP request we're sending.
libcurl uses its DNS cache by default as long as you re-use the handle. You can change the time it'll hold entries in the cache - it is only meant to aid "spikes" or rapid requests to the same host names as it doesn't get the "true" TTL values.
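As a minimal sketch of changing that DNS cache lifetime from PHP, assuming the curl extension is loaded: the option below is PHP's CURLOPT_DNS_CACHE_TIMEOUT, and the value of 10 seconds is only an illustrative choice, not a recommendation.

```php
<?php
// Sketch: shorten libcurl's DNS cache lifetime for a single handle.
$ch = curl_init();

// Illustrative value (an assumption): keep resolved host names for
// 10 seconds instead of libcurl's default lifetime.
$ok = curl_setopt($ch, CURLOPT_DNS_CACHE_TIMEOUT, 10);

// curl_setopt() returns true when the option was accepted.
var_dump($ok);
```

This only affects name resolution for the handle; it has no bearing on HTTP-level caching of the response.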
To check whether the Curl package is installed on your system, open up your console, type curl , and press enter. If you have curl installed, the system will print curl: try 'curl --help' or 'curl --manual' for more information . Otherwise, you will see something like curl command not found .
You can use CURLOPT_FRESH_CONNECT for this. From the curl_setopt documentation:
CURLOPT_FRESH_CONNECT: TRUE to force the use of a new connection instead of a cached one.
curl_setopt($curl1, CURLOPT_FRESH_CONNECT, TRUE);
According to RFC 7234 - Hypertext Transfer Protocol (HTTP/1.1): Caching and 5.2. Cache-Control
The "Cache-Control" header field is used to specify directives for caches along the request/response chain.
5.2.1. Request Cache-Control Directives defines several directives to control the use of caches for a response. One of these is
5.2.1.4. no-cache
The "no-cache" request directive indicates that a cache MUST NOT use a stored response to satisfy the request without successful validation on the origin server.
So setting an appropriate header with
curl_setopt($curl1, CURLOPT_HTTPHEADER, array("Cache-Control: no-cache"));
should ensure that a valid and up-to-date response is returned. I understand that this may still result in a cached response if validation on the origin server allows it.
However, 5.2.2.1. must-revalidate is a Response Cache-Control Directive given by a server together with the response to a request
[...] The must-revalidate directive ought to be used by servers if and only if failure to validate a request on the representation could result in incorrect operation, such as a silently unexecuted financial transaction.
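The request-side part of this answer could be sketched as follows. The helper name no_cache_curl_options is hypothetical (it is not part of PHP or of the original answer); it only gathers the options into an array so they can be applied with curl_setopt_array. The example URL is a placeholder.

```php
<?php
// Hypothetical helper: collect the request options for a "no-cache" fetch.
function no_cache_curl_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of echoing it
        // Ask caches along the chain to revalidate before answering.
        CURLOPT_HTTPHEADER     => ['Cache-Control: no-cache'],
    ];
}

$curl1 = curl_init();
curl_setopt_array($curl1, no_cache_curl_options('http://www.example.com/'));

// Uncomment to perform the request and read the downloaded body size in bytes:
// curl_exec($curl1);
// $file_size = curl_getinfo($curl1, CURLINFO_SIZE_DOWNLOAD);
```

Whether an intermediate cache honours the directive is outside the client's control, so this is a request, not a guarantee.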
curl_setopt($curl1, CURLOPT_FRESH_CONNECT, 1); // force a new connection instead of reusing a cached one
CURLOPT_FRESH_CONNECT TRUE to force use of a new connection instead of a cached one.
You can set a header:
$headers = array(
"Cache-Control: no-cache",
);
curl_setopt($curl1, CURLOPT_HTTPHEADER, $headers);
This link may be helpful: http://www.php.net/manual/en/function.curl-setopt.php#96903
The best way to avoid caching is to append the time or some other random element to the URL, like this: $url .= '?ts=' . time(); (use '&' instead of '?' if the URL already has a query string).
So, for example, instead of http://example.com/content.php
you would have http://example.com/content.php?ts=1212434353
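A small sketch of that idea which also copes with URLs that already carry a query string or a fragment. The function name add_cache_buster and its optional $timestamp parameter are hypothetical, introduced here only for illustration.

```php
<?php
// Hypothetical helper: append a cache-busting "ts" parameter to a URL.
function add_cache_buster(string $url, ?int $timestamp = null): string
{
    $timestamp = $timestamp ?? time();

    // Split off any fragment so the parameter lands before it.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }

    // Use '?' for the first parameter, '&' when a query string already exists.
    $separator = (strpos($url, '?') === false) ? '?' : '&';

    return $url . $separator . 'ts=' . $timestamp . $fragment;
}

echo add_cache_buster('http://example.com/content.php', 1212434353), "\n";
// prints http://example.com/content.php?ts=1212434353
echo add_cache_buster('http://example.com/content.php?id=7', 1212434353), "\n";
// prints http://example.com/content.php?id=7&ts=1212434353
```

Passing the timestamp explicitly keeps the function easy to test; in normal use it can be omitted so that time() is used.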
You can tell cURL to use a fresh connection by setting CURLOPT_FRESH_CONNECT to TRUE.
You can read more about the curl_setopt function here: http://php.net/manual/en/function.curl-setopt.php
Use CURLOPT_FRESH_CONNECT - TRUE to force the use of a new connection instead of a cached one.
Example:
<?php
function check_url($url) {
    $c = curl_init();
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_HEADER, 1); // include the response headers in the output
    curl_setopt($c, CURLOPT_NOBODY, 1); // request *only* the headers, not the body
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // return the response as a string from curl_exec() rather than echoing it
    curl_setopt($c, CURLOPT_FRESH_CONNECT, 1); // force a new connection instead of reusing a cached one
    if (curl_exec($c) === false) { return false; } // the transfer itself failed
    $httpcode = curl_getinfo($c, CURLINFO_HTTP_CODE);
    return ($httpcode < 400); // treat any status below 400 as reachable
}
?>
For more details about cURL, see http://php.net/manual/en/function.curl-setopt.php
I hope this helps.