Google Chrome Cache PDF files

I have built a small internal PHP/MySQL app to host and sort documents. Everything works perfectly until a file is updated, in this case a .PDF file. When a user updates the .PDF, the new file is on the server as expected and the older version is deleted. Users get the new version provided they have never opened the old one.

Now the problem: if a user opened the older version of the .PDF at some point in the past, they do not get the newer version when they click the link to view the document, even though only the new version is physically on the server.

I'm guessing that Google Chrome is caching the older version of the PDF somewhere. How can I get around this? Given the number of users and how many times a day some of the documents are updated, asking users to clear their cache manually is not practical.

asked Oct 07 '13 12:10 by twigg


2 Answers

You really have three choices here:

  1. Change the filename every time it gets updated
  2. Always generate the HREF with a GET parameter
  3. Send header information telling the browser to always download a fresh copy from the server

Option 1 - Works in 100% of cases. Might be tricky to maintain.

echo '<a href="files/pdfs/'.$row['FILENAME_FROM_DATABASE'].'">PDF</a>';

// Could produce something like:
// <a href="files/pdfs/filename_v5.pdf">PDF</a>
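One possible way to produce such a name is sketched below. It assumes a version number is stored alongside each document; the helper name `versionedPdfName` and the `_v<N>` suffix are illustrative choices, not part of the original app.

```php
<?php
// Hypothetical helper: turns "filename.pdf" plus version 5 into "filename_v5.pdf".
function versionedPdfName(string $baseName, int $version): string
{
    $info = pathinfo($baseName);
    // Keep the extension if there is one; pathinfo() omits the key otherwise.
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    return $info['filename'] . '_v' . $version . $ext;
}

echo versionedPdfName('filename.pdf', 5), PHP_EOL; // prints "filename_v5.pdf"
```

Because the URL changes with every upload, the browser has no cached entry for the new name and has to fetch it.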

Option 2 - Works in 99% of cases

echo '<a href="files/pdfs/filename.pdf?q='.microtime(true).'">PDF</a>';
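A variation on the same idea, as a rough sketch: use the file's modification time instead of the current time, so the URL only changes when the file does and the browser can still reuse its cached copy in between. The helper name `pdfLinkWithVersion` and the `v` parameter are assumptions made for illustration.

```php
<?php
// Hypothetical helper: appends the file's last-modified time as a query parameter.
// $fsPath is the file's location on disk; $url is the public link to it.
function pdfLinkWithVersion(string $fsPath, string $url): string
{
    clearstatcache(true, $fsPath);  // avoid a stale mtime from PHP's stat cache
    $mtime = @filemtime($fsPath);   // false if the file is missing
    if ($mtime === false) {
        return $url;                // fall back to the plain URL
    }
    return $url . '?v=' . $mtime;
}

$href = pdfLinkWithVersion(__DIR__ . '/files/pdfs/filename.pdf', 'files/pdfs/filename.pdf');
echo '<a href="' . htmlspecialchars($href) . '">PDF</a>', PHP_EOL;
```

The trade-off is one extra stat call per link in exchange for keeping normal caching for unchanged files.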

Option 3 - Works in 99% of cases

header('Pragma: public');
header('Cache-Control: max-age=1'); // <-- important (the directive is "max-age", with a hyphen)
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 1) . ' GMT');
header('Content-Type: application/pdf');
readfile(PATH_TO_PDF_FILE); // stream the file to the client
exit;
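An alternative to a one-second lifetime, sketched here under assumptions, is to send `Cache-Control: no-cache`, which lets the browser keep a copy but requires it to check with the server before reusing it. The helper name `pdfRevalidateHeaders` is hypothetical; it only builds the header lines so they can be inspected before being sent.

```php
<?php
// Hypothetical helper: header lines that make the browser revalidate the PDF
// with the server on every request.
function pdfRevalidateHeaders(string $fsPath): array
{
    clearstatcache(true, $fsPath); // make sure size and mtime are current
    return [
        'Content-Type: application/pdf',
        'Content-Length: ' . filesize($fsPath),
        'Cache-Control: no-cache, must-revalidate',
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($fsPath)) . ' GMT',
    ];
}

// Usage in the script that serves the file (PATH_TO_PDF_FILE is the same
// placeholder constant as in the snippet above):
// foreach (pdfRevalidateHeaders(PATH_TO_PDF_FILE) as $line) { header($line); }
// readfile(PATH_TO_PDF_FILE);
// exit;
```

This sketch does not answer conditional requests with 304, so every view transfers the full file; that is simple and always fresh, at the cost of bandwidth.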
answered Oct 19 '22 20:10 by MonkeyZeus


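In HTML5 you can tell a browser not to cache certain domains (or not to cache at all, or to use the cache if available, and so on); see https://developer.mozilla.org/en-US/docs/HTML/Using_the_application_cache. Note that the application cache has since been deprecated and removed from current browsers, so this approach only applies to older ones.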

Add a manifest attribute to the <html> tag of your page:

<html manifest="my.cache">

Create a file named my.cache in your document root, containing the following:

CACHE MANIFEST
CACHE:
# don't force any caching
NETWORK:
# force downloads from your site not to use the cache
your-site.com

This ensures that nothing is cached.

If you have a dedicated path for your PDF downloads, use that instead (so that files on your site other than the PDFs are still cached).

Try this in a browser. Remember to clear the cache first! :) You will then see that each PDF is downloaded fresh, regardless of filename or headers.

answered Oct 19 '22 18:10 by davidkonrad