
Do most browsers make multiple HTTP requests when displaying a PDF from within the browser?

Tags: http, pdf

Do most browsers (IE, FF, Safari, Chrome, Opera) make multiple HTTP requests for a PDF file when displaying the PDF in the browser? I am working on an issue integrating with WebTrends Web Analytics software, and the statistics around PDFs appear to be incorrect. Support told me that because WebTrends parses the web server's access logs to determine traffic, downloads, etc., it has a difficult time determining accurate PDF download counts because:
When a user clicks on a PDF and the PDF opens in the user's browser via the Acrobat Reader browser plug-in, each page is downloaded one-at-a-time -- it does this to conserve bandwidth, if a user only views the first 2 pages of a 50 page PDF, only the first 2 pages are downloaded.

This sounds fishy to me (how could an HTTP request be made to serve out only a portion of a binary file?). I've been searching Google, but haven't found anything that speaks to this.

Tomorrow I will try to find some software for IE that lets me sniff the HTTP traffic, to see if I can observe this phenomenon.

Any info/thoughts are appreciated though.

empire29 asked Nov 30 '09 03:11




2 Answers

If your site returns an HTTP response header like this:

Accept-Ranges: bytes

the PDF reader will close the initial connection after reading just a few KB of the document. It then requests sections of the document as required, using the Range request header, e.g.:

Range: bytes=242107-244329, 8060-76128

An example of a URL that does this is http://www.ovationguitars.com/img/OVmanual.pdf .

If you don't return the Accept-Ranges header then the PDF document will be downloaded in a single request (e.g. http://manuals.info.apple.com/en/iphone_user_guide.pdf )
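To make it concrete how a request can ask for only part of a binary file, here is a minimal Python sketch of how a server might interpret a Range header value like the one above and pick out the matching bytes. The function names `parse_byte_ranges` and `slice_ranges` are hypothetical, invented for illustration; this is not the code of any particular web server, and it only covers the `bytes` unit.

```python
def parse_byte_ranges(header_value, file_size):
    """Turn a Range header value into inclusive (start, end) byte offsets.

    Hypothetical helper for illustration. Ranges that cannot be satisfied
    for a file of `file_size` bytes are skipped.
    """
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes" or not spec:
        raise ValueError("unsupported Range header: %r" % header_value)

    ranges = []
    for part in spec.split(","):
        first, sep, last = part.strip().partition("-")
        if not sep:
            raise ValueError("malformed range: %r" % part)
        if first == "":
            # Suffix form "-N": the last N bytes of the file.
            length = int(last)
            if length == 0:
                continue
            start = max(file_size - length, 0)
            end = file_size - 1
        else:
            # "A-B" or the open-ended form "A-".
            start = int(first)
            end = int(last) if last else file_size - 1
            end = min(end, file_size - 1)
        if start > end:
            continue  # starts past the end of the file
        ranges.append((start, end))
    return ranges


def slice_ranges(data, header_value):
    """Return the byte strings a partial response would carry for `data`."""
    return [data[start:end + 1]
            for start, end in parse_byte_ranges(header_value, len(data))]
```

For the header shown above and a file of one million bytes, `parse_byte_ranges("bytes=242107-244329, 8060-76128", 1000000)` yields two separate offset pairs, which is why a single page view can turn into several partial responses in the access log.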

You can see the behavior of the PDF reader in IE using HttpWatch.

** Disclaimer: This answer was posted by Simtec Limited, the makers of HttpWatch **

HttpWatchSupport answered Sep 20 '22 23:09



For me as of June 2016, Firefox and IE11 only make one call.

Chrome makes two calls if there is no Content-Disposition header. When it is missing, Chrome does two GETs, seems to cancel the second, and shows the PDF in the browser. The server does not know that the second is cancelled, and sends out the PDF again.

When this header is sent from the server, Chrome only makes one call and launches or saves the file.

Content-Disposition: attachment

(You can also suggest the file name to be used when the user saves the file...)

Content-Disposition: attachment; filename=test.pdf
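Since both behaviors described in these answers (Range requests from the PDF plug-in, and Chrome's duplicate GET) leave several access-log lines for what the user experiences as one PDF view, a log-based count will tend to overstate downloads. As a rough sketch only, assuming logs in Common Log Format and assuming that one view per client address per PDF is an acceptable approximation, the hits could be collapsed like this. The names `LOG_LINE` and `count_pdf_views` are hypothetical, and this is not a description of how WebTrends itself counts.

```python
import re

# Assumed layout (Common Log Format):
# client ident user [timestamp] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def count_pdf_views(lines):
    """Count PDF views per path, treating repeat hits by a client as one view.

    Crude on purpose: full (200) and partial (206) responses for the same
    client and path collapse into a single view, with no time window.
    """
    seen = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not in the assumed format
        if match["method"] != "GET" or match["status"] not in ("200", "206"):
            continue
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if path.lower().endswith(".pdf"):
            seen.add((match["client"], path))

    counts = {}
    for _client, path in seen:
        counts[path] = counts.get(path, 0) + 1
    return counts
```

A real report would probably also need a time window and some way to tell apart users who share an IP address; both are left out here to keep the sketch short.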
Glen Little answered Sep 18 '22 23:09
