
Why do browsers re-request scripts on non-200 response?

Save the following HTML as a local file, something like /tmp/foo.html, then open it in Firefox (I'm on 49.0.2).

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
</head>
<body>
<script src="http://localhost:1234/a.js"></script>
<script src="http://localhost:1234/b.js"></script>
<script src="http://localhost:1234/c.js"></script>
<script src="http://localhost:1234/d.js"></script>
<script src="http://localhost:1234/e.js"></script>
</body>
</html>

I don't have a server running on port 1234, so the requests don't even successfully connect.

The behavior I'd expect here is for all the requests to fail, and be done with it.

What actually happens in Firefox is all 5 .js files are requested in parallel, they fail to connect, then the last 4 get re-requested in serial. Like so:

[Image: screenshot of the requests in Firefox]

Why?

If I boot a server on 1234 that always 404s, the behaviour is the same.

This particular example doesn't reproduce the same behavior in Chrome, but other similar examples do, which is how I originally stumbled upon it.

EDIT: Here's how I tested that this also happens when the requests 404.

$ cd /tmp
$ mkdir empty
$ cd empty
$ python -m SimpleHTTPServer 1234

Then reloaded Firefox. It shows this:

[Image: screenshot of the requests in Firefox]

The server actually sees all those requests too (the first 5 arrive out of order because they're requested in parallel, but the last 4 are always b, c, d, e, since they get re-requested in serial).

127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /a.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
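To make the pattern in that log easier to see, here is a minimal Python sketch that counts how often each path was requested. It is only an illustration: the `LOG` constant holds the request lines copied from the log above (the "code 404" lines are left out), and `requested_paths` is a hypothetical helper name.

```python
import re
from collections import Counter

# Request lines copied from the server log above; the "code 404" lines are omitted.
LOG = """\
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /a.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
"""

# Matches the quoted request line and captures the requested path.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" \d{3}')


def requested_paths(log_text):
    """Return the requested paths in the order the server logged them."""
    return [match.group(1) for match in REQUEST_RE.finditer(log_text)]


paths = requested_paths(LOG)
counts = Counter(paths)

first_wave = paths[:5]   # the parallel requests, in arrival order
second_wave = paths[5:]  # the serial re-requests

print(sorted(counts.items()))
print(second_wave)
```

With this log, a.js shows up once and every other script twice, and the second wave is b, c, d, e in document order.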
Jamie Wong asked Nov 02 '16 02:11



1 Answer

This has to do with edge cases that can arise with parallel resource loading, where JavaScript is expected to block other resources from loading.

This behavior becomes clearer when you add a delay to the error responses. Here is a screenshot of the Firefox network panel with a 1-second delay added to each request.

[Image: Firefox network panel with a 1-second delay on each request]
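The answer doesn't say how the delay was added. One way to reproduce it, as a rough sketch only, is a small Python 3 server that sleeps before answering every request with a 404. The function name `make_delayed_404_server` and its parameters are made up for this example.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_delayed_404_server(port, delay_seconds):
    """Build a server that waits delay_seconds, then answers every GET with 404."""

    class Delayed404Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Hold the response so overlapping requests are visible
            # in the browser's network panel.
            time.sleep(delay_seconds)
            self.send_error(404, "File not found")

    # ThreadingHTTPServer (Python 3.7+) handles each connection in its own
    # thread, so parallel requests are delayed concurrently, not queued.
    return ThreadingHTTPServer(("127.0.0.1", port), Delayed404Handler)
```

Calling `make_delayed_404_server(1234, 1.0).serve_forever()` and reloading the test page should make the parallel first wave and the serial second wave easy to tell apart. A delay of 0 gives a plain always-404 server.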

As we can see, all 5 scripts were requested in parallel, as modern browsers do to reduce loading times.

However, every script that returned a 404, except the first one, was re-requested, this time in series rather than in parallel. This is almost certainly to maintain backwards compatibility with edge cases that depend on legacy browser behavior.

Historically, a browser would load and execute one script at a time. Modern browsers load them in parallel while still maintaining execution order.
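As a toy model of "load in parallel, execute in order" (not how any browser actually implements it), the Python sketch below starts all fetches at once and then consumes the results in document order. The `fetch` stand-in, the script names, and the timings are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name, seconds, finished):
    """Stand-in for a network fetch that takes `seconds` to complete."""
    time.sleep(seconds)
    finished.append(name)  # record the order in which fetches complete
    return name


def load_scripts(scripts):
    """Fetch all scripts concurrently, but 'execute' them in document order."""
    finished = []
    executed = []
    with ThreadPoolExecutor(max_workers=max(1, len(scripts))) as pool:
        # All fetches start immediately, like parallel script loading.
        futures = [pool.submit(fetch, name, seconds, finished)
                   for name, seconds in scripts]
        # Waiting on the futures in submission order keeps execution order
        # equal to document order, whatever order the fetches finish in.
        for future in futures:
            executed.append(future.result())
    return executed, finished


executed, finished = load_scripts([("a.js", 0.3), ("b.js", 0.1), ("c.js", 0.2)])
print(executed)  # ['a.js', 'b.js', 'c.js']
```

Here a.js is the slowest fetch, yet it is still "executed" first, because later scripts wait for it.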

So why might this matter?

Imagine the first script request changed the application state, perhaps by setting a cookie that authenticates further requests. With parallel loading, the later scripts would be requested before that state had changed and, assuming the web application is reasonably well designed, the server would respond with an error.

So the only way to ensure the other resources didn't fail merely because the first script had no chance to change the state before they were requested is to request them again.
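To make that scenario concrete, here is a hypothetical Python 3 server in which the response to a.js sets a cookie and every other path is rejected without it. The cookie name `session=ok`, the 403 status, and the names `StatefulHandler` and `make_stateful_server` are all assumptions for illustration, not something taken from a real application.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StatefulHandler(BaseHTTPRequestHandler):
    """Hypothetical app: a.js sets a cookie; every other path requires it."""

    def do_GET(self):
        has_cookie = "session=ok" in self.headers.get("Cookie", "")
        if self.path == "/a.js":
            self._send(200, set_cookie=True)  # the state-changing request
        elif has_cookie:
            self._send(200)                   # state already changed
        else:
            self._send(403)                   # requested too early

    def _send(self, status, set_cookie=False):
        body = b"// ok\n" if status == 200 else b""
        self.send_response(status)
        if set_cookie:
            self.send_header("Set-Cookie", "session=ok; Path=/")
        self.send_header("Content-Type", "application/javascript")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_stateful_server(port):
    """Build the server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), StatefulHandler)
```

Against a server like this, a request for b.js sent in parallel with a.js carries no cookie and fails, while the same request repeated after a.js has responded succeeds. That is the kind of case a second, serial attempt would rescue.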

In fact, this re-requesting behavior is not limited to scripts. It can also be seen to affect images that error after a script tag that was loaded in parallel.

[Image: Firefox network panel showing images re-requested after a script]

Presumably because those images may have failed to load only because a prior script had not executed yet, they are all re-requested in parallel.

Interestingly, I can't find anything directly about this in the spec, but this section from the HTML Living Standard suggests this behavior may actually violate the spec.

For classic scripts, if the async attribute is present, then the classic script will be fetched in parallel to parsing and evaluated as soon as it is available (potentially before parsing completes). If the async attribute is not present but the defer attribute is present, then the classic script will be fetched in parallel and evaluated when the page has finished parsing. If neither attribute is present, then the script is fetched and evaluated immediately, blocking parsing until these are both complete.

If parsing were actually blocked, then it would seem the following script tags and images should not have been parsed yet, and so could not have started loading. I suspect browsers reconcile this by not making the following tags available in the DOM until after execution.

Note:

The exact behavior you will see in these cases may vary a bit. Only those resources that were actually requested in parallel with a script will be reloaded. If a later image errors but was not requested while a script was loading, there is no need to re-request it. Additionally, it appears Chrome only triggers this behavior if the potentially state-changing script does not error, whereas Firefox triggers it even if the script does error.

Alexander O'Mara answered Oct 16 '22 00:10