Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why move your Javascript files to a different main domain that you also own?

I've noticed that just in the last year or so, many major websites have made the same change to the way their pages are structured. Each has moved their Javascript files from being hosted on the same domain as the page itself (or a subdomain of that), to being hosted on a differently named domain.

It's not simply parallelization

Now, there is a well known technique of spreading the components of your page across multiple domains to parallelize downloading. Yahoo recommends it as do many others. For instance, www.example.com is where your HTML is hosted, then you put images on images.example.com and javascripts on scripts.example.com. This gets around the fact that most browsers limit the number of simultaneous connections per server in order to be good net citizens.

The above is not what I am talking about.

It's not simply redirection to a content delivery network (or maybe it is--see bottom of question)

What I am talking about is hosting Javascripts specifically on an entirely different domain. Let me be specific. Just in the last year or so I've noticed that:

youtube.com has moved its .JS files to ytimg.com

cnn.com has moved its .JS files to cdn.turner.com

weather.com has moved its .JS files to j.imwx.com

Now, I know about content delivery networks like Akamai who specialize in outsourcing this for large websites. (The name "cdn" in Turner's special domain clues us in to the importance of this concept here).

But note with these examples, each site has its own specifically registered domain for this purpose, and its not the domain of a content delivery network or other infrastructure provider. In fact, if you try to load the home page off most of these script domains, they usually redirect back to the main domain of the company. And if you reverse lookup the IPs involved, they sometimes appear point to a CDN company's servers, sometimes not.

Why do I care?

Having formerly worked at two different security companies, I have been made paranoid of malicious Javascripts.

As a result, I follow the practice of whitelisting sites that I will allow Javascript (and other active content such as Java) to run on. As a result, to make a site like cnn.com work properly, I have to manually put cnn.com into a list. It's a pain in the behind, but I prefer it over the alternative.

When folks used things like scripts.cnn.com to parallelize, that worked fine with appropriate wildcarding. And when folks used subdomains off the CDN company domains, I could just permit the CDN company's main domain with a wildcard in front as well and kill many birds with one stone (such as *.edgesuite.net and *.akamai.com).

Now I have discovered that (as of 2008) this is not enough. Now I have to poke around in the source code of a page I want to whitelist, and figure out what "secret" domain (or domains) that site is using to store their Javascripts on. In some cases I've found I have to permit three different domains to make a site work.

Why did all these major sites start doing this?

EDIT: OK as "onebyone" pointed out, it does appear to be related to CDN delivery of content. So let me modify the question slightly based on his research...

Why is weather.com using j.imwx.com instead of twc.vo.llnwd.net?

Why is youtube.com using s.ytimg.com instead of static.cache.l.google.com?

There has to a reasoning behind this.

like image 264
Tim Farley Avatar asked Oct 02 '08 00:10

Tim Farley


People also ask

Why do we have to separate the JS file from HTML?

It separates HTML and code. It makes HTML and JavaScript easier to read and maintain. Cached JavaScript files can speed up page loads.

Where should I store JavaScript files?

JavaScript in body or head: Scripts can be placed inside the body or the head section of an HTML page or inside both head and body. JavaScript in head: A JavaScript function is placed inside the head section of an HTML page and the function is invoked when a button is clicked.

Does the order in which we load your JS files matter?

If I'm understanding your question I think you're asking if it matters where in a file a function/method is defined, and the answer is no, you can define them anywhere in a single source file. The JavaScript parser will read in all symbols before trying to run the code.


1 Answers

Your follow-up question is essentially: Assuming a popular website is using a CDN, why would they use their own TLD like imwx.com instead of a subdomain (static.weather.com) or the CDN's domain?

Well, the reason for using a domain they control versus the CDN's domain is that they retain control -- they could potentially even change CDNs entirely and only have to change a DNS record, versus having to update links in 1000s of pages/applications.

So, why use nonsense domain names? Well, a big thing with helper files like .js and .css is that you want them to be cached downstream by proxies and people's browsers as much as possible. If a person hits gmail.com and all the .js is loaded out of their browser cache, the site appears much snappier to them, and it also saves bandwidth on the server end (everybody wins). The problem is that once you send HTTP headers for really aggressive caching (i.e. cache me for a week or a year or forever), these files aren't ever reliably loaded from the server any more and you can't make changes/fixes to them because things will break in people's browsers.

So, what companies have to do is stage these changes and actually change the URLs of all of these files to force people's browsers to reload them. Cycling through domains like "a.imwx.com", "b.imwx.com" etc. is how this gets done.

By using a nonsense domain name, the Javascript developers and their Javascript sysadmin/CDN liaison counterparts can have their own domain name/DNS that they're pushing these changes through, that they're accountable/autonomous for.

Then, if any sort of cookie-blocking or script-blocking starts happening on the TLD, they just change from one nonsense TLD to kyxmlek.com or whatever. They don't have to worry about accidentally doing something evil that has countermeasure side effects on all of *.google.com.

like image 73
joelhardi Avatar answered Sep 24 '22 08:09

joelhardi