Say I have a script like this:
<script type="text/javascript" src="myScript.js"></script>
I've seen some sources online that claim that if the charset attribute is omitted, it defaults to ISO-8859-1. I've seen others that claim it assumes the same encoding as the HTML page that contains the script tag. What's the truth?
I need to know because my JavaScript file contains literal strings that will be inserted into the HTML, and which include non-ASCII characters like the Euro symbol (€). I realize that adding a charset attribute or just HTML encoding these characters should solve my problem, but I'd still like to understand the default behavior.
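To be concrete, the workarounds I have in mind look roughly like the sketch below. It is only an illustration: `toHtmlEntities` is a hypothetical helper name, not something from my actual script, and the strings are made up. The point is that both variants keep the .js file pure ASCII, so the encoding the browser guesses no longer matters.

```javascript
// Workaround 1: a JavaScript Unicode escape instead of the literal character.
// "\u20AC" is U+20AC (EURO SIGN); the source file itself contains only ASCII.
const priceLabel = "Price: \u20AC10";

// Workaround 2: HTML-encode non-ASCII characters before inserting into markup.
// toHtmlEntities is a hypothetical helper; it replaces every code point
// outside the ASCII range with a decimal numeric character reference.
function toHtmlEntities(text) {
  // The "u" flag makes the regex match whole code points, not UTF-16 halves.
  return text.replace(/[^\x00-\x7F]/gu, (ch) => "&#" + ch.codePointAt(0) + ";");
}

const encodedLabel = toHtmlEntities(priceLabel);
```

The third option, declaring the encoding explicitly, would just be charset="utf-8" (or whatever the file is really saved as) on the script tag.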
EDIT: To clarify one point, I need to know not just what the standards say, but how browsers actually act. The behavior described here: http://joconner.com/2008/09/javascript-file-encoding/ seems to suggest that browsers don't always assume ISO-8859-1.
The W3C's HTML 4 specification defines how a browser should determine the character encoding. You can read about it here: http://www.w3.org/TR/html4/charset.html#spec-char-encoding. The relevant passage:
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
- An HTTP "charset" parameter in a "Content-Type" field.
- A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
- The charset attribute set on an element that designates an external resource.
In addition to this list of priorities, the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents typically have a user-definable, local default character encoding which they apply in the absence of other indicators.
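The priority order quoted above can be sketched as a small function. This is a minimal illustration of the rule as the spec words it, not how any particular browser is implemented; `resolveCharset` and all of its parameter names are hypothetical, and `localDefault` stands in for whatever the user agent falls back on through heuristics or user settings.

```javascript
// Hypothetical sketch of the HTML 4 priority list, highest priority first.
// None of these names come from a real browser API.
function resolveCharset({ httpContentType, metaCharset, charsetAttr, localDefault }) {
  // 1. An HTTP "charset" parameter in the Content-Type header.
  const match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(httpContentType || "");
  if (match) return match[1];

  // 2. A META http-equiv="Content-Type" declaration (only meaningful for HTML).
  if (metaCharset) return metaCharset;

  // 3. The charset attribute on the element that references the resource.
  if (charsetAttr) return charsetAttr;

  // Otherwise: heuristics and user settings, modeled here as a single default.
  return localDefault;
}

// Example: the server sends no charset, the script tag declares one.
const chosen = resolveCharset({
  httpContentType: "text/javascript",
  charsetAttr: "utf-8",
  localDefault: "ISO-8859-1",
});
```

In this example `chosen` is "utf-8", because the header carries no charset parameter and the attribute is the next indicator present. If the attribute were omitted too, the result would be whatever `localDefault` is, which is exactly the part the spec leaves to the user agent.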