 

Why is 0[0] syntactically valid?

Tags:

javascript

When you do 0[0], the JS interpreter will turn the first 0 into a Number object and then try to access the [0] property of that object which is undefined.

There is no syntax error because the property access syntax 0[0] is allowed by the language grammar in this context. This structure (using terms in the Javascript grammar) is NumericLiteral[NumericLiteral].

The relevant part of the language grammar from section A.3 of the ES5 ECMAScript spec is this:

Literal ::
    NullLiteral
    BooleanLiteral
    NumericLiteral
    StringLiteral
    RegularExpressionLiteral

PrimaryExpression :
    this
    Identifier
    Literal
    ArrayLiteral
    ObjectLiteral
    ( Expression )

MemberExpression :
    PrimaryExpression
    FunctionExpression
    MemberExpression [ Expression ]
    MemberExpression . IdentifierName
    new MemberExpression Arguments    

So, one can follow the grammar through this progression:

MemberExpression [ Expression ]
PrimaryExpression [ Expression ]
Literal [ Expression ]
NumericLiteral [ Expression ]

And, similarly Expression can also eventually be NumericLiteral so after following the grammar, we see that this is allowed:

NumericLiteral [ NumericLiteral ]

Which means that 0[0] is an allowed part of the grammar and thus no SyntaxError.
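One way to check this claim is to compile a snippet without running it. The following is a minimal sketch, assuming an environment where the Function constructor is available (for example Node.js); the helper name parses is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: returns true if `source` compiles as a function body,
// false if compiling it raises a SyntaxError. The body is never executed.
function parses(source) {
  try {
    new Function(source); // compile only, do not call
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected
  }
}

var validNumeric = parses("0[0];");  // NumericLiteral [ NumericLiteral ]
var validNull = parses("null[0];");  // also grammatical, fails only when run
var invalid = parses("0[;");         // no grammar rule matches this

console.log(validNumeric, validNull, invalid); // true true false
```

Only the last snippet is rejected by the parser; the first two are accepted because the grammar says nothing about the types involved.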


Then, at run-time you are allowed to read a property that does not exist (it will just be read as undefined) as long as the source you are reading from either is an object or has an implicit conversion to an object. And, a numeric literal does indeed have an implicit conversion to an object (a Number object).

This is one of those often unknown features of Javascript. The types Number, Boolean and String in Javascript are usually stored internally as primitives (not full-blown objects). These are a compact, immutable storage representation (probably done this way for implementation efficiency). But, Javascript wants you to be able to treat these primitives like objects with properties and methods. So, if you try to access a property or method that is not directly supported on the primitive, then Javascript will temporarily coerce the primitive into an appropriate type of object with the value set to the value of the primitive.

When you use an object-like syntax on a primitive such as 0[0], the interpreter recognizes this as a property access on a primitive. Its response to this is to take the first 0 numeric primitive and coerce it into a full-blown Number object which it can then access the [0] property on. In this specific case, the [0] property on a Number object is undefined which is why that's the value you get from 0[0].
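The temporary wrapper cannot be observed directly, but calling Object(value) performs the same kind of conversion explicitly. This is a small sketch for illustration; the variable names are arbitrary.

```javascript
var n = 0;
var wrapped = Object(n); // explicit conversion to a Number object

console.log(typeof n);                   // "number"
console.log(typeof wrapped);             // "object"
console.log(wrapped instanceof Number);  // true
console.log(wrapped.valueOf() === n);    // true

// Property lookups on the primitive and on the wrapper give the same answer.
var viaPrimitive = n[0];
var viaWrapper = wrapped[0];
console.log(viaPrimitive, viaWrapper);   // undefined undefined

// Methods are found the same way, on Number.prototype.
var sameMethod = n.toFixed === Number.prototype.toFixed;
console.log(sameMethod);                 // true
```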

Here is an article on the auto-conversion of a primitive to an object for purposes of dealing with properties:

The Secret Life of Javascript Primitives


Here are the relevant portions of the ECMAScript 5.1 specification:

9.10 CheckObjectCoercible

Throws TypeError if value is undefined or null, otherwise returns true.


11.2.1 Property Accessors

  1. Let baseReference be the result of evaluating MemberExpression.
  2. Let baseValue be GetValue(baseReference).
  3. Let propertyNameReference be the result of evaluating Expression.
  4. Let propertyNameValue be GetValue(propertyNameReference).
  5. Call CheckObjectCoercible(baseValue).
  6. Let propertyNameString be ToString(propertyNameValue).
  7. If the syntactic production that is being evaluated is contained in strict mode code, let strict be true, else let strict be false.
  8. Return a value of type Reference whose base value is baseValue and whose referenced name is propertyNameString, and whose strict mode flag is strict.

An operative part for this question is step #5 above.

8.7.1 GetValue (V)

This describes how when the value being accessed is a property reference, it calls ToObject(base) to get the object version of any primitive.

9.9 ToObject

This describes how Boolean, Number and String primitives are converted to an object form with the [[PrimitiveValue]] internal property set accordingly.


As an interesting test, if the code was like this:

var x = null;
var a = x[0];

It would still not throw a SyntaxError at parse time as this is technically legal syntax, but it would throw a TypeError at runtime when you run the code because when the above Property Accessors logic is applied to the value of x, it will call CheckObjectCoercible(x) or call ToObject(x) which will both throw a TypeError if x is null or undefined.
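The difference between the two kinds of base value can be seen by catching the error at run time. This is a sketch only; describeAccess is a hypothetical helper name introduced for the example.

```javascript
// Hypothetical helper: tries base[0] and reports what happened.
function describeAccess(base) {
  try {
    return "value: " + String(base[0]);
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}

var results = [
  describeAccess(0),          // number primitive, property missing
  describeAccess("abc"),      // string primitive, index 0 exists
  describeAccess(null),       // not object-coercible
  describeAccess(undefined)   // not object-coercible
];

console.log(results);
// [ 'value: undefined', 'value: a', 'TypeError', 'TypeError' ]
```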


Like most programming languages, JS uses a grammar to parse your code and convert it to an executable form. If there's no rule in the grammar that can be applied to a particular chunk of code, it throws a SyntaxError. Otherwise, the code is considered valid, no matter if it makes sense or not.

The relevant parts of the JS grammar are

Literal :: 
   NumericLiteral
   ...

PrimaryExpression :
   Literal
   ...

MemberExpression :
   PrimaryExpression
   MemberExpression [ Expression ]
   ...

Since 0[0] conforms to these rules, it's considered a valid expression. Whether it's correct (e.g. doesn't throw an error at run time) is another story, but yes it is. This is how JS evaluates expressions like someLiteral[someExpression]:

  1. evaluate someExpression (which can be arbitrarily complex)
  2. convert the literal to a corresponding object type (numeric literals => Number, strings => String etc)
  3. call the get property operation on result(2) with the property name result(1)
  4. discard result(2)

So 0[0] is interpreted as

index = 0
temp = new Number(0) // wrapper object; plain Number(0) would return a primitive
result = getproperty(temp, index) // it's undefined, but JS doesn't care
delete temp
return result
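The steps above can be written as a runnable approximation. This is a simplified sketch, not what an engine literally does: getMember is a hypothetical name, Object(base) stands in for the internal ToObject conversion, and the sketch assumes ES5-style string keys.

```javascript
// Hypothetical approximation of evaluating base[key] for any base value.
function getMember(base, key) {
  // null and undefined cannot be converted to an object.
  if (base === null || base === undefined) {
    throw new TypeError("Cannot read property '" + key + "' of " + base);
  }
  var temp = Object(base);          // wrap primitives, leave objects as they are
  var result = temp[String(key)];   // ordinary property lookup on the object
  return result;                    // temp is dropped when the function returns
}

var missing = getMember(0, 0);                 // undefined
var character = getMember("abc", 1);           // "b"
var method = getMember(0, "toString");         // inherited from Number.prototype

console.log(missing, character, method === Number.prototype.toString);
// undefined b true
```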

Here's an example of a valid, but incorrect expression:

null[0]

It's parsed fine, but at run time, the interpreter fails on step 2 (because null can't be converted to an object) and throws a run time error.


There are situations where you could validly subscript a number in Javascript:

-> 0['toString']
function toString() { [native code] }

While not immediately apparent why you would want to do this, subscripting in Javascript is equivalent to using dotted notation (albeit the dot notation limits you to using identifiers as keys).
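A short sketch of that equivalence, including a computed key, which is the case dot notation cannot express:

```javascript
var viaBracket = 0['toString'];
var viaDot = (0).toString;

console.log(viaBracket === viaDot);                     // true
console.log(viaBracket === Number.prototype.toString);  // true

// The bracket form can be called like any other method.
var hex = 255['toString'](16);
console.log(hex);                                       // "ff"

// The key can be any expression, not just an identifier.
var key = 'to' + 'Fixed';
var fixed = 1.5[key](2);
console.log(fixed);                                     // "1.50"
```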


I'd just like to note that this being valid syntax is not in any way unique to Javascript. Most languages will have a runtime error or a type error, but that's not the same thing as a syntax error. Javascript chooses to return undefined in many situations where another language might raise an exception, including when subscripting an object that does not have a property of the given name.

The syntax doesn't know the type of an expression (even a simple expression like a numeric literal), and will allow you to apply any operator to any expression. For example, attempting to subscript undefined or null causes a TypeError in Javascript. It's not a syntax error - if this is never executed (being on the wrong side of an if-statement), it won't cause any problems, whereas a syntax error is by definition always caught at compile time (eval, Function, etc, all count as compiling).
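The following sketch contrasts the two cases. It assumes the Function constructor is available; the function and variable names are made up for the example.

```javascript
// A TypeError waiting in a branch that is never taken causes no problems.
function neverSubscriptsNull(flag) {
  if (flag) {
    return null[0]; // would throw a TypeError only if this line were reached
  }
  return "ok";
}
var safe = neverSubscriptsNull(false);

// A syntax error stops the whole body from compiling, so even the
// statement before it never gets a chance to run.
var ranFirstStatement = false;
var compileError = null;
try {
  var broken = new Function("mark", "mark(); 0[;");
  broken(function () { ranFirstStatement = true; });
} catch (e) {
  compileError = e;
}

console.log(safe, compileError instanceof SyntaxError, ranFirstStatement);
// ok true false
```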


Because it is valid syntax, and even valid code to be interpreted. You can try to access any property of any object (and in this case 0 will be converted to a Number object), and it will give you the value if it exists, otherwise undefined. Trying to access a property of undefined does not work, however, so 0[0][0] would result in a runtime error. This would still be classified as valid syntax, though. There's a difference between what is valid syntax and what won't cause runtime or compile-time errors.
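A quick sketch of the 0[0][0] case, catching the error so that its type can be inspected:

```javascript
var first = 0[0]; // undefined: the property simply does not exist

var message;
try {
  0[0][0];        // reads property 0 of undefined
  message = "no error";
} catch (e) {
  message = e.name;
}

console.log(first, message); // undefined TypeError
```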


Not only is the syntax valid, the result does not even have to be undefined, though in most, if not all, sane cases it will be. JS is one of the most pure object oriented languages. Most so-called OO languages are class oriented, in the sense that you can't change the form of an object (it's tied to the class) once it is created, only its state. In JS you can change the state as well as the form of an object, and you do this more often than you think. This ability makes for some rather obscure code if you misuse it. Numerals are immutable, so you can't change the value itself, neither its state nor its form. Still, you can write

0[0] = 1;

which, in non-strict code, is a valid assignment expression that returns 1 but doesn't actually assign anything, because the numeral 0 is immutable. This in itself is somewhat odd: you can have a valid and correct (executable) assignment expression that doesn't assign anything (*). However, the prototype shared by all numbers, Number.prototype, is a mutable object, so you can mutate it, and the change is visible from every number through the prototype chain.

Number.prototype[0] = 1;
//prints 1 to the console
console.log(0[0]);
//will also print 1 to the console because all numbers share Number.prototype
console.log(1[0]);

Of course this is a far cry from the sane use category, but the language is specified to allow it because in other scenarios extending an object's capabilities actually makes a lot of sense. It's how jQuery plugins hook into the jQuery object, to give an example.

(*) It does actually assign the value 1 to the property of an object; however, there's no way you can reference that (transient) object, and it will thus be collected at the next GC pass.
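The silent behaviour described here applies to non-strict code; in strict mode the same assignment throws a TypeError. The sketch below builds both variants with the Function constructor so that each body's mode is explicit regardless of how the surrounding file is loaded. The names sloppyAssign and strictAssign are made up for the example.

```javascript
// Non-strict body: the assignment evaluates to 1 but nothing is stored.
var sloppyAssign = new Function(
  "var v = (0[0] = 1); return [v, 0[0]];"
);

// Strict body: creating a property on a primitive is rejected.
var strictAssign = new Function(
  '"use strict";' +
  'try { 0[0] = 1; return "no error"; } catch (e) { return e.name; }'
);

var sloppyResult = sloppyAssign();
var strictResult = strictAssign();

console.log(sloppyResult); // [ 1, undefined ]
console.log(strictResult); // TypeError
```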


In JavaScript, almost every value can be treated as an object, so when the interpreter evaluates 0[0], it treats 0 as an object and tries to return its property named 0. The same thing happens when you try to access the 0th element of true or "" (empty string).

Even if you set 0[0]=1, the property and its value are set on a temporary object in memory, but whenever you access 0 afterwards it is treated as a plain number again. (Don't confuse being treated as an object with being a number here.)