The space character as a punctuator in JavaScript

Tags:

javascript

In chapter 7.7 (Punctuators) of the ECMAScript spec ( http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf ) the grid of punctuators appears to have a gap in row 3 of the last column. This is in fact the space character punctuator, correct?

I understand that space characters may optionally be inserted between tokens in JavaScript code to improve readability. However, I was wondering where they are actually required...

In order to find this out, I searched for space characters in the minified version of the jQuery library. These are my results:

A space is required... (see Update below)

... between a keyword and an identifier:

function x(){}
var x;
return x;
typeof x;
new X();

... between two keywords:

return false;
if(x){}else if(y){}else{}

These are the two cases that I identified. Are there any other cases?

Note: Space characters inside string literals are not regarded as punctuator tokens (obviously).

Update: As it turns out, a space character is not required in those cases. For example, a keyword token and an identifier token have to be separated by something, but that something does not have to be a space character. It could be any input element which is not a token (WhiteSpace, LineTerminator or Comment).

Also... It seems that the space character is regarded as a WhiteSpace input element, and not a token at all, which would mean that it's not a punctuator.
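
A minimal sketch of how this can be checked, assuming an engine that supports the Function constructor (Node.js or a browser). The helper name `parses` is hypothetical and only used for illustration; it compiles a snippet as a function body without running it and reports whether a SyntaxError occurred.

```javascript
// Hypothetical helper: true if `source` compiles as a function body,
// false if the engine rejects it with a SyntaxError.
function parses(source) {
  try {
    new Function(source); // compiles the body only, does not execute it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

const results = {
  space: parses("function f(){}"),           // WhiteSpace (space)
  tab: parses("function\tf(){}"),            // WhiteSpace (tab)
  lineTerminator: parses("function\nf(){}"), // LineTerminator
  comment: parses("function/**/f(){}"),      // Comment
  nothing: parses("functionf(){}"),          // no separator at all
};

console.log(results);
```

Only the last entry should be rejected: without a separator, `functionf` is read as a single identifier, and the `{` that follows the call `functionf()` is a syntax error.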

Šime Vidas asked Nov 03 '10 13:11


2 Answers

Update (2021): The spec is much clearer now, and space is definitely not in the list of punctuators. Space is whitespace, which is covered in the White Space section.


Answer from 2010:

I don't think that gap is meant to be a space, no, I think it's just a gap (an unfortunate one). If they really meant to be listing a space, I expect they'd use "Whitespace" as they have elsewhere in the document. But whitespace as a punctuator doesn't really make sense.

I believe spaces (and other forms of whitespace) are delimiters. The spec sort of defines them by omission rather than explicitly. The space is required between function and x because otherwise you have the token functionx, which is not of course a keyword (though it could be a name token — e.g., a variable, property, or function name).

You need delimiters around some tokens (Identifiers and ReservedWords), because that's how we recognize where those tokens begin and end — an IdentifierName starts with an IdentifierStart followed by zero or more IdentifierParts, a class which doesn't include whitespace or any of the characters used for punctuators. Other tokens (Punctuators for instance) we can recognize without delimiters. I think that's about it, and so your two rules are pretty much just two examples of the same rule: IdentifierNames must be delimited (by whitespace, by punctuators, by beginning or end of file, ...).
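
As a rough illustration of that rule (a sketch, not taken from the spec), the snippets below contain no whitespace after the keyword at all. In each case a punctuator or a string literal marks where the keyword ends. The variable names are made up for the example.

```javascript
// "(" ends the keyword typeof
const viaParen = typeof(1);

// "[" ends the keyword typeof
const viaBracket = typeof[];

// "!" ends the keyword return
const viaBang = (function(){return!0})();

// The closing quote and "{" delimit the keyword in on both sides
const viaQuote = "a"in{a:1};

console.log(viaParen, viaBracket, viaBang, viaQuote);
// number object true true
```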

Somewhat off-topic, but of course not all delimiters are equal. Line-breaking delimiters are sometimes treated specially by the grammar for the horror that is "semicolon insertion".
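
A small sketch of that difference, with function names invented for the example. A space or a single-line comment after return leaves the expression attached to it, while a line break (or a comment that spans a line break, which the grammar treats as a line terminator) triggers semicolon insertion right after return.

```javascript
function withSpace() {
  return false;
}

function withInlineComment() {
  return/**/false;
}

function withLineBreak() {
  return
  false; // never reached: a semicolon is inserted after return
}

function withMultiLineComment() {
  return/*
  */false; // the comment contains a line break, so it acts like one
}

console.log(withSpace());            // false
console.log(withInlineComment());    // false
console.log(withLineBreak());        // undefined
console.log(withMultiLineComment()); // undefined
```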

T.J. Crowder answered Sep 21 '22 02:09


Whitespace is not required in any of these cases. You just have to write syntax that the parser can read unambiguously. In other words: the machine has to know whether you're using a keyword like function or new, or an ordinary identifier like newFunction.

Each keyword has to be delimited somehow. Whitespace is the most sensible and readable choice, but it can be replaced:

return/**/false;
return(false);

Crozin answered Sep 19 '22 02:09