In chapter 7.7 (Punctuators) of the ECMAScript spec ( http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf ) the grid of punctuators appears to have a gap in row 3 of the last column. This is in fact the space character punctuator, correct?
I understand that space characters may optionally be inserted between tokens in JavaScript code (to improve readability). However, I was wondering where they are actually required...
In order to find this out, I searched for space characters in the minified version of the jQuery library. These are my results:
A space is required... (see Update below)
... between a keyword and an identifier:
function x(){}
var x;
return x;
typeof x;
new X();
... between two keywords:
return false;
if(x){}else if(y){}else{}
These are the two cases that I identified. Are there any other cases?
Note: Space characters inside string literals are not regarded as punctuator tokens (obviously).
Update: As it turns out, a space character is not required in those cases. For example, a keyword token and an identifier token have to be separated by something, but that something does not have to be a space character. It could be any input element that is not a token (WhiteSpace, LineTerminator, or Comment).
Also... It seems that the space character is regarded as a WhiteSpace input element, and not a token at all, which would mean that it's not a punctuator.
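To illustrate the update, here is a minimal sketch that compiles the same tiny function body with different non-token separators after the keyword. It assumes an engine that permits the Function constructor (Node.js, for example), and the names sources and results are only illustrative. The line terminator case keeps a space after return on purpose, since a line break there would trigger semicolon insertion.

```javascript
// Each body separates `var` (and `return`) from the identifier with a
// different non-token input element.
const sources = {
  space: "var x = 1; return x;",
  tab: "var\tx = 1; return\tx;",
  comment: "var/**/x = 1; return/**/x;",
  lineTerminator: "var\nx = 1; return x;",
};

const results = {};
for (const [name, body] of Object.entries(sources)) {
  // new Function compiles the string as a function body; calling it runs it.
  results[name] = new Function(body)();
}

console.log(results); // every entry is 1
```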
Update (2021): The spec is much clearer now, and space is definitely not in the list of punctuators. Space is whitespace, which is covered in the White Space section.
Answer from 2010:
No, I don't think that gap is meant to be a space; I think it's just a gap (an unfortunate one). If they really meant to list a space, I expect they'd use "Whitespace" as they have elsewhere in the document. But whitespace as a punctuator doesn't really make sense.
I believe spaces (and other forms of whitespace) are delimiters. The spec sort of defines them by omission rather than explicitly. A delimiter is required between function and x because otherwise you have the token functionx, which is of course not a keyword (though it could be a name token, e.g., a variable, property, or function name).
You need delimiters around some tokens (Identifiers and ReservedWords), because that's how we recognize where those tokens begin and end: an IdentifierName starts with an IdentifierStart followed by zero or more IdentifierParts, a class which doesn't include whitespace or any of the characters used for punctuators. Other tokens (Punctuators, for instance) we can recognize without delimiters. I think that's about it, and so your two rules are pretty much just two examples of the same rule: IdentifierNames must be delimited (by whitespace, by punctuators, by beginning or end of file, ...).
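A rough way to see this rule in action is to ask the engine whether a snippet parses. The sketch below assumes an environment where the Function constructor is allowed; parses is a hypothetical helper written for this example, not something from the spec.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// With a delimiter, `function` and `x` are two tokens.
const delimited = parses("function x(){}");
// Without one, `functionx` is a single identifier, so this reads as a
// call followed by a stray block and fails to parse.
const fused = parses("functionx(){}");

// Punctuators need no delimiters: the only spaces here follow keywords.
const sum = new Function("var a=1,b=2;return a+b;")();

// `typeof x` is an operator applied to x; `typeofx` is just a name.
const viaKeyword = new Function("var x=1;return typeof x;")();
const viaName = new Function("var typeofx=1;return typeofx;")();

console.log(delimited, fused, sum, viaKeyword, viaName);
// true false 3 "number" 1
```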
Somewhat off-topic, but of course not all delimiters are equal. Line-breaking delimiters are sometimes treated specially by the grammar for the horror that is "semicolon insertion".
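As a small illustration of that special treatment (again assuming the Function constructor is available; the variable names are mine), a line terminator after return ends the statement, while a space or a single-line comment does not:

```javascript
// A comment without a line break is an ordinary delimiter.
const sameLine = new Function("return /* still one statement */ 42;")();

// A line terminator after `return` triggers semicolon insertion,
// so this behaves like `return; 42;`.
const nextLine = new Function("return\n42;")();

// A multi-line comment that contains a line terminator is treated
// like a line terminator by the grammar.
const commentWithBreak = new Function("return/*\n*/42;")();

console.log(sameLine, nextLine, commentWithBreak);
// 42 undefined undefined
```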
Whitespace is not required in any of these cases. You just have to write syntax that the parser can understand. In other words: the machine has to know whether you're using a keyword like function or new, or just defining another variable like newFunction.
Each keyword has to be delimited somehow. Whitespace is the most sensible and readable choice, but it can be replaced:
return/**/false;
return(false);
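A quick check of these two forms, plus two more of my own where a punctuator or a string quote ends the keyword, might look like this. It is only a sketch and assumes the Function constructor is permitted in your environment.

```javascript
// The two forms from the answer.
const viaComment = new Function("return/**/false;")();
const viaParens = new Function("return(false);")();

// Additional illustrations (not from the answer): an operator or a
// string quote also marks the end of the keyword.
const viaOperator = new Function("return!1;")();
const viaQuote = new Function("return typeof'x';")();

console.log(viaComment, viaParens, viaOperator, viaQuote);
// false false false "string"
```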