Correct parsing of JavaScript octal escape sequence

Tags:

javascript

According to the ECMAScript spec, an octal escape sequence is defined as:

OctalEscapeSequence ::  
    OctalDigit [lookahead ∉ DecimalDigit]  
    ZeroToThree OctalDigit [lookahead ∉ DecimalDigit]  
    FourToSeven OctalDigit  
    ZeroToThree OctalDigit OctalDigit  

ZeroToThree :: one of
    0 1 2 3

FourToSeven :: one of
    4 5 6 7

According to this grammar, the string "\379" is not the octal escape \37 followed by 9. Am I reading this right? It doesn't satisfy the first rule, since the lookahead after 3 is 7, which is a decimal digit. It doesn't satisfy the second, since the lookahead after 37 is 9, which is also a decimal digit. It doesn't satisfy the third, since 3 is not one of 4 5 6 7. Finally, it doesn't satisfy the fourth, since 9 is not an octal digit.
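The rule-by-rule reading above can be checked mechanically. The following is only a sketch: `matchOctalEscape` is a hypothetical helper, not anything defined by the spec, and it assumes the four alternatives are applied literally to the characters that follow the backslash.

```javascript
// Hypothetical helper: applies the four OctalEscapeSequence alternatives
// quoted above to the text following a backslash.
// Returns the number of characters consumed, or null if nothing matches.
function matchOctalEscape(rest) {
  const inRange = (ch, lo, hi) => ch !== undefined && ch >= lo && ch <= hi;
  const isOctal = (ch) => inRange(ch, '0', '7');
  const isDecimal = (ch) => inRange(ch, '0', '9');
  const isZeroToThree = (ch) => inRange(ch, '0', '3');
  const isFourToSeven = (ch) => inRange(ch, '4', '7');

  // Missing characters come back as undefined, which counts as "not a digit".
  const [a, b, c] = rest;

  // OctalDigit [lookahead ∉ DecimalDigit]
  if (isOctal(a) && !isDecimal(b)) return 1;
  // ZeroToThree OctalDigit [lookahead ∉ DecimalDigit]
  if (isZeroToThree(a) && isOctal(b) && !isDecimal(c)) return 2;
  // FourToSeven OctalDigit
  if (isFourToSeven(a) && isOctal(b)) return 2;
  // ZeroToThree OctalDigit OctalDigit
  if (isZeroToThree(a) && isOctal(b) && isOctal(c)) return 3;

  return null;
}

console.log(matchOctalEscape('379')); // null: no alternative matches
console.log(matchOctalEscape('37'));  // 2
console.log(matchOctalEscape('377')); // 3
```

Under this literal reading, "379" matches none of the alternatives, while "37" and "377" match the second and fourth respectively.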

So what is the value of "\379" then? I tried a couple of JavaScript interpreters, and they treat it as the octal escape \37 followed by 9. Is that a bug in the interpreters?

UPDATE

I know that octal escape sequences are optional in the latest ECMA spec.

facetus asked Jan 12 '14 06:01


1 Answer

Octal escape sequences are not part of the normative grammar of the current spec. They are described only in a non-normative annex kept for compatibility:

B.1 Additional Syntax

Past editions of ECMAScript have included additional syntax and semantics for specifying octal literals and octal escape sequences. These have been removed from this edition of ECMAScript. This non-normative annex presents uniform syntax and semantics for octal literals and octal escape sequences for compatibility with some older ECMAScript programs.

and the annex's extensions are explicitly disallowed in strict mode. For numeric literals, for example:

B.1.1 Numeric Literals

The syntax and semantics of 7.8.3 can be extended as follows except that this extension is not allowed for strict mode code


Given that grammar, \379 is not an octal escape sequence: the negative lookahead on DecimalDigit bars both \3 (followed by 7) and \37 (followed by 9) from being treated as octal escape sequences.

It is therefore a syntax error, since no other production matches it. Specifically,

CharacterEscapeSequence ::
  SingleEscapeCharacter
  NonEscapeCharacter

(which is what causes "\-" to be equal to "-") does not apply, because digits are in neither SingleEscapeCharacter nor NonEscapeCharacter.
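As a small illustration of the NonEscapeCharacter case mentioned above, a backslash before a character with no special meaning simply yields that character. The variable name `dash` is made up for this sketch.

```javascript
// "-" is a NonEscapeCharacter, so the backslash is dropped
// and the literal contains a single "-".
const dash = "\-";

console.log(dash === "-"); // true
console.log(dash.length);  // 1
```

A digit after the backslash cannot take this path, which is why "\379" has to be matched by an octal production or not at all.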


Is it a bug in the interpreters?

Maybe not, if it only happens in non-strict mode. Interpreters are allowed to define additional syntax per chapter 16:

An implementation may extend program syntax and regular expression pattern or flag syntax.

An interpreter author could probably make the case that theirs is a conforming implementation with this behavior: it just extends the syntax to support octal escapes in a different way from the one suggested by section B.1.
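To see whether a given engine only accepts this in non-strict mode, the literal can be compiled both ways. This is a sketch: `evalLiteral` is a hypothetical helper, and it relies on the fact that a function body built with `new Function` is non-strict unless it starts with a "use strict" directive.

```javascript
// Hypothetical helper: compiles a string-literal source text in strict or
// non-strict mode and reports either its value or the error name.
function evalLiteral(source, useStrict) {
  const prologue = useStrict ? '"use strict"; ' : '';
  try {
    // Early errors such as SyntaxError are thrown when the function is
    // created, so the constructor call has to sit inside the try block.
    const fn = new Function(prologue + 'return ' + source + ';');
    return { value: fn() };
  } catch (e) {
    return { error: e.name };
  }
}

// The doubled backslash makes the compiled source text exactly "\379".
const nonStrictResult = evalLiteral('"\\379"', false);
const strictResult = evalLiteral('"\\379"', true);

console.log(nonStrictResult);
console.log(strictResult);
```

In an engine that behaves as described in the question, the non-strict call should return a two-character string (code unit 31 followed by "9"), and the strict call should report a SyntaxError. Other engines may differ, which is the point of running it.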

Mike Samuel answered Sep 29 '22 03:09