
Why do browsers parse "<? ?>" as "<!--? ?-->"?

Tags:

html

I have a file with the following HTML code:

<p><? comment ?></p>

Curl returns a normal response:

$ curl file:///path/to/the/file.html
<p><? comment ?></p>

But when I open the file in Firefox 69 or Chrome 77, nothing is displayed, because the parsed HTML is as follows:

<html><head></head><body><p><!--? comment ?--></p></body></html>

This looks very strange to me. Why does it happen?

Thanks.

A'' asked Mar 03 '23 08:03



1 Answer

That's part of the HTML tokenization rules.

The < character makes the browser's tokenizer enter the tag open state.

12.2.5.6 Tag open state

Consume the next input character:

  • U+0021 EXCLAMATION MARK (!)
    • Switch to the markup declaration open state.
  • U+002F SOLIDUS (/)
    • Switch to the end tag open state.
  • ASCII alpha
    • Create a new start tag token, set its tag name to the empty string. Reconsume in the tag name state.
  • U+003F QUESTION MARK (?)
    • This is an unexpected-question-mark-instead-of-tag-name parse error. Create a comment token whose data is the empty string.
    • Reconsume in the bogus comment state.
  • ...

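So your ? character is handled as a known parse error, and the tokenizer then switches to the bogus comment state, which puts everything up to the next > character inside the comment token. Because the ? is reconsumed in that state, it becomes the first character of the comment data, which is why you see <!--? comment ?--> in the resulting document.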
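
To make the state transition concrete, here is a minimal Python sketch of just that path. It is not a real HTML tokenizer: the function names tokenize and serialize are made up for this illustration, tags are captured naively up to the next >, and the markup declaration open state (<!) is not modeled at all.

```python
# Rough sketch of the "tag open" -> "bogus comment" path described above.
# Assumption: only the branches needed for this example are modeled.

def tokenize(html):
    """Return (kind, data) tokens for a tiny subset of HTML."""
    tokens = []
    text = []  # pending character data
    i, n = 0, len(html)

    def flush_text():
        if text:
            tokens.append(("text", "".join(text)))
            text.clear()

    while i < n:
        ch = html[i]
        if ch != "<":
            text.append(ch)
            i += 1
            continue

        # Tag open state: inspect the character that follows "<".
        nxt = html[i + 1] if i + 1 < n else ""
        if nxt == "?":
            # Parse error: unexpected-question-mark-instead-of-tag-name.
            # A comment token is created and "?" is reconsumed in the
            # bogus comment state, so "?" is part of the comment data.
            flush_text()
            end = html.find(">", i + 1)
            if end == -1:
                end = n  # end of input also closes a bogus comment
            tokens.append(("comment", html[i + 1:end]))
            i = end + 1
        elif nxt == "/" or (nxt.isascii() and nxt.isalpha()):
            # Start or end tag; simplified to "everything up to >".
            flush_text()
            end = html.find(">", i + 1)
            if end == -1:
                end = n
            tokens.append(("tag", html[i + 1:end]))
            i = end + 1
        else:
            # Simplification: treat any other "<" as literal text.
            text.append(ch)
            i += 1

    flush_text()
    return tokens


def serialize(tokens):
    """Write tokens back out the way a DOM serializer would show them."""
    parts = []
    for kind, data in tokens:
        if kind == "comment":
            parts.append("<!--" + data + "-->")
        elif kind == "tag":
            parts.append("<" + data + ">")
        else:
            parts.append(data)
    return "".join(parts)


print(serialize(tokenize("<p><? comment ?></p>")))
# <p><!--? comment ?--></p>
```

The comment data starts with ? and ends just before the first >, so the trailing ? of ?> also ends up inside the comment, matching what Firefox and Chrome show.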

Kaiido answered May 01 '23 22:05
