Yesterday a colleague showed me the following postgres query. We were both surprised that it worked:
SELECT* FROM mytable;
Since I recently coded a parser for another language, I am trying to understand in more depth why this query "compiles" and returns the same results as SELECT * FROM mytable;
.
Presumably this is recognized as a valid query because during lexical analysis, postgres reads SELECT
from input as a token, and then searches for the next token, which it finds as *
, and so on - is that more or less what is going on here?
Also, does the postgres lexer/parser just happen to be robust enough to understand this query, or would other databases understand a similar SELECT*
query?
Usually lexers will add characters to the current token until it finds a character that can't belong to the current token, then it quits and starts over from where it couldn't go on before.
So what's going on here is that the lexer gobbles up SELECT
and sees that the next character is a *
which, because it's collecting a word, can't belong with SELECT
. So it stops, analyzes the SELECT
which turns out to be a keyword, and starts over with the *
, which it recognizes, and so on. It's the same reason why you get 4
from both 2*2
and 2 * 2
in other programming languages.
As for whether it will work on other databases, it all depends on the details of the lexical analyzer and the rules of the grammar.
Apparently, the tokenizer tokenizes on white space and special characters used in arithmetic.
Here's the BNF of SELECT statements: h2database.com:
SELECT [ TOP term ] [ DISTINCT | ALL ] selectExpression [,...]
FROM tableExpression [,...] [ WHERE expression ]
[ GROUP BY expression [,...] ] [ HAVING expression ]
[ { UNION [ ALL ] | MINUS | EXCEPT | INTERSECT } select ] [ ORDER BY order [,...] ]
[ LIMIT expression [ OFFSET expression ] [ SAMPLE_SIZE rowCountInt ] ]
[ FOR UPDATE ]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With