
In what order does Python find syntax errors?

I'm working on creating a syntax debugging exercise for students. We have the following example.

def five():
    print('five')
return 5

def hello();
   print('hello')

However, when running the file, the syntax error reported is:

def hello();
           ^
SyntaxError: invalid syntax

I've looked all over but cannot figure out why the compiler [sic] doesn't complain about the return keyword outside of the function, and instead first reports the semicolon error below it.

In what order does Python check the file syntax? Is this part of the specification or is it implementation defined?

Nick Mobley asked Oct 01 '20 18:10




1 Answer

There are (at least) two phases involved: first, the token stream is parsed to produce a parse tree according to the grammar rules. A return statement is part of the flow_stmt rule, which itself is not restricted to being used inside a def statement. It is not a parse error to have a bare return statement. Some selected, relevant rules from the grammar:

single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
file_input: (NEWLINE | stmt)* ENDMARKER
stmt: simple_stmt | compound_stmt

simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
         import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
return_stmt: 'return' [testlist_star_expr]

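The next phase processes the result of parsing: the parse tree is turned into a syntax tree, which is then compiled. It is during this later processing, not during parsing, that a return statement outside of a def statement produces a syntax error (in current CPython the check runs after the syntax tree has been built).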
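The split between the two phases can be observed directly. The following is a minimal sketch, assuming CPython, that compares ast.parse (which only parses) with compile (which also runs the later checks). The file name "<example>" is just a placeholder, and both the message text and the stage that raises it are implementation details.

```python
import ast

source = "return 5\n"

# Parsing alone succeeds: a bare return statement matches the grammar.
tree = ast.parse(source)
node_name = type(tree.body[0]).__name__
print(node_name)

# Compiling to a code object runs the later checks, which reject it.
try:
    compile(source, "<example>", "exec")
    message = None
except SyntaxError as exc:
    message = exc.msg
    print(message)
```

On CPython this reports the return statement only in the second step, which is why an earlier grammar error elsewhere in the file wins.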


A ;, on the other hand, is not part of the definition of funcdef, so a ; in place of the expected : would immediately trigger an error while building the parse tree.

funcdef: 'def' NAME parameters ['->' test] ':' [TYPE_COMMENT] func_body_suite
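To check which line the parser actually stops at, the example from the question can be fed to ast.parse as a string. This is only a sketch, assuming CPython. The wording of the message differs between versions, so it looks at the reported line number rather than the text.

```python
import ast

# The file from the question, including both mistakes.
source = (
    "def five():\n"
    "    print('five')\n"
    "return 5\n"
    "\n"
    "def hello();\n"
    "    print('hello')\n"
)

try:
    ast.parse(source)
    error_line = None
except SyntaxError as exc:
    # The parser gives up at the first token that violates the grammar.
    error_line = exc.lineno
    print(error_line, exc.msg)
```

The reported line is the def hello(); line, not the earlier return 5 line, because parsing never finishes and the later check for a misplaced return is never reached.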

While it may be possible to report the bare return statement earlier, it clearly does not need to happen, so I would say this is an implementation detail.

chepner answered Oct 05 '22 20:10
