
Why can't class attributes be named as reserved words in Python?

It seems reserved words cannot be used as attributes in Python:

$ python
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class A:
...     global = 3
  File "<stdin>", line 2
    global = 3
           ^
SyntaxError: invalid syntax

This seems sensible, since it is ambiguous: am I using the global keyword here? Difficult to say.

But this is not sensible imho:

>>> class A: pass
>>> a = A()
>>> a.global = 3
  File "<stdin>", line 1
    a.global = 3
           ^
SyntaxError: invalid syntax
>>> a.def = 4
  File "<stdin>", line 1
    a.def = 4
        ^
SyntaxError: invalid syntax
>>> a.super = 5
>>> a.abs = 3
>>> a.set = 5
>>> a.False = 5
  File "<stdin>", line 1
    a.False = 5
          ^
SyntaxError: invalid syntax
>>> a.break = 5
  File "<stdin>", line 1
    a.break = 5
          ^
SyntaxError: invalid syntax

Why this limitation? I am not using the reserved words in isolation, but as a class attribute: there is no ambiguity at all. Why would Python care about that?

Daniel Gonzalez asked Oct 25 '17 06:10




2 Answers

It's nowhere near worth it.

Sure, you could allow it. Hack up the tokenizer and the parser so the tokenizer is aware of the parse context and emits NAME tokens instead of keyword tokens when the parser is expecting an attribute access, or just have it always emit NAME tokens instead of keywords after a DOT. But what would that get you?

You'd make the parser and tokenizer more complicated, and thus more bug-prone. You'd make things harder to read for a human reader. You'd restrict future syntax possibilities. You'd cause confusion when

Foo.for = 3

parses and

class Foo:
    for = 3

throws a SyntaxError. You'd make Python less consistent, harder to learn, and harder to understand.

And for all that, you'd gain... the ability to write x.for = 3. The best I can say for this is that it'd prevent something like x.fibble = 3 from breaking upon addition of a fibble keyword, but even then, all other uses of fibble would still break. Not worth it. If you want to use crazy attribute names, you have setattr and getattr.
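A minimal sketch of that setattr/getattr escape hatch, reusing the empty class A from the question (the attribute names are picked only for illustration):

```python
class A:
    pass

a = A()

# Attribute names are ordinary strings at runtime, so a keyword is accepted
# here even though `a.global = 3` is rejected by the parser.
setattr(a, "global", 3)
setattr(a, "for", 4)

print(getattr(a, "global"))  # 3
print(sorted(vars(a)))       # ['for', 'global']
```

The attributes exist on the instance like any other; they just cannot be reached with dot syntax.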


Python does its best to make the syntax simple. Its parser is LL(1) (true of the Python 3.6 shown in the question; CPython 3.9 later switched to a PEG parser), and the restrictions of an LL(1) parser are considered beneficial specifically because they prevent going overboard with crazy grammar rules:

"Simple is better than complex. This idea extends to the parser. Restricting Python's grammar to an LL(1) parser is a blessing, not a curse. It puts us in handcuffs that prevent us from going overboard and ending up with funky grammar rules like some other dynamic languages that will go unnamed, such as Perl."

Something like x.for = 3 is not in keeping with that design philosophy.

user2357112 supports Monica answered Sep 30 '22 20:09


To understand the reasons behind this limitation, you need to understand how computer languages work.

Initially, you have a text file. You feed this text to a string tokenizer (called a lexer), which recognizes lexical elements such as words, operators, comments, numbers, strings and so on. Basically, the lexer is not aware of anything except characters. It converts a text file to a stream of typed tokens.
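As a rough illustration of that first stage, the standard library's tokenize module can lex the line from the question. The helper name lex is made up for this sketch. Note that this module reports keywords as plain NAME tokens, so the line lexes without complaint; the rejection only happens later, when the token stream is parsed.

```python
import io
import tokenize

def lex(source):
    """Return (token type name, token text) pairs for a source string.

    `lex` is a hypothetical helper, not part of the standard library.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]

tokens = lex("a.global = 3\n")
for kind, text in tokens:
    print(kind, repr(text))
```

The first five tokens come out as NAME 'a', OP '.', NAME 'global', OP '=', NUMBER '3'.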

This stream of tokens is then fed into a parser. The parser deals with higher-level constructs, such as method definitions, class definitions, import statements, etc. For example, the parser knows that a function definition starts with "def", followed by a name (a token of type identifier), a parenthesized parameter list, a colon, and a block of indented lines. This means some words such as "def", "return", and "if" are reserved for the parser, because they are part of the language grammar.

The result of parsing is a data structure called an abstract syntax tree (AST). The AST corresponds directly to the contents and structure of the text file. In the AST, there are no keywords, because they have already served their purpose. On the other hand, identifiers (names of variables, functions, etc.) are retained because they are needed later by the compiler or interpreter.
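To see this in Python itself, here is a small sketch using the standard ast module. The function double is an invented example. The "def" and "return" keywords show up only as node types (FunctionDef, Return), while the identifiers survive as data on the nodes.

```python
import ast

source = "def double(x):\n    return x * 2\n"
tree = ast.parse(source)

func = tree.body[0]
print(type(func).__name__)          # FunctionDef
print(func.name)                    # double
print(type(func.body[0]).__name__)  # Return

# Names used inside the function body survive as ast.Name nodes.
names = [node.id for node in ast.walk(tree) if isinstance(node, ast.Name)]
print(names)  # ['x']
```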

In short, keywords exist to give the text its structure. Without structure, it is impossible for a program to deterministically analyze the text. If you try to use a keyword for something else, it breaks the structure. Once the structure has been analyzed, the keywords are no longer needed. Inherently, this means the author of a language has to draw a line and reserve some words for structure, while leaving all others free for the programmer to use.
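Where Python draws that line can be checked with the standard keyword module. The sketch below runs the names from the question through keyword.iskeyword and through compile(), which only checks syntax. The helper assignable and the placeholder name obj are assumptions made for this example.

```python
import keyword

def assignable(name):
    """Return True if `obj.<name> = 0` compiles.

    `assignable` is a hypothetical helper; compile() only parses the
    statement, so `obj` never has to exist.
    """
    try:
        compile(f"obj.{name} = 0", "<example>", "exec")
    except SyntaxError:
        return False
    return True

for name in ["global", "def", "False", "break", "super", "abs", "set"]:
    print(name, keyword.iskeyword(name), assignable(name))
```

This is why a.super, a.abs and a.set work in the question: those are built-in names, not reserved words, so the parser treats them as ordinary identifiers.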

This is not Python-specific. Most languages work the same way. If you didn't have text files, you wouldn't need keywords. Technically, it would be possible for a language to overcome this limitation, but it would complicate things a lot without any real benefit. Having a parser separate from the rest of the language makes so much sense that you just wouldn't want it any other way.

jurez answered Sep 27 '22 20:09