
SyntaxError when accessing column named "class" in pandas DataFrame

I have a pandas DataFrame named 'dataset' that contains a column named 'class'.

When I execute the following line, I get SyntaxError: invalid syntax:

print("Unique values in the Class column:", dataset.class.unique())

It works for other column names, but not for 'class'.

How can I use a keyword as a column name in pandas?

Zizoo asked Jun 14 '26 00:06


1 Answer

class is a keyword in Python. A rule of thumb: whenever a column name cannot be used as a valid variable name in Python, you must use bracket notation to access it: dataset['class'].unique().
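As a minimal sketch of the bracket notation, the snippet below builds a small stand-in for the asker's DataFrame. The column values and the extra 'score' column are made up for illustration; only the 'dataset' and 'class' names come from the question.

```python
import keyword

import pandas as pd

# Hypothetical stand-in for the asker's DataFrame; the values are invented.
dataset = pd.DataFrame({"class": ["a", "b", "a"], "score": [1, 2, 3]})

# "class" is a reserved word, so `dataset.class` cannot even be parsed.
is_reserved = keyword.iskeyword("class")

# Bracket notation takes the column name as a plain string, so it always works.
unique_classes = dataset["class"].unique()
print("Unique values in the class column:", unique_classes)
```

keyword.iskeyword from the standard library is a quick way to check whether a given column name is reserved.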

There are, of course, exceptions here, but they work against you. For example, min and max are valid variable names in Python (even though they shadow builtins). In pandas, however, you cannot refer to a column with such a name using attribute access notation, because the name conflicts with an existing method. There are more such exceptions; they are enumerated in the documentation.
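To illustrate the min case, here is a small sketch with an invented DataFrame (the name 'df' and its values are assumptions, not from the question). Attribute access returns the DataFrame's own method, while bracket notation returns the column.

```python
import pandas as pd

# Hypothetical data: one column is deliberately named "min".
df = pd.DataFrame({"min": [3, 1, 2], "value": [10, 20, 30]})

# Attribute access finds the DataFrame.min method first, not the column.
attr = df.min

# Bracket notation is unambiguous and returns the column as a Series.
col = df["min"]

print(callable(attr))          # the method, so this is callable
print(isinstance(col, pd.Series))
```

No error is raised in the attribute case, which makes this kind of mistake easy to miss.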

A good place to begin further reading is the documentation on Attribute Access, specifically the red Warning box, which I'm adding here for posterity:

  • You can use this access only if the index element is a valid Python identifier, e.g. s.1 is not allowed. See here for an explanation of valid identifiers.

  • The attribute will not be available if it conflicts with an existing method name, e.g. s.min is not allowed, but s['min'] is possible.

  • Similarly, the attribute will not be available if it conflicts with any of the following list: index, major_axis, minor_axis, items.

  • In any of these cases, standard indexing will still work, e.g. s['1'], s['min'], and s['index'] will access the corresponding element or column.
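The rules in that warning can be approximated with a small check. This is only a rough heuristic under stated assumptions: can_use_attribute is a hypothetical helper, not part of pandas, and it treats any attribute already defined on the DataFrame class as a conflict.

```python
import keyword

import pandas as pd


def can_use_attribute(df: pd.DataFrame, name) -> bool:
    """Rough guess at whether df.<name> would return the column `name`.

    Hypothetical helper for illustration; pandas does not provide it.
    """
    return (
        isinstance(name, str)
        and name.isidentifier()            # rules out names such as "1"
        and not keyword.iskeyword(name)    # rules out "class", "for", ...
        and not hasattr(type(df), name)    # rules out "min", "index", "items", ...
        and name in df.columns
    )


# Invented columns covering the cases from the warning box.
df = pd.DataFrame(columns=["class", "min", "1", "index", "score"])

for column in df.columns:
    print(column, can_use_attribute(df, column))
```

When the check returns False, or whenever in doubt, df[name] remains the safe choice.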

cs95 answered Jun 16 '26 13:06


