In both of the cases below:

    import pandas

    d = {'col1': 2, 'col2': 2.5}
    df = pandas.DataFrame(data=d, index=[0])

    print(df['col2'])
    print(df.col2)
Both methods can be used to index on a column and yield the same result, so is there any difference between them?
Dot notation is used most often because it is less verbose and easier to read. The main difference between the two is that bracket notation lets us access a property through a variable that holds its name.
Dot notation covers a strict subset of what the brackets can do. The brackets are also the canonical way to "select subsets of data" from all objects in Python: strings, tuples, lists, dictionaries, and NumPy arrays all use brackets to select subsets of data. (Source: medium.com, "Selecting Subsets of Data in Pandas: Part 1".)
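To illustrate the point about brackets being the common selection syntax, here is a minimal sketch. The variable names and sample values are made up for this example and do not come from the original post.

```python
import numpy as np

# Illustrative sample objects (hypothetical values)
text = "pandas"
pair = (10, 20)
items = [1, 2, 3, 4]
mapping = {"col1": 2, "col2": 2.5}
array = np.array([[1, 2], [3, 4]])

# Every one of these uses brackets to select a subset of the data
first_char = text[0]       # first character of the string
last_of_pair = pair[-1]    # last element of the tuple
middle = items[1:3]        # slice of the list
value = mapping["col2"]    # dictionary lookup by key
second_row = array[1]      # second row of the NumPy array
```

pandas follows the same convention, which is why df['col2'] works on a DataFrame.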
When several columns are selected at once, as in df[['col1', 'col2']], the inner square brackets define a Python list of column names, whereas the outer brackets select the data from the pandas DataFrame.
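A small sketch of the inner-list and outer-bracket distinction, reusing the col1 and col2 names from the question. The variable names single, subset and one_col_frame are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2], "col2": [2.5]})

# A single label inside the brackets returns a Series
single = df["col2"]

# A list of labels inside the brackets returns a DataFrame
subset = df[["col1", "col2"]]

# A one-element list still returns a DataFrame, not a Series
one_col_frame = df[["col2"]]
```

Dot notation has no equivalent for the list form, since an attribute name can only refer to one column.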
We must use bracket notation whenever we access a property through a variable, or when the property's key is a number, includes a symbol, or consists of two words separated by a space.
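Applied to pandas, those cases might look like the sketch below. The column names "unit price" and 0 and the variable name are hypothetical, added only to show keys that cannot be written as attributes.

```python
import pandas as pd

# Hypothetical frame with a normal name, a name containing a space,
# and an integer column label
df = pd.DataFrame({"col2": [2.5], "unit price": [9.99], 0: [1]})

# Column name held in a variable: brackets use the variable's value
name = "col2"
by_variable = df[name]

# Column name with a space: not a valid Python identifier
with_space = df["unit price"]

# Integer column label: df.0 would be a syntax error
numeric_key = df[0]

# df.name would look for something literally called "name",
# which does not exist on this frame
has_name_attr = hasattr(df, "name")
```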
The "dot notation", i.e. df.col2, is attribute access that is exposed as a convenience.
You may access an index on a Series, a column on a DataFrame, and (in older pandas versions that still had it) an item on a Panel directly as an attribute.
df['col2'] does the same: it returns a pd.Series of the column.
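The equivalence can be checked directly. This is a minimal sketch built on the DataFrame from the question; the names by_bracket, by_dot, same_type and same_values are introduced here for illustration.

```python
import pandas as pd

d = {"col1": 2, "col2": 2.5}
df = pd.DataFrame(data=d, index=[0])

by_bracket = df["col2"]
by_dot = df.col2

# Both forms return a Series with the same contents
same_type = type(by_bracket) is type(by_dot)
same_values = by_bracket.equals(by_dot)
```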
A few caveats about attribute access:
You cannot add a column this way: df.new_col = x won't work. Worse, it creates a new attribute on the object rather than a column, without raising an error (think monkey-patching here).
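The caveat about assigning through an attribute can be reproduced with a short sketch. The value [99] and the variable names are illustrative. As far as I know, recent pandas versions emit a UserWarning for list-like values in this situation, so the example suppresses warnings around that line; treat that detail as an assumption about your installed version.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"col1": [2], "col2": [2.5]})

with warnings.catch_warnings():
    # Assumption: some pandas versions warn here; ignore it for the demo
    warnings.simplefilter("ignore")
    df.new_col = [99]  # sets a plain Python attribute, not a column

created_by_dot = "new_col" in df.columns

# Bracket assignment is the way to actually add the column
df["new_col"] = [99]
created_by_bracket = "new_col" in df.columns
```

After the first assignment the frame still has only col1 and col2; only the bracket form changes its columns.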