If I import or create a pandas column that contains no spaces, I can access it as such:
from pandas import DataFrame df1 = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'], 'data1': range(7)}) df1.data1
which would return that series for me. If, however, that column has a space in its name, it isn't accessible via that method:
from pandas import DataFrame df2 = DataFrame({'key': ['a','b','d'], 'data 2': range(3)}) df2.data 2 # <--- not the droid I'm looking for.
I know I can access it using .xs():
df2.xs('data 2', axis=1)
There's got to be another way. I've googled it like mad and can't think of any other way to google it. I've read all 96 entries here on SO that contain "column" and "string" and "pandas" and could find no previous answer. Is this the only way, or is there something better?
You can refer to column names that contain spaces or operators by surrounding them in backticks. This way you can also escape names that start with a digit, or those that are a Python keyword.
To strip whitespaces from column names, you can use str. strip, str. lstrip and str. rstrip.
You can get the column names from pandas DataFrame using df. columns. values , and pass this to python list() function to get it as list, once you have the data you can print it using print() statement.
Old post, but may be interesting: an idea (which is destructive, but does the job if you want it quick and dirty) is to rename columns using underscores:
df1.columns = [c.replace(' ', '_') for c in df1.columns]
I think the default way is to use the bracket method instead of the dot notation.
import pandas as pd df1 = pd.DataFrame({ 'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'], 'dat a1': range(7) }) df1['dat a1']
The other methods, like exposing it as an attribute are more for convenience.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With