
Can't access dataframe columns

I'm importing a dataframe from a CSV file, but cannot access some of its columns by name. What's going on?

In more concrete terms:

> import pandas

> jobNames = pandas.read_csv("job_names.csv")
> print(jobNames)

   job_id   job_name   num_judgements
0  933985        Foo              180
1  933130        Moo              175
2  933123        Goo              150
3  933094       Flue              120
4  933088        Tru              120

When I try to access the second column, I get an error:

> jobNames.job_name

AttributeError: 'DataFrame' object has no attribute 'job_name'

Strangely, I can access the job_id column thus:

> print(jobNames.job_id)

0    933985
1    933130
2    933123
3    933094
4    933088
Name: job_id, dtype: int64

Edit (to put the accepted answer in context):

It turns out that the first row of the csv file (with the column names) looks like this:

job_id, job_name, num_judgements

Note the spaces after each comma! Those spaces are retained in the column names:

> jobNames.columns[1]

' job_name'

which aren't valid Python identifiers, so those columns aren't available as dataframe attributes. I can still access them dict-style:

> jobNames[' job_name']
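
A minimal sketch that reproduces the behaviour, assuming a recent pandas. The in-memory `csv_text` is a hypothetical stand-in for job_names.csv (only two rows, same header), and the snake_case variable names are illustrative rather than taken from the original session:

```python
import io

import pandas as pd

# Hypothetical stand-in for job_names.csv: note the space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

job_names = pd.read_csv(io.StringIO(csv_text))

# Printing the names as a list shows their repr, which makes the
# leading spaces visible.
column_names = list(job_names.columns)
print(column_names)

# Attribute access looks up the exact name 'job_name', which is not a column.
has_attr = hasattr(job_names, "job_name")
print(has_attr)

# Bracket access works as long as the key includes the leading space.
job_name_column = job_names[" job_name"]
print(len(job_name_column))
```

Printing `list(df.columns)` rather than the dataframe itself is a quick way to spot stray whitespace, because the aligned table output hides it.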
drevicko asked Aug 11 '16 10:08



1 Answer

When calling pandas.read_csv, pass skipinitialspace=True to skip the whitespace that follows each delimiter. The column names (and the values) are then read without the leading space.
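
As a rough sketch of that fix, assuming the header shown in the question: `csv_text` below is a hypothetical in-memory stand-in for job_names.csv. The second half shows an alternative that is not part of the answer itself, namely stripping the column names after loading with `Index.str.strip()`.

```python
import io

import pandas as pd

# Hypothetical stand-in for job_names.csv, with a space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

# skipinitialspace=True drops the spaces that follow each delimiter,
# in the header row and in the data rows alike.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(list(clean.columns))
print(clean.job_name.tolist())

# Alternative (an assumption, not from the answer): load as-is, then strip
# whitespace from the column names only. Cell values are not touched by this.
raw = pd.read_csv(io.StringIO(csv_text))
raw.columns = raw.columns.str.strip()
print(list(raw.columns))
```

The `skipinitialspace` route is usually the tidier one when the file consistently has a space after every comma, since it cleans the string values as well as the header.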

Maxim Egorushkin answered Oct 05 '22 18:10