
Indexing a pandas dataframe by integer

Tags: python, pandas

I can't seem to find an elegant way to index a pandas.DataFrame by an integer index. In the following example I want to get the value 'a' from the first element of the 'A' column.

import pandas
df = pandas.DataFrame(
    {'A':['a','b', 'c'], 'B':['f', 'g', 'h']}, 
    index=[10,20,30]
    )

I would expect df['A'].ix[0] and df['A'][10] both to return 'a'. The df['A'][10] does return 'a', but df['A'].ix[0] throws a KeyError: 0. The only way I could think of to get the value 'a' based on the index 0 is to use the following approach.

df['A'][df['A'].index[0]]

Is there a shorter way to get 'a' out of the dataframe, using the 0 index?

Update

As of pandas 0.11 there is another way to index by integer.

df.iloc[0] # integer based, gives the first row
df.loc[10] # label based, gives the row with label 10

This supersedes the irow approach.
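As a minimal sketch of the update above, applied to the example frame from the question: .iloc selects by position and .loc by label, both on the whole frame and on a single column. The .at accessor is not mentioned in the original post; it is included here only as an optional label-based scalar lookup.

```python
import pandas as pd

# Same frame as in the question: labels are 10, 20, 30, not 0, 1, 2.
df = pd.DataFrame(
    {'A': ['a', 'b', 'c'], 'B': ['f', 'g', 'h']},
    index=[10, 20, 30],
)

first_row = df.iloc[0]   # position based: the first row, as a Series
same_row = df.loc[10]    # label based: the row whose label is 10

value_by_position = df['A'].iloc[0]  # first element of column 'A'
value_by_label = df.at[10, 'A']      # scalar lookup by row label and column name
```

Both value_by_position and value_by_label hold the 'a' the question asks for, without going through df['A'].index[0].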

SiggyF asked Jul 23 '12 21:07




1 Answer

You get an error with df['A'].ix[0] because your index doesn't start at 0; it starts at 10. You can get the value you want with either of the following:

df['A'].ix[10]
df['A'].irow(0)

The first looks the value up by its index label. The second, which I suspect is what you want, finds the value by row number rather than by index value, and is technically only two characters longer than df['A'].ix[0] would be if it worked.
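Later pandas releases removed both .ix and irow, so the two commands above no longer run on a current install. A rough equivalent, assuming the df from the question, uses .loc for the label lookup and .iloc for the row-number lookup. The sketch also shows why a label lookup of 0 fails on this frame.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': ['a', 'b', 'c'], 'B': ['f', 'g', 'h']},
    index=[10, 20, 30],
)

# Label lookup: 0 is not among the labels 10, 20, 30, so this raises KeyError.
try:
    df['A'].loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

by_label = df['A'].loc[10]     # plays the role of df['A'].ix[10]
by_position = df['A'].iloc[0]  # plays the role of df['A'].irow(0)
```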

Alternatively, you can reset the indices so that they will respond the way you expect for df['A'].ix[0]:

df2=df.reset_index()

This will preserve your old indices (10, 20, etc.) by moving them into a column called "index" in the df2 data frame. Then df2['A'].ix[0] will return 'a'. If you want to discard the old indices (10, 20, 30) instead, pass drop=True to reset_index.
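The two reset_index variants can be compared side by side. This is a sketch assuming the df from the question; df3 is a name introduced here for illustration, and .loc stands in for .ix, which newer pandas versions no longer provide.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': ['a', 'b', 'c'], 'B': ['f', 'g', 'h']},
    index=[10, 20, 30],
)

# Default: the old labels move into a new column named "index".
df2 = df.reset_index()
kept_labels = list(df2['index'])   # the old labels 10, 20, 30
first_value = df2['A'].loc[0]      # label 0 now exists

# drop=True: the old labels are discarded instead of kept as a column.
df3 = df.reset_index(drop=True)
```

Neither call modifies df itself; each returns a new frame with a fresh 0-based index.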

Michelle Lynn Gill answered Oct 18 '22 20:10