
In pandas, how do I get the index of a known value?

If we have a known value in a column, how can we get its index value? For example:

In [148]: a = pd.DataFrame(np.arange(10).reshape(5,2),columns=['c1','c2'])
In [149]: a
Out[149]:   
   c1  c2
0   0   1
1   2   3
2   4   5
........

As we know, we can get a value from its index, like this:

In [151]: a.iloc[0,1]  In [152]: a.c2[0]   In [154]: a.c2.loc[0]  <--  use index
Out[151]: 1            Out[152]: 1         Out[154]: 1            <--  get value

But how do we get the index from a value?

user2407991 asked May 22 '13 04:05



4 Answers

Using the .loc[] accessor:

In [25]: a.loc[a['c1'] == 8].index[0]
Out[25]: 4

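You can also use get_loc() after setting 'c1' as the index. set_index() returns a new DataFrame, so the original is unchanged. Note that get_loc() returns the integer position, which matches the index label here only because the frame has the default 0..n-1 index.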

In [17]: a.set_index('c1').index.get_loc(8)
Out[17]: 4
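
The difference between the two approaches shows up once the index is not the default one. A minimal sketch, assuming the same data as in the question but with hypothetical string labels 'v' to 'z' (not in the original): the .loc form returns the index label, while get_loc() returns the integer position.

```python
import numpy as np
import pandas as pd

# Same values as the question's frame; the string index labels are
# an assumption added here purely for illustration.
a = pd.DataFrame(np.arange(10).reshape(5, 2),
                 columns=['c1', 'c2'],
                 index=list('vwxyz'))

# Boolean selection with .loc, then take the first matching index label.
label = a.loc[a['c1'] == 8].index[0]

# get_loc() on an index built from 'c1' gives the integer position of 8.
position = a.set_index('c1').index.get_loc(8)

print(label, position)
```

If the label is what you need, the .loc form is the safer choice; the position is useful with .iloc.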
gxpr answered Oct 11 '22 22:10


More than one index label may map to your value, so it makes more sense to return a list:

In [48]: a
Out[48]: 
   c1  c2
0   0   1
1   2   3
2   4   5
3   6   7
4   8   9

In [49]: a.c1[a.c1 == 8].index.tolist()
Out[49]: [4]
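
To see why a list is the natural return type, here is a small sketch with a hypothetical frame in which the value 8 appears several times (the data is made up for illustration). A value that does not occur simply yields an empty list rather than an error.

```python
import pandas as pd

# Hypothetical data: 8 occurs three times in 'c1'.
s = pd.DataFrame({'c1': [0, 8, 4, 8, 8], 'c2': [1, 3, 5, 7, 9]})

# Index the frame's index with a boolean mask to get every matching label.
matches = s.index[s['c1'] == 8].tolist()

# A value that is absent gives an empty list.
missing = s.index[s['c1'] == 99].tolist()

print(matches, missing)
```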
waitingkuo answered Oct 11 '22 22:10


Another way is to use numpy.where(). It returns integer positions, so the result is passed to .iloc to recover the index labels:

import numpy as np
import pandas as pd

In [800]: df = pd.DataFrame(np.arange(10).reshape(5,2),columns=['c1','c2'])

In [801]: df
Out[801]: 
   c1  c2
0   0   1
1   2   3
2   4   5
3   6   7
4   8   9

In [802]: np.where(df["c1"]==6)
Out[802]: (array([3]),)

In [803]: indices = list(np.where(df["c1"]==6)[0])

In [804]: df.iloc[indices]
Out[804]: 
   c1  c2
3   6   7

In [805]: df.iloc[indices].index
Out[805]: Int64Index([3], dtype='int64')

In [806]: df.iloc[indices].index.tolist()
Out[806]: [3]
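
A slightly shorter variant of the same idea, as a sketch: np.flatnonzero() returns the positions as a flat array, and indexing df.index with those positions gives the labels directly. The non-default index labels 10 to 50 are an assumption added here so that positions and labels are visibly different.

```python
import numpy as np
import pandas as pd

# Same values as above; the index labels 10..50 are hypothetical.
df = pd.DataFrame(np.arange(10).reshape(5, 2),
                  columns=['c1', 'c2'],
                  index=[10, 20, 30, 40, 50])

# Integer positions of the rows where c1 == 6.
positions = np.flatnonzero(df['c1'].to_numpy() == 6)

# Translate positions into index labels.
labels = df.index[positions].tolist()

print(positions, labels)
```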
Surya answered Oct 11 '22 23:10


To get the index by value, simply add .index[0] to the end of a query. This returns the index of the first row of the result.

So, applied to your dataframe:

In [1]: a[a['c2'] == 1].index[0]     In [2]: a[a['c1'] > 7].index[0]   
Out[1]: 0                            Out[2]: 4                         

Where the query returns more than one row, the other index values can be accessed by position, e.g. .index[n]:

In [3]: a[a['c2'] >= 7].index[1]     In [4]: a[(a['c2'] > 1) & (a['c1'] < 8)].index[2]  
Out[3]: 4                            Out[4]: 3 
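
One caveat with .index[0]: if the query matches no rows, it raises IndexError. A minimal sketch of a guard, where first_index is a hypothetical helper name (not part of pandas) that returns a default instead:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(10).reshape(5, 2), columns=['c1', 'c2'])

def first_index(frame, mask, default=None):
    """Return the index label of the first row where mask is True, else default."""
    hits = frame.index[mask]
    return hits[0] if len(hits) else default

found = first_index(a, a['c1'] > 7)     # one row matches
absent = first_index(a, a['c1'] > 100)  # no row matches

print(found, absent)
```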
RumbleFish answered Oct 11 '22 22:10