Getting the integer index of a Pandas DataFrame row fulfilling a condition?

I have the following DataFrame:

   a  b  c
b
2  1  2  3
5  4  5  6

As you can see, column b is used as an index. I want to get the ordinal number of the row fulfilling ('b' == 5), which in this case would be 1.

The column being tested can be either an index column (as with b in this case) or a regular column, e.g. I may want to find the index of the row fulfilling ('c' == 6).

Dun Peal asked Aug 13 '13 01:08



2 Answers

Use Index.get_loc instead.

Reusing @unutbu's setup code, this gives the same result:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame(np.arange(1,7).reshape(2,3),
                  columns = list('abc'),
                  index=pd.Series([2,5], name='b'))
>>> df
   a  b  c
b
2  1  2  3
5  4  5  6
>>> df.index.get_loc(5)
1
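The question also asks about regular columns, which get_loc does not cover directly since it is a method of Index. One possible workaround, shown here as a sketch rather than an established idiom, is to wrap the column in a temporary pd.Index. Note that get_loc returns a plain integer only when the label is unique; for repeated labels it returns a slice or a boolean mask, and it raises KeyError when the label is missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 7).reshape(2, 3),
                  columns=list('abc'),
                  index=pd.Series([2, 5], name='b'))

# Index column: ordinal position of the row labelled 5.
pos_index = df.index.get_loc(5)

# Regular column: build a temporary Index from the column's values
# (an assumption of this sketch, not part of the original answer).
pos_column = pd.Index(df['c']).get_loc(6)

print(pos_index, pos_column)  # 1 1
```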
hlin117 answered Sep 22 '22 15:09


You could use np.where like this:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(1,7).reshape(2,3),
                  columns = list('abc'),
                  index=pd.Series([2,5], name='b'))
print(df)
#    a  b  c
# b
# 2  1  2  3
# 5  4  5  6
print(np.where(df.index==5)[0])
# [1]
print(np.where(df['c']==6)[0])
# [1]

The value returned is an array since there could be more than one row with a particular index or value in a column.
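To illustrate the multiple-match case, here is a small sketch using a hypothetical frame (dup, not from the question) in which both the index label 5 and the value 6 in column c occur twice:

```python
import numpy as np
import pandas as pd

# Hypothetical data: index label 5 and value 6 in 'c' both appear twice.
dup = pd.DataFrame({'a': [1, 4, 7], 'c': [3, 6, 6]},
                   index=pd.Series([2, 5, 5], name='b'))

index_hits = np.where(dup.index == 5)[0]
column_hits = np.where(dup['c'] == 6)[0]
print(index_hits)   # [1 2]
print(column_hits)  # [1 2]

# If exactly one match is expected, take the first position explicitly.
first_hit = int(index_hits[0])
print(first_hit)    # 1
```

If no row matches, the array is empty, so indexing it with [0] raises an IndexError; checking its length first avoids that.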

unutbu answered Sep 25 '22 15:09