Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Return the position (row and column) of a data group in a DataFrame [Pandas]

Very good morning to all. I am working with Pandas, I have the following DataFrame:

import pandas as pd

df = pd.DataFrame([[90.0,10.0,10.0,20.0,10.0], [30.0,30.0,20.0,70.0,20.0], [30.0,20.0,10.0,20.0,10.0], [20.0,10.0,80.0,10.0,0.0]], index=['v_12','v_23','v_34','v_15'], columns=['C_1','C_2','C_3','C_4','C_5'])

       C_1   C_2   C_3   C_4   C_5
v_12  90.0  10.0  10.0  20.0  10.0
v_23  30.0  30.0  20.0  70.0  20.0
v_34  30.0  20.0  10.0  20.0  10.0
v_15  20.0  10.0  80.0  10.0   0.0

And I'm looking for some method/function in Pandas that allows me to know the position (row and column) of the values that are greater than or equal to 70 (I know this can be solved using loops and conditional statements, but I want to preside over them.)

I would like to get a result similar to this:

   Value   Row     Colum
0  90.0    v_12    C_1
1  70.0    v_23    C_4
2  80.0    v_15    C_3

I know that using the "where" method, I can get a DataFrame like this:

df.where(df>=70)

       C_1  C_2   C_3   C_4  C_5
v_12  90.0  NaN   NaN   NaN  NaN
v_23   NaN  NaN   NaN  70.0  NaN
v_34   NaN  NaN   NaN   NaN  NaN
v_15   NaN  NaN  80.0   NaN  NaN

But the truth is I don't know how to solve this exercise, without making use of a cycle that goes through element by element.

If you know of any function that can help me obtain a similar result, I would greatly appreciate your collaboration. Thanks.

like image 470
Johannes C. Avatar asked Jan 24 '23 08:01

Johannes C.


1 Answers

You could do:

df.reset_index().melt(id_vars = 'index', var_name = 'column').query('value>=70')

   index column  value
0   v_12    C_1   90.0
11  v_15    C_3   80.0
13  v_23    C_4   70.0

Or

df.stack().reset_index(name='value').query('value>=70')

   level_0 level_1  value
0     v_12     C_1   90.0
8     v_23     C_4   70.0
17    v_15     C_3   80.0

The above is credited to @Henry Ecker

You could also do:

ind = np.where(df>=70)

pd.DataFrame([(val, df.index[row], df.columns[col]) \
              for val, (row, col) in zip(df.values[ind], zip(*ind))])
      0     1    2
0  90.0  v_12  C_1
1  70.0  v_23  C_4
2  80.0  v_15  C_3
like image 58
KU99 Avatar answered Feb 15 '23 23:02

KU99