Very good morning to all. I am working with Pandas, I have the following DataFrame:
import pandas as pd
df = pd.DataFrame([[90.0,10.0,10.0,20.0,10.0], [30.0,30.0,20.0,70.0,20.0], [30.0,20.0,10.0,20.0,10.0], [20.0,10.0,80.0,10.0,0.0]], index=['v_12','v_23','v_34','v_15'], columns=['C_1','C_2','C_3','C_4','C_5'])
C_1 C_2 C_3 C_4 C_5
v_12 90.0 10.0 10.0 20.0 10.0
v_23 30.0 30.0 20.0 70.0 20.0
v_34 30.0 20.0 10.0 20.0 10.0
v_15 20.0 10.0 80.0 10.0 0.0
And I'm looking for some method/function in Pandas that allows me to know the position (row and column) of the values that are greater than or equal to 70 (I know this can be solved using loops and conditional statements, but I want to preside over them.)
I would like to get a result similar to this:
Value Row Colum
0 90.0 v_12 C_1
1 70.0 v_23 C_4
2 80.0 v_15 C_3
I know that using the "where" method, I can get a DataFrame like this:
df.where(df>=70)
C_1 C_2 C_3 C_4 C_5
v_12 90.0 NaN NaN NaN NaN
v_23 NaN NaN NaN 70.0 NaN
v_34 NaN NaN NaN NaN NaN
v_15 NaN NaN 80.0 NaN NaN
But the truth is I don't know how to solve this exercise, without making use of a cycle that goes through element by element.
If you know of any function that can help me obtain a similar result, I would greatly appreciate your collaboration. Thanks.
You could do:
df.reset_index().melt(id_vars = 'index', var_name = 'column').query('value>=70')
index column value
0 v_12 C_1 90.0
11 v_15 C_3 80.0
13 v_23 C_4 70.0
Or
df.stack().reset_index(name='value').query('value>=70')
level_0 level_1 value
0 v_12 C_1 90.0
8 v_23 C_4 70.0
17 v_15 C_3 80.0
The above is credited to @Henry Ecker
You could also do:
ind = np.where(df>=70)
pd.DataFrame([(val, df.index[row], df.columns[col]) \
for val, (row, col) in zip(df.values[ind], zip(*ind))])
0 1 2
0 90.0 v_12 C_1
1 70.0 v_23 C_4
2 80.0 v_15 C_3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With