I am trying to return a column name and index based on the item value. I have something like this:
So let's day I am trying to return index and column names of all values where value is > 0.75.
for date, row in df.iterrows():
for item in row:
if item > .75:
print index, row
I wanted this to return "traffic and robbery". However this returns all the values. I did not find answer to this in documentation, online or here. Thank you in advance.
Using slightly different numbers (for no particular reason), you can stack to for a Series and then use boolean indexing:
In [11]: df.stack()
Out[11]:
assault assault 1.00
robbery 0.76
traffic 0.60
robbery assault 0.76
robbery 1.00
traffic 0.78
traffic assault 0.68
robbery 0.78
traffic 1.00
dtype: float64
In [12]: s = df.stack()
In [13]: s[(s!=1) & (s>0.77)]
Out[13]:
robbery traffic 0.78
traffic robbery 0.78
dtype: float64
You can do a bit of numpy to remove the duplicates, one way* is to 0 those not in the upper diagonal with triu (unfortunately this doesn't return as a DataFrame :( ):
In [21]: np.triu(df, 1)
Out[21]:
array([[ 0. , 0.76, 0.6 ],
[ 0. , 0. , 0.78],
[ 0. , 0. , 0. ]])
In [22]: s = pd.DataFrame(np.triu(df, 1), df.index, df.columns).stack() > 0.77
In [23]: s[s]
Out[23]:
robbery traffic True
dtype: bool
In [24]: s[s].index.tolist()
Out[24]: [('robbery', 'traffic')]
*I suspect there are more efficient ways...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With