How to print rows if values appear in any column of pandas dataframe

I would like to print all rows of a dataframe where I find the value '-' in any of the columns. Can someone please explain a way that is better than those described below?

This Q&A already explains how to do so by using boolean indexing but each column needs to be declared separately:

print(df.loc[df['A'].isin(['-']) | df['B'].isin(['-']) | df['C'].isin(['-'])])

I tried the following, but I get the error 'Cannot index with multidimensional key':

df.loc[df[df.columns.values].isin(['-'])]

So I used the code below, but I'm not happy with it: it prints separately for each column tested, which is harder to work with and can print the same row more than once:

import pandas as pd

d = {'A': [1, 2, 3], 'B': [4, '-', 6], 'C': [7, 8, '-']}
df = pd.DataFrame(d)

# Test each column in turn and print the rows where that column holds '-'
for col in df.columns:
    temp = df.loc[df[col].isin(['-'])]
    if not temp.empty:
        print(temp)

Output looks like this:

   A  B  C
1  2  -  8

[1 rows x 3 columns]

   A  B  C
2  3  6  -

[1 rows x 3 columns]

Thanks for your advice.

KieranPC asked Jul 09 '14 22:07



2 Answers

Alternatively, you could do something like df[df.isin(["-"]).any(axis=1)], e.g.

>>> df = pd.DataFrame({'A': [1,2,3], 'B': ['-','-',6], 'C': [7,8,9]})
>>> df.isin(["-"]).any(axis=1)
0     True
1     True
2    False
dtype: bool
>>> df[df.isin(["-"]).any(axis=1)]
   A  B  C
0  1  -  7
1  2  -  8

(Note I changed the frame a bit so I wouldn't get the axes wrong.)
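As a minimal sketch, here is the same approach applied to the frame from the question rather than the modified one. The variable names mask, row_has_dash and matches are only illustrative:

```python
import pandas as pd

# The frame from the question
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, '-', 6], 'C': [7, 8, '-']})

# Element-wise test: a boolean frame, True wherever a cell equals '-'
mask = df.isin(['-'])

# Reduce across the columns so each row gets a single boolean
row_has_dash = mask.any(axis=1)

# A one-dimensional boolean Series is a valid row selector
matches = df[row_has_dash]
print(matches)
```

The reduction with any(axis=1) is the key step: indexing with the two-dimensional mask directly is what triggers the 'Cannot index with multidimensional key' error mentioned in the question. Each matching row appears once, even if several of its columns contain '-'.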

DSM answered Oct 05 '22 23:10


You can do:

>>> idx = df.apply(lambda ts: any(ts == '-'), axis=1)
>>> df[idx]
   A  B  C
1  2  -  8
2  3  6  -

or use this as the lambda instead:

lambda ts: '-' in ts.values

Note that the in operator applied to a Series looks at its index, not its values, so you need .values.
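A short sketch of that difference, using a hand-built Series that stands in for one row of the frame (the names row, in_index, in_values and label_in_index are made up for illustration):

```python
import pandas as pd

# One row of the question's frame, as apply(..., axis=1) would pass it in
row = pd.Series([3, 6, '-'], index=['A', 'B', 'C'])

in_index = '-' in row          # tests the index labels A, B, C
label_in_index = 'A' in row    # a label, so this one is found
in_values = '-' in row.values  # tests the underlying array of values

print(in_index, label_in_index, in_values)

# The same membership test used row by row on the whole frame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, '-', 6], 'C': [7, 8, '-']})
idx = df.apply(lambda ts: '-' in ts.values, axis=1)
print(df[idx])
```

Without .values the lambda would return False for every row here, because none of the column labels is '-'.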

behzad.nouri answered Oct 05 '22 22:10