
Compare dataframe columns to series

I have a DataFrame and a Series, and want to compare each column of the DataFrame against the Series.

Dataframe (df) looks like:

1   1 4 7
2   2 3 1
3   2 3 9

Series (s) looks like:

1   3
2   4 
3   2

I want a boolean comparison that is True wherever a column's value is less than the Series value in the same row:

1   T F F
2   T T T
3   F F F

Of course I could loop over the columns, but is there a simpler way?

Lothar, asked Jan 15 '18 23:01




3 Answers

Use lt, which lets you specify the axis to align on.

df.lt(s, axis=0)

       1      2      3                   
1   True  False  False
2   True   True   True
3  False  False  False

The axis defaults to 1 (columns), so the overloaded < operator, which always uses that default, matches the Series index against the column labels instead of the row index. As DYZ mentioned in a comment, having the axis default to 1 here is an exception, because it usually defaults to 0 in other methods such as apply and transform.
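To make the difference between the two axes concrete, here is a minimal sketch. It assumes the question's data uses the labels 1, 2, 3 for both the rows and the columns (as the printed output above suggests); the variable names by_row and by_col are only for illustration.

```python
import pandas as pd

# Data from the question; the labels 1, 2, 3 are an assumption
# based on the printed output.
df = pd.DataFrame([[1, 4, 7], [2, 3, 1], [2, 3, 9]],
                  index=[1, 2, 3], columns=[1, 2, 3])
s = pd.Series([3, 4, 2], index=[1, 2, 3])

# axis=0: s is matched against the row index, so every value in a
# row is compared with the Series value for that row.
by_row = df.lt(s, axis=0)

# The < operator behaves like axis=1: s is matched against the
# column labels, so every value in a column is compared with the
# Series value for that column.
by_col = df < s

print(by_row)
print(by_col)
```

Because the row and column labels happen to coincide here, both calls run without complaint, but only by_row answers the question.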


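If the Series and DataFrame indexes don't align, you can work around that by comparing against s.values instead. This drops the Series index and compares by position, so s must hold exactly one value per row, in row order.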

df.lt(s.values, axis=0)

       1      2      3                   
1   True  False  False
2   True   True   True
3  False  False  False
cs95, answered Oct 04 '22 08:10


(df.T<s).T
#       0      1      2
#0   True  False  False
#1   True   True   True
#2  False  False  False
DYZ, answered Oct 04 '22 07:10


Use [:, None] to turn your Series values into a column vector, so that NumPy broadcasts them across the columns:

df.values<s.values[:,None]
Out[513]: 
array([[ True, False, False],
       [ True,  True,  True],
       [False, False, False]])
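The NumPy comparison returns a bare array, so the row and column labels are lost. If you need them back, one option is to wrap the result in a new DataFrame. This is a sketch using the question's data, with the labels 1, 2, 3 assumed and the names mask and result chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame([[1, 4, 7], [2, 3, 1], [2, 3, 9]],
                  index=[1, 2, 3], columns=[1, 2, 3])
s = pd.Series([3, 4, 2], index=[1, 2, 3])

# s.values has shape (3,); [:, None] reshapes it to (3, 1), which
# broadcasts against the (3, 3) array of DataFrame values row by row.
mask = df.values < s.values[:, None]

# Reattach the original labels to the boolean array.
result = pd.DataFrame(mask, index=df.index, columns=df.columns)
print(result)
```

Like the s.values approach, this compares purely by position and ignores the index of s.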
BENY, answered Oct 04 '22 06:10