
Comparing pandas.Series for equality when they are in different orders

Tags: python, pandas

Pandas automatically aligns the indices of Series objects before applying binary operators such as addition and subtraction, but it does not do so when checking for equality. Why is this, and how do I overcome it?

Consider the following example:

In [15]: x = pd.Series(index=["A", "B", "C"], data=[1,2,3])

In [16]: y = pd.Series(index=["C", "B", "A"], data=[3,2,1])

In [17]: x
Out[17]:
A    1
B    2
C    3
dtype: int64

In [18]: y
Out[18]:
C    3
B    2
A    1
dtype: int64

In [19]: x==y
Out[19]:
A    False
B     True
C    False
dtype: bool

In [20]: x-y
Out[20]:
A    0
B    0
C    0
dtype: int64

I am using pandas 0.12.0.

Daniel Fortunov asked Apr 10 '14 09:04



1 Answer

You can overcome it with:

In [5]: x == y.reindex(x.index)
Out[5]: 
A    True
B    True
C    True
dtype: bool

or

In [6]: x.sort_index() == y.sort_index()
Out[6]: 
A    True
B    True
C    True
dtype: bool
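If you need a single True/False rather than an element-wise result, the same two approaches can be reduced with `.all()` or `Series.equals`. This is only a minimal sketch using the x and y from the question; the variable names `aligned_eq`, `sorted_eq`, `all_equal` and `same` are mine, and it assumes both Series carry exactly the same set of unique labels (with missing labels, `reindex` would introduce NaN, which never compares equal).

```python
import pandas as pd

# Same data as in the question: identical labels, different order.
x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])

# Put y into x's label order, then compare element-wise.
aligned_eq = x == y.reindex(x.index)

# Or sort both by index so that the labels line up.
sorted_eq = x.sort_index() == y.sort_index()

# Collapse the element-wise result into one boolean.
all_equal = bool(aligned_eq.all())

# Series.equals compares values and index together, so sort first.
same = x.sort_index().equals(y.sort_index())

print(all_equal, same)
```

As far as I know, more recent pandas releases raise a ValueError for `x == y` when the indexes are not identically ordered instead of comparing by position, so aligning explicitly like this is still needed there.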

The 'why' is explained here: https://github.com/pydata/pandas/issues/1134#issuecomment-5347816

Update: there is an issue that discusses this (https://github.com/pydata/pandas/issues/1134) and a PR to fix it (https://github.com/pydata/pandas/pull/6860).

joris answered Oct 22 '22 08:10