Comparing pandas.Series for equality when they are in different orders

Tags: python, pandas

pandas automatically aligns the indices of Series objects before applying binary operators such as addition and subtraction, but it does not do so when checking for equality. Why is that, and how can I work around it?

Consider the following example:

In [15]: x = pd.Series(index=["A", "B", "C"], data=[1,2,3])

In [16]: y = pd.Series(index=["C", "B", "A"], data=[3,2,1])

In [17]: x
Out[17]:
A    1
B    2
C    3
dtype: int64

In [18]: y
Out[18]:
C    3
B    2
A    1
dtype: int64

In [19]: x==y
Out[19]:
A    False
B     True
C    False
dtype: bool

In [20]: x-y
Out[20]:
A    0
B    0
C    0
dtype: int64

I am using pandas 0.12.0.

asked Apr 10 '14 09:04

Daniel Fortunov

1 Answer

You can work around it by putting both Series in the same label order before comparing:

In [5]: x == y.reindex(x.index)
Out[5]: 
A    True
B    True
C    True
dtype: bool

or

In [6]: x.sort_index() == y.sort_index()
Out[6]: 
A    True
B    True
C    True
dtype: bool
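
If you need a single True/False rather than an element-wise result, the reindex approach above can be wrapped in a small function. This is only a sketch: `same_labelled_values` is a hypothetical name, not a pandas function, and it assumes unique index labels and no missing values (NaN never compares equal to NaN).

```python
import pandas as pd


def same_labelled_values(a: pd.Series, b: pd.Series) -> bool:
    """Hypothetical helper: True if a and b hold the same value under
    each label, regardless of the order the labels appear in."""
    # If the label sets differ, the Series cannot match; this check also
    # keeps reindex from silently introducing NaN for missing labels.
    if not a.index.sort_values().equals(b.index.sort_values()):
        return False
    # Reorder b to follow a's labels, then compare element-wise.
    return bool((a == b.reindex(a.index)).all())


x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])
print(same_labelled_values(x, y))
```

Reindexing one side keeps the original order of `x`, whereas sorting both sides changes the order of the result; which is preferable depends on what you do with the comparison afterwards.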

The 'why' is explained here: https://github.com/pydata/pandas/issues/1134#issuecomment-5347816

Update: there is an issue that discusses this (https://github.com/pydata/pandas/issues/1134) and a PR to fix it (https://github.com/pydata/pandas/pull/6860).
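
What `==` does with differently ordered labels therefore depends on your pandas version. As far as I know, later releases raise a ValueError for this case instead of comparing by position, and the `Series.eq` method aligns on labels before comparing. Treat both as assumptions to check against your installed version. A minimal sketch that copes with either behaviour:

```python
import pandas as pd

x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])

# Depending on the pandas version, == either compares by position
# (the 0.12.0 behaviour shown in the question) or raises ValueError
# because the labels are not in the same order.
try:
    positional = x == y
    raised = False
except ValueError:
    positional = None
    raised = True

# Assumption: Series.eq aligns the two Series on their labels first,
# so label order does not affect the comparison.
label_wise = x.eq(y)
all_equal = bool(label_wise.all())
print(raised, all_equal)
```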

answered Oct 22 '22 08:10

joris
