Pandas automatically aligns the indices of Series objects before applying binary operators such as addition and subtraction, but it does not do so when checking for equality. Why is this, and how do I overcome it?
Consider the following example:
In [15]: x = pd.Series(index=["A", "B", "C"], data=[1,2,3])
In [16]: y = pd.Series(index=["C", "B", "A"], data=[3,2,1])
In [17]: x
Out[17]:
A 1
B 2
C 3
dtype: int64
In [18]: y
Out[18]:
C 3
B 2
A 1
dtype: int64
In [19]: x==y
Out[19]:
A False
B True
C False
dtype: bool
In [20]: x-y
Out[20]:
A 0
B 0
C 0
dtype: int64
I am using pandas 0.12.0.
Two pandas Series can be compared element by element with the relational operators (==, !=, <, and so on); the result is a boolean Series of True/False values. You can also use the Series.equals() method to compare two Series as a whole.
Series.equals() tests two Series for equality. It returns True if the two Series have the same shape and elements, and False otherwise.
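As a minimal sketch of how Series.equals() behaves on the Series from the question (assuming a reasonably recent pandas release, not the 0.12.0 mentioned above): the method returns a single boolean and does not align the indices first, so the label order matters. The variable names same_as_is and same_after_reindex are only illustrative.

```python
import pandas as pd

x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])

# equals() compares the Series as they are, without aligning on labels,
# so the differing label order makes the two Series unequal.
same_as_is = x.equals(y)

# After reordering y to follow x's labels, both index and values match.
same_after_reindex = x.equals(y.reindex(x.index))

print(same_as_is, same_after_reindex)
```

So equals() is useful for a yes/no answer, but the index still has to be brought into the same order first if only the label-to-value mapping matters.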
You can overcome it with:
In [5]: x == y.reindex(x.index)
Out[5]:
A True
B True
C True
dtype: bool
or
In [6]: x.sort_index() == y.sort_index()
Out[6]:
A True
B True
C True
dtype: bool
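For reference, here is a self-contained sketch of the two workarounds as a script, using the same x and y as in the question. It also tries the Series.eq() flex method, which in recent pandas versions aligns on labels before comparing; that third option is an addition here, not part of the original answer, and its behavior on pandas 0.12.0 is not verified.

```python
import pandas as pd

x = pd.Series(index=["A", "B", "C"], data=[1, 2, 3])
y = pd.Series(index=["C", "B", "A"], data=[3, 2, 1])

# Option 1: reorder y to follow x's labels, then compare.
by_reindex = x == y.reindex(x.index)

# Option 2: sort both Series by label so the positions line up.
by_sort = x.sort_index() == y.sort_index()

# Option 3 (assumes a recent pandas): eq() aligns on labels itself.
by_eq = x.eq(y)

# Collapse each boolean Series to a single answer.
print(by_reindex.all(), by_sort.all(), by_eq.all())
```

Reindexing keeps the label order of x, while sorting reorders both operands; which is preferable depends on whether the original order of x needs to be preserved in the result.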
The 'why' is explained here: https://github.com/pydata/pandas/issues/1134#issuecomment-5347816
Update: there is an issue that discusses this (https://github.com/pydata/pandas/issues/1134) and a PR to fix it (https://github.com/pydata/pandas/pull/6860).