How do I compare two Python Pandas Series of different lengths?

Tags: python, pandas

I have two Series of different lengths, and I want to get the indices for which both the indices and the amount are the same in both series.

Here are the Series:

ipdb> s1
s1
000007720          2000.00
group1            -3732.05
group t3           2432.12
group2           -38147.87
FSHLAJ           -36711.09
EWkayuwo             -3.22
Name: amount, dtype: float64
ipdb> s2
s2
000007720                 2000.00
group1                   -3732.05
group z                  12390.00
group y                  68633.43
group x                     25.00
group w                   3913.00
group v                 -12750.50
group u                    -53.49
group t                  -7500.00
group s                  -1575.82
group r                    -10.00
group q                   1800.00
group p                  -4510.34
EWFhjkaQU                  455.96
group2                  -38147.87
FSHLAJ                  -36711.09
GEKWJ                        5.54
Name: amount, dtype: float64

When I try to compare them, I get:

ipdb> s1 == s2
*** ValueError: Series lengths must match to compare

How can I achieve my objective?

HaPsantran asked May 13 '15 12:05

1 Answer

You want to use isin:

In [121]:

s2[s2.isin(s1)]
Out[121]:
000007720     2000.00
group1       -3732.05
group2      -38147.87
FSHLAJ      -36711.09
Name: amount, dtype: float64

I don't know which way round you wanted to perform the comparison, so here is the other way:

In [122]:

s1[s1.isin(s2)]
Out[122]:
000007720     2000.00
group1       -3732.05
group2      -38147.87
FSHLAJ      -36711.09
Name: amount, dtype: float64

The problem with s1 == s2 is that an element-wise comparison is not defined for Series or arrays of different lengths.
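
To make the behaviour of isin concrete, here is a minimal sketch that rebuilds the two Series from the values posted in the question (typed in by hand, so treat the construction as an assumption about how the data was created). It illustrates that isin tests each value for membership in the other Series' values and does not look at the index labels at all; the relabeled copy is purely hypothetical and exists only to show that.

```python
import pandas as pd

# Values and labels copied from the question.
s1 = pd.Series(
    [2000.00, -3732.05, 2432.12, -38147.87, -36711.09, -3.22],
    index=["000007720", "group1", "group t3", "group2", "FSHLAJ", "EWkayuwo"],
    name="amount",
)
s2 = pd.Series(
    [2000.00, -3732.05, 12390.00, 68633.43, 25.00, 3913.00, -12750.50,
     -53.49, -7500.00, -1575.82, -10.00, 1800.00, -4510.34, 455.96,
     -38147.87, -36711.09, 5.54],
    index=["000007720", "group1", "group z", "group y", "group x", "group w",
           "group v", "group u", "group t", "group s", "group r", "group q",
           "group p", "EWFhjkaQU", "group2", "FSHLAJ", "GEKWJ"],
    name="amount",
)

# Boolean mask over s1: True where the value also occurs somewhere in s2.
mask = s1.isin(s2)
matches = s1[mask]
print(matches)

# Hypothetical relabeled copy of s1: same values, completely different labels.
relabeled = s1.rename(index=lambda label: "other_" + label)

# The same four values still match, because isin ignores the index.
print(int(relabeled.isin(s2).sum()))
```

Because the lengths never have to agree, this works for Series of any size, but a match here only means "this amount appears somewhere in the other Series".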

If you also want the indices to match, add that as a second condition:

In [131]:

s1[(s1.index.isin(s2.index)) & (s1.isin(s2))]
Out[131]:
000007720     2000.00
group1       -3732.05
group2      -38147.87
FSHLAJ      -36711.09
Name: amount, dtype: float64
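
One caveat worth sketching: the combined condition checks that a label exists somewhere in the other index and that a value exists somewhere in the other values, but not that the same label carries the same value. With the data in the question the two happen to coincide. If a stricter, label-by-label comparison is needed, one possible approach (an alternative, not what the expression above does) is to restrict both Series to their shared labels first. The Series a and b below are hypothetical illustration data, and the sketch assumes the index labels are unique.

```python
import pandas as pd

# Hypothetical data: label "y" exists in both, but with different amounts,
# and a's amount for "y" (2.0) appears in b under a different label ("w").
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"], name="amount")
b = pd.Series([1.0, 3.0, 2.0, 9.0], index=["x", "y", "w", "q"], name="amount")

# Condition from the answer: label found anywhere AND value found anywhere.
loose = a[(a.index.isin(b.index)) & (a.isin(b))]

# Stricter variant: keep only labels present in both Series, then compare
# the amounts label by label. Both sides now have identical labels, so ==
# no longer raises.
common = a.index.intersection(b.index)
same_amount = a.loc[common] == b.loc[common]
strict = a.loc[common][same_amount]

print(list(loose.index))
print(list(strict.index))
```

Here loose keeps both "x" and "y", while strict keeps only "x", since "y" maps to 2.0 in a but 3.0 in b.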
EdChum answered Sep 28 '22 13:09