
ValueError: Can only compare identically-labeled Series objects python

Tags: python, pandas

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]

When I execute the above command, I get the following error:

ValueError: Can only compare identically-labeled Series objects

What am I doing wrong?

The dtype of both columns is int64.

Sumukh, asked Jun 27 '18 16:06



1 Answer

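Pandas performs almost all of its operations with intrinsic data alignment: it matches values by index label, not by position. Comparison operators between two Series go one step further and require both indexes to carry the same labels, which is why this comparison raises instead of aligning.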
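As a minimal sketch of what alignment means here (the Series `a` and `b` are made-up illustration data, not from the question): arithmetic aligns on the index and fills unmatched labels with NaN, while a comparison between differently labeled Series raises.

```python
import pandas as pd

# Hypothetical data: same length, but the index labels only partly overlap.
a = pd.Series([1, 2, 3], index=[0, 1, 2])
b = pd.Series([10, 20, 30], index=[1, 2, 3])

# Arithmetic aligns on the index; labels found in only one Series give NaN.
total = a + b
print(total)

# Comparison does not align; differing labels raise a ValueError.
try:
    a != b
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

Only labels 1 and 2 exist in both Series, so those are the only positions in `total` that hold a number.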

You can avoid this error by converting one of the Series to a NumPy array with .values:

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]

Note, however, that this compares the two columns row by row, by position, with no index alignment, so both columns must have the same length.

MCVE:

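import numpy as np
import pandas as pd

# Two frames of the same length whose indexes share no labels
df1 = pd.DataFrame(np.arange(1,10), index=np.arange(1,10),columns=['A'])
df2 = pd.DataFrame(np.arange(11,20), index=np.arange(11,20),columns=['B'])

df1['A'] != df2['B']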

Output:

ValueError: Can only compare identically-labeled Series objects

Convert one side to a NumPy array:

df1['A'] != df2['B'].values

Output:

1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
9    True
Name: A, dtype: bool
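
If the goal is to keep the rows of df1 whose key does not appear anywhere in df2 (an assumption; the question does not state the intent), a positional comparison is not the right tool. A membership test with `isin` may fit better. A minimal sketch with made-up key values:

```python
import pandas as pd

# Hypothetical data; only the column name CUST_ACCT_KEY comes from the question.
df1 = pd.DataFrame({'CUST_ACCT_KEY': [1, 2, 3, 4]})
df2 = pd.DataFrame({'CUST_ACCT_KEY': [2, 4, 6]})

# isin tests membership by value, so neither index nor length has to match.
missing = df1.loc[~df1['CUST_ACCT_KEY'].isin(df2['CUST_ACCT_KEY'])]
print(missing)
```

Unlike the .values approach, this works when the two frames have different lengths.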

Scott Boston, answered Sep 20 '22 12:09