df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]
When I execute the above command, I get the following error:
ValueError: Can only compare identically-labeled Series objects
What am I doing wrong?
The dtypes of both columns are int64.
If you try to compare DataFrames with different indexes using the equality comparison operator ==, pandas raises ValueError: Can only compare identically-labeled DataFrame objects. If you only need to know whether the two objects are identical, use equals instead of ==. For example, df1.equals(df2) does not raise when the indexes differ; it simply returns False.
Reason for the error "Can only compare identically-labeled Series objects": it is a ValueError raised when you compare two Series (or two DataFrames, the pandas 2-D data structure) with ==, !=, or a similar operator and their index or column labels are not identical.
Pandas DataFrame equals() function: The equals() function is used to test whether two objects contain the same elements. This function allows two Series or DataFrames to be compared against each other to see if they have the same shape and elements. NaNs in the same location are considered equal.
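To make the difference between == and equals() concrete, here is a minimal sketch. The Series s1 and s2 are made-up illustration data, not the asker's columns.

```python
import pandas as pd

# Illustration data: same values, different index labels
s1 = pd.Series([1, 2, 3], index=[0, 1, 2])
s2 = pd.Series([1, 2, 3], index=[10, 11, 12])

try:
    s1 == s2
    raised = False
except ValueError:
    # == refuses to compare Series whose labels differ
    raised = True

# equals() returns a single bool and never raises for mismatched labels
same_labels = s1.equals(pd.Series([1, 2, 3], index=[0, 1, 2]))
diff_labels = s1.equals(s2)

print(raised, same_labels, diff_labels)
```

Note that equals() answers "are these two objects identical?" with one True or False, so it is not a drop-in replacement when you need an element-wise mask for filtering.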
The compare method in pandas shows the differences between two DataFrames. It compares them element by element and presents the differing values side by side. The compare method can only compare DataFrames of the same shape with identical row and column labels.
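A small sketch of compare(), assuming pandas 1.1 or later (where DataFrame.compare is available). The frames left, right, and shifted are hypothetical examples.

```python
import pandas as pd

# Hypothetical frames with identical labels and two differing cells
left = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
right = pd.DataFrame({"A": [1, 9, 3], "B": [4, 5, 7]})

# Only rows and columns that differ are kept; each column is split into
# a "self" (left) and an "other" (right) sub-column
diff = left.compare(right)
print(diff)

# With mismatched row labels, compare() raises the same kind of ValueError
shifted = right.copy()
shifted.index = [10, 11, 12]
try:
    left.compare(shifted)
    compare_raised = False
except ValueError:
    compare_raised = True
```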
Pandas does almost all of its operations with intrinsic data alignment, meaning it uses index labels to match values before comparing them or performing other operations.
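Alignment is easiest to see with arithmetic, which aligns silently instead of raising. A minimal sketch with made-up Series:

```python
import pandas as pd

# Made-up Series whose labels only partly overlap
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Values are matched by label, not by position; labels present in only
# one of the two Series produce NaN in the result
total = s1 + s2
print(total)
```

Comparison operators such as == and != are stricter: rather than aligning and filling with NaN, they raise the ValueError from the question when the labels are not identical.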
You could avoid this error by converting one of the Series to a NumPy array using .values:
df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]
However, this compares row to row by position, with no index alignment, so both columns must have the same number of rows.
MCVE:
import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])
df1['A'] != df2['B']
Output:
ValueError: Can only compare identically-labeled Series objects
Change to numpy array:
df1['A'] != df2['B'].values
Output:
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True
Name: A, dtype: bool
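Applied to the filter from the question, the same idea could look like the sketch below. The CUST_ACCT_KEY values and index labels are invented for illustration, and the approach assumes both frames have the same number of rows in a meaningful order.

```python
import pandas as pd

# Invented data: same number of rows, different index labels
df1 = pd.DataFrame({"CUST_ACCT_KEY": [101, 102, 103]}, index=[0, 1, 2])
df2 = pd.DataFrame({"CUST_ACCT_KEY": [101, 999, 103]}, index=[5, 6, 7])

# .values drops df2's labels, so row i of df1 is compared with row i of df2
mask = df1["CUST_ACCT_KEY"] != df2["CUST_ACCT_KEY"].values

# Keep the rows of df1 whose key differs from the key at the same position
df = df1.loc[mask]
print(df)
```

Because the comparison is purely positional, it only makes sense when the two frames are sorted the same way.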