How to ignore index comparison for pandas assert_frame_equal

Tags:

python

pandas

I am trying to compare the two DataFrames below with "check_index_type" set to False. According to the documentation, when it is set to False it shouldn't "check the Index class, dtype and inferred_type are identical". Did I misunderstand the documentation? How can I compare the frames while ignoring the index, so that the test below passes?

I know I can reset the index but prefer not to.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.testing.assert_frame_equal.html

from pandas.testing import assert_frame_equal
import pandas as pd

d1 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 2])
d2 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1])
assert_frame_equal(d1, d2, check_index_type=False)

AssertionError: DataFrame.index are different

DataFrame.index values are different (50.0 %)
[left]:  Int64Index([0, 2], dtype='int64')
[right]: Int64Index([0, 1], dtype='int64')
Lisa asked Aug 02 '18 14:08


2 Answers

The index is part of a DataFrame, so if the indexes differ, the DataFrames are considered different even when their values are the same. If you only want to compare the values, use array_equal from NumPy:

import numpy as np
import pandas as pd

d1 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 2])
d2 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1])
np.array_equal(d1.values, d2.values)
Out[759]: True
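One caveat worth keeping in mind: comparing the raw arrays ignores the column labels as well as the index. The sketch below uses two illustrative frames, `a` and `b` (not from the question), whose columns are named differently, and checks the column labels separately.

```python
import numpy as np
import pandas as pd

# Same values, but different index AND different column labels.
a = pd.DataFrame([[1, 2], [10, 20]], index=[0, 2], columns=["x", "y"])
b = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1], columns=["p", "q"])

# The underlying arrays hold only the values, so every label is ignored.
values_equal = np.array_equal(a.to_numpy(), b.to_numpy())

# Compare the column labels on their own if they matter to the test.
columns_equal = a.columns.equals(b.columns)
```

Here the values match while the column labels do not, so a test that relies only on array_equal would not notice renamed columns.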

For more information, see the source of assert_frame_equal in the pandas Git repository.
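As far as I can tell, check_index_type only relaxes the comparison of the index type (class, dtype and inferred_type); the index labels themselves are still compared. A minimal sketch of the difference, assuming a recent pandas version; the helper `frames_match` and the frames `left` and `right` are made up for illustration:

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_match(a, b, **kwargs):
    """Hypothetical helper: True if assert_frame_equal passes, False if it raises."""
    try:
        assert_frame_equal(a, b, **kwargs)
    except AssertionError:
        return False
    return True


# Same labels (0 and 1), stored as integers in one frame and floats in the other.
left = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1])
right = pd.DataFrame([[1, 2], [10, 20]], index=[0.0, 1.0])

# With the default setting the differing index dtypes are reported.
type_checked = frames_match(left, right)

# With check_index_type=False only the labels are compared, and they agree.
type_ignored = frames_match(left, right, check_index_type=False)

# The frames from the question have different labels (0, 2 versus 0, 1),
# so they still differ even with the type check switched off.
d1 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 2])
d2 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1])
labels_differ = not frames_match(d1, d2, check_index_type=False)
```

So the option helps when the labels are the same but stored with a different type, which is not the situation in the question.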

BENY answered Sep 21 '22 08:09


If you really don't care about the index being equal, you can drop the index as follows:

assert_frame_equal(d1.reset_index(drop=True), d2.reset_index(drop=True)) 
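Note that reset_index returns a new DataFrame by default, so the original frames keep their index. If this comes up in several tests, a small wrapper keeps the call sites tidy. The name `assert_frame_equal_ignore_index` below is hypothetical and not part of pandas; treat it as a sketch.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def assert_frame_equal_ignore_index(left, right, **kwargs):
    """Compare two frames row by row, ignoring their row labels.

    Hypothetical helper, not part of pandas. Extra keyword arguments are
    passed through to assert_frame_equal.
    """
    assert_frame_equal(
        left.reset_index(drop=True),
        right.reset_index(drop=True),
        **kwargs,
    )


d1 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 2])
d2 = pd.DataFrame([[1, 2], [10, 20]], index=[0, 1])

# Passes: both frames are compared with a fresh 0..n-1 index.
assert_frame_equal_ignore_index(d1, d2)

# The wrapper worked on copies, so d1 still carries its original labels.
original_index = list(d1.index)
```

Unlike comparing raw arrays, this still checks column labels and dtypes, since everything except the row index goes through assert_frame_equal.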
The Aelfinn answered Sep 19 '22 08:09