If I want to drop rows with duplicated index values in a DataFrame, the following doesn't work, for obvious reasons:
myDF.drop_duplicates(cols=index)
and
myDF.drop_duplicates(cols='index')
looks for a column named 'index'.
To drop the duplicated index values I have to do:
myDF['index'] = myDF.index
myDF = myDF.drop_duplicates(cols='index')
myDF.index = myDF['index']
myDF = myDF.drop('index', axis=1)
Is there a more efficient way?
Example #1: Use the Index.drop_duplicates() function to drop every occurrence of a duplicate value in the Index except the first.
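A minimal sketch of that call, assuming a small made-up Index (the labels are illustrative only). Note that this deduplicates the Index object itself; it does not filter the rows of a DataFrame.

```python
import pandas as pd

# Hypothetical index with repeated labels
idx = pd.Index(["a", "b", "a", "c", "b"])

# keep="first" is the default: later repeats of a label are dropped
unique_first = idx.drop_duplicates(keep="first")
print(unique_first.tolist())  # ['a', 'b', 'c']
```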
Drop duplicates and reset the index: if we need to reset the index of the resulting DataFrame, we can use the ignore_index parameter of DataFrame.drop_duplicates(). If ignore_index=True, the row labels of the resulting DataFrame are reset to 0, 1, …, n – 1.
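A short sketch of ignore_index, assuming pandas 1.0 or later (where the parameter is available) and a made-up frame. Keep in mind that DataFrame.drop_duplicates() compares column values, not index labels, so on its own it does not answer the question about duplicated index entries.

```python
import pandas as pd

# Hypothetical data: row 2 repeats the values of row 0
df = pd.DataFrame({"key": ["x", "y", "x", "z"], "val": [1, 2, 1, 3]})

# Default: surviving rows keep their original labels (0, 1, 3)
deduped = df.drop_duplicates()

# ignore_index=True: surviving rows are relabelled 0, 1, 2
renumbered = df.drop_duplicates(ignore_index=True)

print(deduped.index.tolist())     # [0, 1, 3]
print(renumbered.index.tolist())  # [0, 1, 2]
```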
Simply: DF.groupby(DF.index).first()
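One caveat worth checking before relying on this: first() returns the first non-null value per column within each group, so it can stitch together values from different rows. A small sketch with made-up data, compared against a plain boolean mask on the index:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: label "x" appears twice, and its first row has a NaN
df = pd.DataFrame(
    {"a": [np.nan, 2.0, 3.0], "b": [10, 20, 30]},
    index=["x", "x", "y"],
)

# groupby + first: per column, the first non-null value in each group
by_group = df.groupby(df.index).first()

# Boolean mask: keeps the first row per label exactly as it is
by_mask = df[~df.index.duplicated(keep="first")]

print(by_group.loc["x", "a"])  # 2.0 (taken from the second "x" row)
print(by_mask.loc["x", "a"])   # nan (the first "x" row, untouched)
```

If the data has no missing values the two give the same rows, although groupby also sorts the result by index label by default.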
The 'duplicated' method works on a DataFrame, a Series, and an Index. Just select the rows that aren't marked as having a duplicate index:
df[~df.index.duplicated()]
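The keep argument of duplicated() controls which occurrence survives. A quick sketch on a made-up frame (the labels and values are for illustration only):

```python
import pandas as pd

# Hypothetical frame with the label "a" appearing twice
df = pd.DataFrame({"val": [1, 2, 3, 4]}, index=["a", "b", "a", "c"])

# Keep the first row for each label (the default)
first_kept = df[~df.index.duplicated(keep="first")]

# Keep the last row for each label
last_kept = df[~df.index.duplicated(keep="last")]

# Drop every row whose label occurs more than once
none_kept = df[~df.index.duplicated(keep=False)]

print(first_kept["val"].tolist())  # [1, 2, 4]
print(last_kept["val"].tolist())   # [2, 3, 4]
print(none_kept["val"].tolist())   # [2, 4]
```

Unlike the round trip through a temporary 'index' column, this leaves the columns alone and preserves the original row order.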