I can find search results that explain the cause of this error message, but not how to fix it.
I run into this problem every time I try to add a new column to a pandas DataFrame by concatenating the string values of two existing columns.
For instance:
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
It works if the two items joined with ' ' are each a separate DataFrame/Series.
The goal is to merge date and time into one column so that pandas recognizes the values as datetime stamps.
I am not sure whether I am using the command incorrectly or whether pandas itself is internally limited, as it keeps returning the duplicate axis error message. I understand the latter is highly unlikely, haha.
Is there a quick and easy way around this?
I thought adding, subtracting and similar operations between column values in a DataFrame would be straightforward. Showing the result in the table shouldn't be too hard either, right?
To make sure a DataFrame cannot contain duplicate index labels, set its allows_duplicate_labels flag to False, for example with df.set_flags(allows_duplicate_labels=False). Pandas then raises an error instead of accepting duplicate labels.
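As a rough illustration of that flag, the sketch below applies it to a small frame whose index already repeats a label. The frame `temp` and its values are made up for the example, and the flag requires a pandas version that provides `set_flags` (1.2 or later).

```python
import pandas as pd
from pandas.errors import DuplicateLabelError

# Illustrative frame whose index repeats the label 1.
temp = pd.DataFrame({"stamp": ["1", "2", "3"]}, index=[0, 1, 1])

try:
    # Disallowing duplicate labels on a frame that already has them is rejected.
    temp.set_flags(allows_duplicate_labels=False)
    rejected = False
except DuplicateLabelError:
    rejected = True

print(rejected)
```

The flag does not repair anything by itself; it only makes the duplicate labels surface early instead of at the later assignment.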
In pandas, you usually get ValueError: cannot reindex from a duplicate axis when you assign to an index, or reindex or resample a DataFrame (for example with the reindex method), while the index contains duplicate values. The error message "cannot reindex from a duplicate axis" means that the DataFrame has duplicate index values.
Dropping a pandas index column using reset_index: the most straightforward way to drop a DataFrame index is the .reset_index() method. By default, the method resets the index to the integers 0 through len(df)-1 and keeps the old index as a new column; pass drop=True to discard the old index instead.
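A minimal sketch of the two reset_index behaviours, using a made-up frame with a duplicated index label (the names `temp`, `kept` and `dropped` are only for illustration):

```python
import pandas as pd

# Illustrative frame whose index repeats the label 1.
temp = pd.DataFrame({"stamp": ["1", "2", "3"]}, index=[0, 1, 1])

# Default: the old index is kept as a new column named "index".
kept = temp.reset_index()

# drop=True: the old index is discarded entirely.
dropped = temp.reset_index(drop=True)

print(list(kept.columns))     # ['index', 'stamp']
print(list(dropped.columns))  # ['stamp']
print(list(dropped.index))    # [0, 1, 2]
```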
Index.duplicated indicates duplicate index values. Duplicated values are marked as True in the resulting array. Either all duplicates, all except the first, or all except the last occurrence can be marked; the keep parameter ('first', 'last' or False) selects which occurrence in each set of duplicates is left unmarked.
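To check whether an index is the culprit, something along these lines can help. The index values are invented for the example:

```python
import pandas as pd

# Illustrative index that repeats the label 1.
idx = pd.Index([0, 1, 1])

print(idx.is_unique)  # False

# keep="first": every repeat after the first occurrence is marked.
first = idx.duplicated(keep="first").tolist()  # [False, False, True]

# keep="last": every repeat before the last occurrence is marked.
last = idx.duplicated(keep="last").tolist()    # [False, True, False]

# keep=False: all members of a duplicated set are marked.
all_marked = idx.duplicated(keep=False).tolist()  # [False, True, True]
```

On a DataFrame the same check would be df.index.duplicated(...), and df[df.index.duplicated(keep=False)] shows the affected rows.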
Pandas aligns values by index label when you combine or assign series, so duplicated labels leave it unable to decide which rows belong together. Your data currently has duplicated index labels.
If you are certain that your series are aligned by position, you can call reset_index on each dataframe:
import pandas as pd
wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['2018-01-01', '2018-02-01', '2018-03-01']})
temp = pd.DataFrame({'stamp': ['1', '2', '3']}, index=[0, 1, 1])
# ATTEMPT 1: FAIL
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
# ValueError: cannot reindex from a duplicate axis
# ATTEMPT 2: SUCCESS
wind = wind.reset_index(drop=True)
temp = temp.reset_index(drop=True)
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
print(wind)
   DATE (MM/DD/YYYY)     timestamp
0         2018-01-01  2018-01-01 1
1         2018-02-01  2018-02-01 2
2         2018-03-01  2018-03-01 3
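Since the stated goal is a column that pandas treats as datetime, the combined strings can then be parsed with pd.to_datetime. The sketch below is only an assumption about the data: the sample values and the format string "%m/%d/%Y %H:%M" are hypothetical, because the question does not show what the real date and time strings look like.

```python
import pandas as pd

# Hypothetical sample values; the real column contents are not shown in the question.
wind = pd.DataFrame({"DATE (MM/DD/YYYY)": ["01/15/2018", "01/16/2018"]})
temp = pd.DataFrame({"stamp": ["08:30", "09:45"]})

# Both frames carry the default 0..n-1 index here, so the rows align by label.
combined = wind["DATE (MM/DD/YYYY)"] + " " + temp["stamp"]

# Assumed format; adjust it to match the actual strings.
wind["timestamp"] = pd.to_datetime(combined, format="%m/%d/%Y %H:%M")

print(wind["timestamp"].dtype)
```

Passing an explicit format avoids ambiguity between month-first and day-first dates.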