Can anyone help explain why I get errors for some operations and not others when there is a duplicate column in a pandas.DataFrame?
Minimal, Reproducible Example
import pandas as pd
df = pd.DataFrame(columns=['a', 'b', 'b'])
If I try to insert a list into column 'a', I get an error about a dimension mismatch:
df.loc[:, 'a'] = list(range(5))
Traceback (most recent call last):
...
ValueError: cannot copy sequence with size 5 to array axis with dimension 0
Similarly with 'b':
df.loc[:, 'b'] = list(range(5))
Traceback (most recent call last):
...
ValueError: could not broadcast input array from shape (5) into shape (0,2)
However, if I insert into an entirely new column, I don't get an error. Inserting into 'a' or 'b' afterwards still fails:
df.loc[:, 'c'] = list(range(5))
print(df)
     a    b    b  c
0  NaN  NaN  NaN  0
1  NaN  NaN  NaN  1
2  NaN  NaN  NaN  2
3  NaN  NaN  NaN  3
4  NaN  NaN  NaN  4
df.loc[:, 'a'] = list(range(5))
Traceback (most recent call last):
...
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)
All of these errors disappear if I remove the duplicate column 'b'.
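For reference, a minimal sketch of one way to detect and drop the repeated label, assuming that keeping the first occurrence of each column name is acceptable. The names dup_mask and deduped are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(columns=['a', 'b', 'b'])

# Index.duplicated() marks every repeat of a label that has already appeared.
dup_mask = df.columns.duplicated()

# Keep only the first occurrence of each column label.
deduped = df.loc[:, ~dup_mask]

print(list(deduped.columns))
```

This compares column labels only, not the data stored in the columns.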
Additional information
pandas==1.0.2
To drop duplicate columns from a pandas DataFrame, use df.T.drop_duplicates().T. This removes every column whose data repeats an earlier column, regardless of the column names.
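A small sketch of the transpose approach, using a made-up frame (the column names x, y and z are hypothetical) in which 'y' holds the same data as 'x':

```python
import pandas as pd

# Hypothetical data: 'y' repeats the contents of 'x' under a different name.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [1, 2, 3], 'z': [4, 5, 6]})

# Transpose so columns become rows, drop the repeated rows, transpose back.
unique_cols = df.T.drop_duplicates().T

print(list(unique_cols.columns))
```

Note that transposing a frame with mixed column types may change the dtypes, so this is probably best suited to small frames with a single dtype.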
To find duplicate columns, iterate over the columns of the DataFrame and, for each one, check whether any later column has the same contents. If so, store that later column's name in a set of duplicate columns.
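The loop described above could look roughly like this. The helper name find_duplicate_columns and the sample data are assumptions made for illustration, and columns are accessed by position so the function also copes with repeated labels:

```python
import pandas as pd


def find_duplicate_columns(frame):
    """Return the names of columns whose contents repeat an earlier column."""
    duplicate_names = set()
    seen_positions = set()
    n_cols = frame.shape[1]
    for i in range(n_cols):
        if i in seen_positions:
            continue  # already known to duplicate an earlier column
        left = frame.iloc[:, i]
        for j in range(i + 1, n_cols):
            # Series.equals compares values and index, treating NaNs as equal.
            if j not in seen_positions and left.equals(frame.iloc[:, j]):
                seen_positions.add(j)
                duplicate_names.add(frame.columns[j])
    return duplicate_names


# Hypothetical data: 'y' repeats the contents of 'x'.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [1, 2, 3], 'z': [4, 5, 6]})
print(find_duplicate_columns(df))
```

The comparison is pairwise, so the cost grows with the square of the number of columns.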
A pandas Series (a single column) has a unique() method that returns only the distinct values in that column, for example the distinct values of a FirstName column. To get the distinct values across several columns, combine the desired columns into one Series with pandas.concat() and then call unique() on the result.
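A short sketch of the concat-then-unique idea. The sample frame is invented for illustration, and the LastName column is an assumption (only FirstName is mentioned above):

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    'FirstName': ['Ann', 'Bob', 'Ann'],
    'LastName': ['Lee', 'Ann', 'Kim'],
})

# Distinct values within a single column.
first_names = df['FirstName'].unique()

# Stack the chosen columns into one Series, then take the distinct values.
combined = pd.concat([df['FirstName'], df['LastName']], ignore_index=True)
all_names = combined.unique()

print(list(first_names))
print(list(all_names))
```

unique() returns values in order of first appearance, so the combined result lists the FirstName values before any new LastName values.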
Why use loc and not just:
df['a'] = list(range(5))
This gives no error and seems to produce what you need:
   a    b    b
0  0  NaN  NaN
1  1  NaN  NaN
2  2  NaN  NaN
3  3  NaN  NaN
4  4  NaN  NaN
The same works for creating column 'c':
df['c'] = list(range(5))