I have a fruit dataset that contains the columns Name, Colour, Weight, Size and Seeds.
Fruit dataset
Name Colour Weight Size Seeds Unnamed
Apple Apple Red 10.0 Big Yes
Apple Apple Red 5.0 Small Yes
Pear Pear Green 11.0 Big Yes
Banana Banana Yellow 4.0 Small Yes
Orange Orange Orange 5.0 Small Yes
The problem is that the Colour column duplicates Name, and the remaining values are shifted one column to the right. This creates a useless column (Unnamed) holding the values that belong to Seeds. Is there an easy way to remove the duplicated values in Colour and shift the rest of the values one column to the left, from Weight onwards? I hope I am not confusing anyone here.
Desired result
Fruit dataset
Name Colour Weight Size Seeds Unnamed (will be dropped)
Apple Red 10.0 Big Yes
Apple Red 5.0 Small Yes
Pear Green 11.0 Big Yes
Banana Yellow 4.0 Small Yes
Orange Orange 5.0 Small Yes
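You can do it this way: save all column names except the last one, drop the duplicated Colour column, then assign the saved names to the remaining columns.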
In [23]: df
Out[23]:
Name Colour Weight Size Seeds Unnamed
0 Apple Apple Red 10.0 Big Yes
1 Apple Apple Red 5.0 Small Yes
2 Pear Pear Green 11.0 Big Yes
3 Banana Banana Yellow 4.0 Small Yes
4 Orange Orange Orange 5.0 Small Yes
In [24]: cols = df.columns[:-1]
In [25]: cols
Out[25]: Index(['Name', 'Colour', 'Weight', 'Size', 'Seeds'], dtype='object')
In [26]: df = df.drop('Colour', axis=1)
In [27]: df.columns = cols
In [28]: df
Out[28]:
Name Colour Weight Size Seeds
0 Apple Red 10.0 Big Yes
1 Apple Red 5.0 Small Yes
2 Pear Green 11.0 Big Yes
3 Banana Yellow 4.0 Small Yes
4 Orange Orange 5.0 Small Yes
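For anyone who wants to run this outside an interactive session, the same steps can be sketched as a standalone script. The DataFrame below is retyped by hand from the question, and the helper name `realign` and its `dup_col` parameter are hypothetical names used only for illustration. It assumes the stray column is always the last one.

```python
import pandas as pd

# Misaligned data retyped from the question (an assumption about how it loads).
df = pd.DataFrame(
    [
        ["Apple", "Apple", "Red", 10.0, "Big", "Yes"],
        ["Apple", "Apple", "Red", 5.0, "Small", "Yes"],
        ["Pear", "Pear", "Green", 11.0, "Big", "Yes"],
        ["Banana", "Banana", "Yellow", 4.0, "Small", "Yes"],
        ["Orange", "Orange", "Orange", 5.0, "Small", "Yes"],
    ],
    columns=["Name", "Colour", "Weight", "Size", "Seeds", "Unnamed"],
)


def realign(frame, dup_col="Colour"):
    """Drop the duplicated column and shift the headers back into place.

    Hypothetical helper: assumes the last column is the stray one.
    """
    # Keep every header except the last (stray) one.
    cols = frame.columns[:-1]
    # drop() returns a new frame, so the input is left untouched.
    fixed = frame.drop(columns=dup_col)
    # The remaining columns now match the saved headers one to one.
    fixed.columns = cols
    return fixed


fixed = realign(df)
print(fixed)
```

Using `drop(columns=...)` is equivalent to `drop(..., axis=1)`; the keyword form avoids relying on the positional axis argument, which newer pandas versions no longer accept.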