How to add dummies to Pandas DataFrame?

I have a DataFrame, data_df, that looks like:

   price vehicleType  yearOfRegistration    gearbox  powerPS  model  kilometer fuelType       brand notRepairedDamage  postalCode
0  18300       coupe                2011    manuell      190    NaN     125000   diesel        audi                ja       66954
1   9800         suv                2004  automatik      163  grand     125000   diesel        jeep               NaN       90480
2   1500  kleinwagen                2001    manuell       75   golf     150000   benzin  volkswagen              nein       91074
3   3600  kleinwagen                2008    manuell       69  fabia      90000   diesel       skoda              nein       60437
4    650   limousine                1995    manuell      102    3er     150000   benzin         bmw                ja       33775

I tried to convert a categorical column (vehicleType) to dummies (one-hot encoding):

columns = [ 'vehicleType' ] #, 'gearbox', 'model', 'fuelType', 'brand', 'notRepairedDamage' ]
for column in columns:
  dummies = pd.get_dummies(data_df[column], prefix=column)
  data_df.drop(columns=[column], inplace=True)
  data_df = data_df.add(dummies, axis='columns')

But the original data is lost, and every value in the result is NaN:

  brand fuelType gearbox  kilometer model notRepairedDamage  ...  vehicleType_coupe  vehicleType_kleinwagen  vehicleType_kombi  vehicleType_limousine  vehicleType_suv  yearOfRegistration
0   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
1   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
2   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
3   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN
4   NaN      NaN     NaN        NaN   NaN               NaN  ...                NaN                     NaN                NaN                    NaN              NaN                 NaN

So, how do I replace a given column with its dummies?

asked Feb 19 '19 01:02

B Seven

1 Answer

import pandas as pd

# Get the one-hot encoding of the 'vehicleType' column
one_hot = pd.get_dummies(data_df['vehicleType'])
# Drop the original column, as it is now encoded
data_df = data_df.drop('vehicleType', axis=1)
# Join the encoded columns back onto the frame (aligned on the row index)
data_df = data_df.join(one_hot)
data_df
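
As a rough sketch of why the attempt in the question produces only NaN: DataFrame.add is element-wise arithmetic that aligns on both row and column labels, so columns present in only one frame come out as NaN. The small frame below is a made-up stand-in for data_df (a subset of the question's columns), and the add demonstration is restricted to the numeric price column to keep it simple. It also shows that pd.get_dummies accepts a columns= argument, which should replace the listed columns in a single call.

```python
import pandas as pd

# Hypothetical stand-in for data_df, using a few columns from the question
data_df = pd.DataFrame({
    "price": [18300, 9800, 1500, 3600, 650],
    "vehicleType": ["coupe", "suv", "kleinwagen", "kleinwagen", "limousine"],
    "gearbox": ["manuell", "automatik", "manuell", "manuell", "manuell"],
})

dummies = pd.get_dummies(data_df["vehicleType"], prefix="vehicleType")

# add() is arithmetic, not concatenation: it aligns on column labels,
# and these two frames share none, so every cell in the result is NaN.
added = data_df[["price"]].add(dummies, axis="columns")
all_nan = bool(added.isna().all().all())
print(all_nan)

# Passing columns= encodes the listed columns and drops the originals,
# prefixing each dummy with the source column name.
encoded = pd.get_dummies(data_df, columns=["vehicleType", "gearbox"])
print(encoded.columns.tolist())
```

Note that get_dummies skips missing values by default, so rows with NaN in a column such as notRepairedDamage get all-zero dummies unless dummy_na=True is passed.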

answered Sep 21 '22 17:09

user2510479

Tags: python, pandas, dataframe, machine-learning
