I am trying to convert categorical values into binary values using pandas. The idea is to treat every unique categorical value as a feature (i.e. a column) and put 1 or 0 depending on whether a particular object (i.e. row) belongs to that category. This is my code:
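import numpy
import pandas as pd
# DV is presumably scikit-learn's DictVectorizer
from sklearn.feature_extraction import DictVectorizer as DV

data = pd.read_csv('somedata.csv')
converted_val = data.T.to_dict().values()  # one dict per row
vectorizer = DV( sparse = False )
vec_x = vectorizer.fit_transform( converted_val )
numpy.savetxt('out.csv',vec_x,fmt='%10.0f',delimiter=',')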
My question is: how can I save this converted data together with the column names? In the code above I can save the data using the numpy.savetxt function, but this writes only the array, so the column names are lost. Alternatively, is there a more efficient way to perform this operation?
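For illustration, here is a minimal sketch of one possible approach, assuming the goal is plain one-hot encoding: pandas' own get_dummies builds one 0/1 column per unique value and keeps the column names, and DataFrame.to_csv writes them as a header row. The small DataFrame and its column names ("color", "size") are hypothetical stand-ins for the contents of somedata.csv, which are not shown in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for pd.read_csv('somedata.csv').
data = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": ["S", "M", "L"],
})

# One 0/1 column per unique categorical value, named "<column>_<value>".
encoded = pd.get_dummies(data, dtype=int)

# to_csv writes the column names as a header row,
# unlike numpy.savetxt on a bare array.
encoded.to_csv("out.csv", index=False)
```

If the DictVectorizer route is kept instead, recent scikit-learn versions (1.0 and later) expose the generated column names through vectorizer.get_feature_names_out(), which could be passed as the columns of a DataFrame wrapped around vec_x before saving.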