I have a pandas DataFrame whose numbers are stored as strings:
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('ABC'), index=['1', '2', '3', '4'])
frame = frame.apply(lambda x: x.astype(str))
This gives me a dataframe:
       A      B      C
1 -0.890  0.162  0.477
2 -1.403  0.160 -0.570
3 -1.062 -0.577 -0.370
4  1.142  0.072 -1.732
If I check frame.dtypes, every column is object. Now I want to convert columns 'B' through 'C' to numbers.
Imagine that I have dozens of columns and therefore I would like to slice them. So what I do is:
frame.loc[:,'B':'C'] = frame.loc[:,'B':'C'].apply(lambda x: pd.to_numeric(x, errors='coerce'))
If I just wanted to alter a single column, say B, I would type:
frame['B'] = frame['B'].apply(lambda x: pd.to_numeric(x, errors='coerce'))
and that would convert B to float64. BUT if I do the same with .loc, nothing has changed when I call DataFrame.info() afterwards!
Can someone help me? Of course I could just type out all the columns, but I would like a more practical approach.
You can pass keyword arguments directly to apply instead of wrapping pd.to_numeric in a lambda.
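To illustrate the point about keyword arguments, here is a minimal sketch on a tiny made-up frame (the `demo` frame and its values are hypothetical, not taken from the question). Extra keywords given to DataFrame.apply are forwarded to the function, so the lambda wrapper should not be needed:

```python
import pandas as pd

# Hypothetical two-column frame of numeric strings; 'oops' cannot be parsed.
demo = pd.DataFrame({'B': ['1.5', 'oops'], 'C': ['2', '3.25']})

# Lambda form, as written in the question.
with_lambda = demo.apply(lambda col: pd.to_numeric(col, errors='coerce'))

# Same conversion, with errors='coerce' forwarded by apply to pd.to_numeric.
with_kwargs = demo.apply(pd.to_numeric, errors='coerce')

print(with_kwargs.dtypes)
print(with_lambda.equals(with_kwargs))
```

With errors='coerce', a value that cannot be parsed becomes NaN rather than raising an error.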
assign
frame.assign(**frame.loc[:, 'B':'C'].apply(pd.to_numeric, errors='coerce'))
                 A         B         C
1   -1.50629471392 -0.578600  1.651437
2   -2.42667924339 -0.428913  1.265936
3  -0.866740402265 -0.678886 -0.094709
4    1.49138962612 -0.638902 -0.443982
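Note that assign returns a new DataFrame and leaves frame itself untouched, so the result has to be bound to a name. A minimal sketch, assuming a frame built like the one in the question (the name `converted` is made up for illustration):

```python
import numpy as np
import pandas as pd

# Rebuild a frame like the one in the question: random floats stored as strings.
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('ABC'),
                     index=['1', '2', '3', '4']).astype(str)

# Convert the sliced columns, then unpack them as keyword arguments to assign.
# This relies on the column labels being valid string keywords ('B', 'C').
numeric_part = frame.loc[:, 'B':'C'].apply(pd.to_numeric, errors='coerce')
converted = frame.assign(**numeric_part)

print(frame.dtypes)      # original frame, unchanged
print(converted.dtypes)  # B and C are now numeric
```

To keep the result under the original name, write frame = frame.assign(...).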
update
frame.update(frame.loc[:, 'B':'C'].apply(pd.to_numeric, errors='coerce'))
frame
                 A         B         C
1   -1.50629471392 -0.578600  1.651437
2   -2.42667924339 -0.428913  1.265936
3  -0.866740402265 -0.678886 -0.094709
4    1.49138962612 -0.638902 -0.443982
You can generate a list of columns as follows:
In [96]: cols = frame.columns.to_series().loc['B':'C'].tolist()
and use this variable for selecting "columns of interest":
In [97]: frame[cols] = frame[cols].apply(lambda x: pd.to_numeric(x, errors='coerce'))
In [98]: frame.dtypes
Out[98]:
A object
B float64
C float64
dtype: object
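For reference, here is the same idea as one self-contained script. It is a sketch only: `cols_alt` is a hypothetical name, and building the list with frame.loc[:, 'B':'C'].columns is an alternative spelling that is not part of the answer above but should select the same labels.

```python
import numpy as np
import pandas as pd

# Rebuild a frame like the one in the question: random floats stored as strings.
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('ABC'),
                     index=['1', '2', '3', '4']).astype(str)

# Label slice over the column index, as in the answer above.
cols = frame.columns.to_series().loc['B':'C'].tolist()

# Alternative spelling (an assumption of this sketch, not from the answer).
cols_alt = frame.loc[:, 'B':'C'].columns.tolist()

# Assigning through frame[cols] replaces the columns, so the numeric dtype is kept.
frame[cols] = frame[cols].apply(pd.to_numeric, errors='coerce')

print(cols, cols_alt)
print(frame.dtypes)
```

The difference from the .loc attempt in the question is that frame[cols] = ... swaps in new columns rather than writing values into the existing object columns.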