I have a dataframe which I created from a dictionary like so:
pd.DataFrame.from_dict(dict1, dtype=str)
However, the dtype of every column shows up as "object".
I want to convert some of the columns to int and/or float, but I am unable to do it even after trying several ways.
I have tried the following:
df['duration'].astype(int)
df['duration'].astype(str).astype(int)
df['duration'].replace('"','').astype(int)
ValueError: invalid literal for int() with base 10: '"467900"'
df['cpu'].astype(float)
df['cpu'].astype(str).astype(float)
df['cpu'].replace('"','').astype(float)
ValueError: could not convert string to float: '"152.7"'
This is my dataframe:
duration realtime cpu
0 "268641" "46871" "152.7"
1 "208642" "2709" "107.1"
2 "208817" "2163" "108.2"
3 "238558" "9307" "141.1"
4 "208881" "2729" "106.7"
Please let me know how I can make this work.
Thanks in advance!
# Remove every character that is not a digit or a decimal point
df = df.replace(regex=r'[^\d.]', value='')
# Then convert each column to the type you want
df['realtime'] = df['realtime'].astype(int)
df['cpu'] = df['cpu'].astype(float)
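For context on why the attempts in the question raised ValueError: Series.replace with a plain string only replaces cells whose entire value equals that string, so replace('"','') leaves a value like "467900" untouched. The string accessor works on the characters inside each cell instead. A minimal sketch, assuming the values only carry leading and trailing double quotes (the two-row frame below is sample data copied from the question):

```python
import pandas as pd

# Two sample rows from the question; each value is a string wrapped in literal quotes
df = pd.DataFrame({
    "duration": ['"268641"', '"208642"'],
    "realtime": ['"46871"', '"2709"'],
    "cpu": ['"152.7"', '"107.1"'],
})

# Without regex=True, replace() matches whole cell values only, so nothing changes here
unchanged = df["duration"].replace('"', '')

# .str.strip removes the leading and trailing quote characters from each string
duration = df["duration"].str.strip('"').astype(int)
cpu = df["cpu"].str.strip('"').astype(float)
```

If the quotes could also appear in the middle of a value, .str.replace('"', '') would be the closer equivalent.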
In addition to @wwnde's answer, you could also perform this operation in a single statement as follows:
df.replace(regex=r'[^\d.]', value='').astype({
'duration' : int,
'realtime' : int,
'cpu' : float
})
Output:
duration realtime cpu
0 268641 46871 152.7
1 208642 2709 107.1
2 208817 2163 108.2
3 238558 9307 141.1
4 208881 2729 106.7
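If you would rather not list a target type for every column, pd.to_numeric can pick an integer or float dtype per column once the quotes are gone. This is only a sketch of that alternative, not part of the answers above; it assumes every column is numeric and that the values only carry leading and trailing double quotes (the frame below is sample data copied from the question):

```python
import pandas as pd

# Sample rows from the question; each value is a string wrapped in literal quotes
df = pd.DataFrame({
    "duration": ['"268641"', '"208642"', '"208817"'],
    "realtime": ['"46871"', '"2709"', '"2163"'],
    "cpu": ['"152.7"', '"107.1"', '"108.2"'],
})

# Strip the quotes, then let to_numeric choose an integer or float dtype per column
converted = df.apply(lambda col: pd.to_numeric(col.str.strip('"')))
print(converted.dtypes)
```

Note that to_numeric raises on values it cannot parse by default; passing errors='coerce' turns those into NaN instead.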