I used read_csv() to load a dataset that looks like this:
userid
NaN
1.091178e+11
1.137856e+11
I want to convert the user ids to strings. One solution is to add keep_default_na=False to read_csv(), as suggested in this SO question: Converting long integers to strings in pandas (to avoid scientific notation).
Let's say I don't want to use keep_default_na=False. Is there any other way to convert the userid column to str?
I tried df.userid.astype(str) and got 1.091178e+11 back. I was expecting the result in expanded form, not in scientific notation.
What should I do?
We can also convert a float to a string using the str() function.
In Java, the Float.toString() method can also be used to convert a float value to a String; Float.toString(float) is a static method of the Float class.
Use a formatted string literal to suppress scientific notation: the syntax f"{num:.nf}", with n replaced by a number of digits, represents num in decimal format with n places following the decimal point.
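As a minimal sketch of the formatting options just described, the snippet below applies them to a single Python float. The variable names are only illustrative, and the value is taken from the sample data in the question.

```python
# One of the user ids from the question, held as a plain Python float.
user_id = 1.091178e+11

# str() on a Python float of this size already gives the expanded form,
# but it keeps a trailing ".0".
plain_text = str(user_id)          # '109117800000.0'

# A format spec with zero decimal places drops the fractional part entirely.
no_decimals = f"{user_id:.0f}"     # '109117800000'

# The same spec with n = 2 keeps two places after the decimal point.
two_decimals = f"{user_id:.2f}"    # '109117800000.00'

print(plain_text, no_decimals, two_decimals)
```

The `.0f` form is the one that yields an id-like string with no decimal point.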
You can use map or apply, as mentioned in this comment:
print (df.userid.map(lambda x: '{:.0f}'.format(x)))
0 nan
1 109117800000
2 113785600000
Name: userid, dtype: object
df.userid = df.userid.map(lambda x: '{:.0f}'.format(x))
print (df)
userid
0 nan
1 109117800000
2 113785600000
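Note that the output above turns the missing value into the literal string 'nan'. If that is not wanted, one possible variation (a sketch, not part of the original answer) is to pass na_action='ignore' to map so that missing values are skipped and stay NaN. The DataFrame here is rebuilt by hand from the sample data in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (assumed layout: one float column).
df = pd.DataFrame({"userid": [np.nan, 1.091178e+11, 1.137856e+11]})

# na_action="ignore" tells map to leave missing values untouched,
# so only the real ids are passed to the formatting function.
as_text = df["userid"].map(lambda x: "{:.0f}".format(x), na_action="ignore")

print(as_text)
```

The result is an object column holding real NaN for the missing row and plain digit strings for the others, so isna() and dropna() still behave as expected.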
I wondered whether map would be faster than apply, but the timings are the same:
#[300000 rows x 1 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
#print (df)
In [40]: %timeit (df.userid.map(lambda x: '{:.0f}'.format(x)))
1 loop, best of 3: 211 ms per loop
In [41]: %timeit (df.userid.apply(lambda x: '{:.0f}'.format(x)))
1 loop, best of 3: 210 ms per loop
Another solution is to_string, but it is slow:
print(df.userid.to_string(float_format='{:.0f}'.format))
0 nan
1 109117800000
2 113785600000
In [41]: %timeit (df.userid.to_string(float_format='{:.0f}'.format))
1 loop, best of 3: 2.52 s per loop