 

Pandas convert float in scientific notation to string

Tags:

python

pandas

I used read_csv() to load a dataset that looks like this

userid
NaN
1.091178e+11
1.137856e+11

I want to convert the user ids to strings. One solution is to pass keep_default_na=False to read_csv(), as suggested in this Stack Overflow question: Converting long integers to strings in pandas (to avoid scientific notation)

Suppose I don't want to use keep_default_na=False. Is there another way to convert the userid column to str?

I tried df.userid.astype(str) and got 1.091178e+11 back. I was expecting the expanded form, not scientific notation.

What should I do?

Cheng asked Dec 15 '16 06:12


1 Answer

You can use map or apply, as mentioned in this comment:

print (df.userid.map(lambda x: '{:.0f}'.format(x)))
0             nan
1    109117800000
2    113785600000
Name: userid, dtype: object

df.userid = df.userid.map(lambda x: '{:.0f}'.format(x))
print (df)
         userid
0           nan
1  109117800000
2  113785600000
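Note that the formatting above turns the missing value into the literal string 'nan'. If you would rather keep it as a real missing value, one option is to pass na_action='ignore' to map. The following is a minimal sketch, assuming the column was read as float64 and holds the three sample values from the question; the DataFrame is rebuilt by hand here only so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption; the
# original CSV file is not available).
df = pd.DataFrame({"userid": [np.nan, 1.091178e+11, 1.137856e+11]})

# na_action="ignore" skips missing values, so NaN stays NaN instead of
# becoming the string "nan". Other values are formatted with no decimals.
formatted = df["userid"].map(lambda x: "{:.0f}".format(x), na_action="ignore")

print(formatted.tolist())
```

Whether to keep NaN or the string 'nan' depends on what you do with the column afterwards; keeping NaN means isna() and dropna() still work as expected.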

I wondered whether map would be faster than apply, but the timings are essentially the same:

#[300000 rows x 1 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
#print (df)

In [40]: %timeit (df.userid.map(lambda x: '{:.0f}'.format(x)))
1 loop, best of 3: 211 ms per loop

In [41]: %timeit (df.userid.apply(lambda x: '{:.0f}'.format(x)))
1 loop, best of 3: 210 ms per loop

Another solution is to_string, but it is slow, and it returns one formatted string for the whole column rather than a Series:

print(df.userid.to_string(float_format='{:.0f}'.format))
0            nan
1   109117800000
2   113785600000

In [42]: %timeit (df.userid.to_string(float_format='{:.0f}'.format))
1 loop, best of 3: 2.52 s per loop
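
A further alternative, not part of the original answer, is to go through the nullable integer dtype instead of formatting each value. This is only a sketch: it assumes a reasonably recent pandas version that has the "Int64" and "string" extension dtypes, and it assumes every non-missing id is a whole number (the cast to "Int64" raises otherwise). The sample DataFrame is again rebuilt by hand from the question:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption).
df = pd.DataFrame({"userid": [np.nan, 1.091178e+11, 1.137856e+11]})

# float64 -> nullable Int64 keeps the missing value as <NA>;
# Int64 -> "string" then renders each id without an exponent.
as_text = df["userid"].astype("Int64").astype("string")

print(as_text.tolist())
```

This avoids the Python-level lambda, but I have not timed it against map or apply, so treat any speed difference as untested.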
jezrael answered Nov 03 '22 00:11
