
Pandas converts int values to float in dataframe

I wrote a script that takes a csv file as an input, manipulates the data by using pandas and creates another csv file.

Everything works, but pandas converts integer values to floats by default. For example:

in csv before:

5f684ee8-7398-914d-9d87-7b44c37ef081,France,44,72000,No,isBool("true")

in csv after:

E84E685F-9873-4D91-9D87-7B44C37EF081,France,44.0,72000.0,No,True

Here 44 and 72000 are changed to 44.0 and 72000.0.

I know how to convert them back to int using apply() on the dataframe, but this script is meant to be generic, so I would rather configure pandas up front.

Basically, I expect pandas not to put .0 if it is not a floating number.

Thanks.

skynyrd asked Oct 18 '22 16:10


2 Answers

Similar to B. M.'s answer, you can format your floats on output like this:

df.to_csv(float_format="%.10g")

This writes floats without an exponent as long as they need at most 10 significant digits, so 2,147,483,647 is rendered as 2147483647 and 10^-2 is rendered as 0.01. You will run into issues with big values: a float with more than 10 digits before the decimal point (1e10 or larger) is written in exponent notation instead.
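As a rough illustration of the behaviour described above, here is a minimal sketch. The column names and values are made up for the example, and the columns are deliberately float so that float_format applies to them:

```python
import pandas as pd

# Hypothetical sample data (not from the question); every column is float64.
df = pd.DataFrame({
    "age": [44.0, 27.5],
    "salary": [72000.0, 2147483647.0],
    "ratio": [0.01, 12345678901.0],
})

# "%.10g" keeps up to 10 significant digits and drops trailing zeros,
# so whole-number floats are written without ".0".
csv_text = df.to_csv(index=False, float_format="%.10g")
print(csv_text)
# age,salary,ratio
# 44,72000,0.01
# 27.5,2147483647,1.23456789e+10
```

The last value has 11 digits, so it falls back to exponent notation, which is the limitation mentioned above. A genuine fraction such as 27.5 is left intact.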

elexis answered Oct 20 '22 11:10


As said in the comments, some operations in pandas can change dtypes. See for example this page.
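To make that concrete, here is a small sketch of one such operation. It is only an assumed example, not necessarily what happens in the asker's script: reindexing an integer Series so that a value goes missing forces pandas to store NaN, which upcasts the data to float.

```python
import pandas as pd

# Hypothetical integer data; the labels are invented for the example.
s = pd.Series([44, 72000], index=["a", "b"])
print(s.dtype.kind)  # "i" (an integer dtype)

# Label "c" has no value, so pandas fills it with NaN.
# NaN cannot be stored in a plain integer column, so the dtype becomes float.
widened = s.reindex(["a", "b", "c"])
print(widened.dtype.kind)  # "f" (a float dtype)

# The integers now show up with a trailing ".0" when written out.
print(widened.to_csv(header=False))
```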

One solution is:

df.to_csv(float_format="%.0f")

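which writes every float with no decimal places, so a (false) float such as 44.0 comes out as 44. Note that genuinely fractional values are rounded as well.

An example: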

In [355]: pd.DataFrame(columns=list(range(6)), 
data=[['E84E685F-9873-4D91-9D87-7B44C37EF081', 'France', 44.0, 72000, 'No', True]]
).to_csv(float_format='%.f')
Out[355]: ',0,1,2,3,4,5\n0,E84E685F-9873-4D91-9D87-7B44C37EF081,France,44,72000,No,True\n'
B. M. answered Oct 20 '22 11:10