float64 with pandas to_csv

I'm reading a CSV with float numbers like this:

Bob,0.085
Alice,0.005

I import it into a DataFrame and write that DataFrame to a new location:

df = pd.read_csv(orig)
df.to_csv(pandasfile)

Now this pandasfile has:

Bob,0.085000000000000006
Alice,0.0050000000000000001

What happened? Do I have to cast to a different type, like float32?

I'm using pandas 0.9.0 and numpy 1.6.2.

asked Oct 13 '12 21:10 by avances123



1 Answer

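As mentioned in the comments, this is a general floating-point issue: a value such as 0.085 has no exact binary representation, so the float64 that is stored is only the nearest approximation, and writing it with many digits exposes the difference.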
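The effect can be reproduced without pandas. The snippet below is a minimal sketch in plain Python 3 (the variable names are illustrative, not from the question): it compares the stored float against the decimal literal and formats it with 17 significant digits, which is enough to expose the approximation.

```python
from decimal import Decimal

value = 0.085  # the float64 produced by parsing the text "0.085"

# Decimal(float) converts the stored binary value exactly, digit for digit.
exact = Decimal(value)
is_exact = exact == Decimal("0.085")

# 17 significant digits are enough to expose the stored approximation.
long_form = "%.17g" % value

# repr() picks the shortest string that still round-trips to the same float.
short_form = repr(value)

print(exact)
print(long_form)
print(short_form)
```

Both strings parse back to the same float64, so no information is lost in either form; the long form is just less readable.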

However, you can use the float_format keyword argument of to_csv to hide it:

df.to_csv('pandasfile.csv', float_format='%.3f') 

or, if you don't want 0.0001 to be rounded to zero:

df.to_csv('pandasfile.csv', float_format='%g') 

will give you:

Bob,0.085
Alice,0.005

in your output file.

For an explanation of %g, see Format Specification Mini-Language.
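To see how the two format strings differ, here is a small sketch in plain Python. It assumes that a string float_format is applied to each float with printf-style % formatting; the sample values beyond those in the question are made up for illustration.

```python
# Sample values: the two from the question plus two illustrative edge cases.
values = [0.085, 0.005, 0.0001, 1234567.0]

# Fixed notation with three decimals: small values collapse to zero.
fixed = ["%.3f" % v for v in values]

# General notation: up to 6 significant digits, trailing zeros removed,
# switching to exponent form for very large or very small magnitudes.
general = ["%g" % v for v in values]

for v, f, g in zip(values, fixed, general):
    print(v, f, g)
```

Note that %g keeps only 6 significant digits by default, so it is lossy for values that need more; a wider variant such as '%.10g' keeps more digits while still trimming trailing zeros.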

answered Oct 12 '22 23:10 by bmu