I'm reading a CSV with float numbers like this:
Bob,0.085
Alice,0.005
I import it into a DataFrame and then write that DataFrame to a new location:
import pandas as pd
df = pd.read_csv(orig)
df.to_csv(pandasfile)
Now this pandasfile has:
Bob,0.085000000000000006
Alice,0.0050000000000000001
What happened? Do I have to cast to a different type, like float32 or something?
I'm using pandas 0.9.0 and numpy 1.6.2.
The pandas DataFrame to_csv() function converts a DataFrame into CSV data. We can pass a file path or a file object to write the CSV data into a file. Otherwise, the CSV data is returned as a string.
If the file already exists, it will be overwritten. If no path is given, the DataFrame is serialized into a string, and that string is returned.
to_csv does create the file if it doesn't exist, but it does not create directories that don't exist. Make sure the subdirectory you are trying to save your file in has been created first. This can easily be wrapped up in a function if you need to do it frequently.
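A minimal sketch of such a wrapper, assuming Python 3 and a pandas version whose to_csv accepts a pathlib.Path. The name to_csv_with_dirs is hypothetical and not part of pandas:

```python
from pathlib import Path

import pandas as pd


def to_csv_with_dirs(df, path, **kwargs):
    """Write df to path, creating any missing parent directories first.

    Hypothetical helper for illustration; extra keyword arguments are
    passed straight through to DataFrame.to_csv.
    """
    path = Path(path)
    # parents=True creates intermediate directories,
    # exist_ok=True makes the call a no-op if they already exist.
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path, **kwargs)
    return path
```

It would be called like to_csv_with_dirs(df, 'out/reports/pandasfile.csv', index=False), where the path is only an example.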
As mentioned in the comments, it is a general floating point problem.
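The effect can be reproduced without pandas. The following is a small sketch in plain Python 3: 0.085 has no exact binary representation, so the stored double is only the nearest representable value, and printing it with 17 significant digits exposes the extra digits seen in the question's output file.

```python
x = 0.085

# Python 3 prints the shortest string that reads back as the same double.
shortest = repr(x)

# 17 significant digits are enough to identify any double uniquely,
# which exposes the digits of the binary value that is actually stored.
full = "%.17g" % x

print(shortest)           # 0.085
print(full)               # longer than 0.085, compare with the question's file
print(float(full) == x)   # True: both strings denote the same double
```

Both strings describe the same number, so nothing is lost or corrupted by the round trip; only the text representation differs.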
However, you can use the float_format keyword of to_csv to hide it:
df.to_csv('pandasfile.csv', float_format='%.3f')
or, if you don't want 0.0001 to be rounded to zero:
df.to_csv('pandasfile.csv', float_format='%g')
will give you:
Bob,0.085
Alice,0.005
in your output file.
For an explanation of %g, see Format Specification Mini-Language.
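To compare the two format strings side by side, here is a small sketch that writes to an in-memory buffer instead of a file. It assumes a reasonably recent pandas on Python 3; the csv_lines helper and the extra Carol row with 0.0001 are made up for illustration and are not in the original data.

```python
import io

import pandas as pd

# Bob and Alice come from the question; Carol is a hypothetical extra row
# added to show what happens to a value smaller than 0.001.
df = pd.DataFrame(
    {"name": ["Bob", "Alice", "Carol"], "value": [0.085, 0.005, 0.0001]}
)


def csv_lines(frame, float_format):
    """Return the CSV rows produced by to_csv for the given float_format."""
    buf = io.StringIO()
    frame.to_csv(buf, header=False, index=False, float_format=float_format)
    return buf.getvalue().splitlines()


fixed = csv_lines(df, "%.3f")   # always three decimal places
general = csv_lines(df, "%g")   # up to six significant digits, no padding

print(fixed)    # ['Bob,0.085', 'Alice,0.005', 'Carol,0.000']
print(general)  # ['Bob,0.085', 'Alice,0.005', 'Carol,0.0001']
```

Note that %g keeps only six significant digits by default and switches to exponent notation for very small or very large values, so it changes how the numbers are written out, not how they are stored in the DataFrame.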