 

Output different precision by column with pandas.DataFrame.to_csv()?

Question

Is it possible to specify a float precision specifically for each column to be printed by the Python pandas package method pandas.DataFrame.to_csv?

Background

If I have a pandas dataframe that is arranged like this:

In [53]: df_data[:5]
Out[53]: 
    year  month  day       lats       lons  vals
0   2012      6   16  81.862745 -29.834254   0.0
1   2012      6   16  81.862745 -29.502762   0.1
2   2012      6   16  81.862745 -29.171271   0.0
3   2012      6   16  81.862745 -28.839779   0.2
4   2012      6   16  81.862745 -28.508287   0.0

There is the float_format option that can be used to specify a precision, but it applies that precision to every float column of the dataframe when it is written.

When I use that like so:

df_data.to_csv(outfile, index=False,
                   header=False, float_format='%11.6f')

I get the following, where vals is printed with more decimal places than the data actually has:

2012,6,16,  81.862745, -29.834254,   0.000000
2012,6,16,  81.862745, -29.502762,   0.100000
2012,6,16,  81.862745, -29.171270,   0.000000
2012,6,16,  81.862745, -28.839779,   0.200000
2012,6,16,  81.862745, -28.508287,   0.000000
asked Nov 15 '13 14:11 by ryanjdillon



3 Answers

Convert the "vals" column to formatted strings before exporting the data frame to a CSV file. float_format then only affects the columns that are still floats:

df_data['vals'] = df_data['vals'].map(lambda x: '%2.1f' % x)

df_data.to_csv(outfile, index=False, header=False, float_format='%11.6f')
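
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It assumes a small frame rebuilt from the first rows in the question, writes to an in-memory io.StringIO buffer instead of the question's outfile, and formats a copy (the name out is just for illustration) so the original frame keeps its numeric vals column.

```python
import io

import pandas as pd

# First two rows of the sample data shown in the question.
df_data = pd.DataFrame({
    "year": [2012, 2012],
    "month": [6, 6],
    "day": [16, 16],
    "lats": [81.862745, 81.862745],
    "lons": [-29.834254, -29.502762],
    "vals": [0.0, 0.1],
})

# Work on a copy so df_data["vals"] stays numeric for later calculations.
out = df_data.copy()
out["vals"] = out["vals"].map(lambda x: "%2.1f" % x)

# In-memory buffer standing in for the question's outfile (an assumption).
buf = io.StringIO()

# float_format now only touches lats and lons; vals is already text.
out.to_csv(buf, index=False, header=False, float_format="%11.6f")
csv_text = buf.getvalue()
print(csv_text)
```

The trade-off is that the formatted column becomes strings, so any arithmetic on vals should happen before this step or on the untouched original frame.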
answered Oct 09 '22 06:10 by hknust


A more current version of hknust's first line, using str.format instead of the % operator, would be:

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1}'.format(x))

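Without a type character, '.1' keeps one significant digit and can switch to scientific notation for larger values. To always print fixed-point notation with one decimal place: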

df_data['vals'] = df_data['vals'].map(lambda x: '{0:.1f}'.format(x)) 
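
A quick way to see the difference between the two format specs is to apply both to a few plain floats. This is only an illustrative sketch; the sample values are made up and are not from the question's data.

```python
# Made-up sample values for illustration only.
values = [0.1, 0.26, 12.0]

# No type character: general format limited to one significant digit,
# which may switch to scientific notation.
general = ["{0:.1}".format(v) for v in values]

# 'f' type: fixed-point notation with exactly one digit after the point.
fixed = ["{0:.1f}".format(v) for v in values]

print(general)
print(fixed)
```

With these inputs the fixed list stays in plain decimal form, while the general list shows 12.0 in exponent form.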
answered Oct 09 '22 06:10 by Michael Szczepaniak


This question is a bit old, but I'd like to contribute what I think is a better answer:

formats = {'lats': '{:10.5f}', 'lons': '{:.3E}', 'vals': '{:2.1f}'}

for col, f in formats.items():
    df_data[col] = df_data[col].map(lambda x: f.format(x))

I tried the solution here, but it didn't work for me, so I decided to combine the previous solutions given here with the one from the link above.
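
The per-column dictionary can also be wrapped in a small function. The sketch below is one possible way to do it, not part of pandas: to_csv_with_formats is a hypothetical helper name, it formats a copy so the caller's frame stays numeric, and it returns the CSV text from an in-memory buffer rather than writing to the question's outfile.

```python
import io

import pandas as pd


def to_csv_with_formats(df, formats, **to_csv_kwargs):
    """Hypothetical helper: return CSV text using one format string per column."""
    out = df.copy()  # leave the caller's numeric columns untouched
    for col, fmt in formats.items():
        # fmt.format is a bound method, so each value is passed as its argument.
        out[col] = out[col].map(fmt.format)
    buf = io.StringIO()
    out.to_csv(buf, **to_csv_kwargs)
    return buf.getvalue()


# Reduced version of the question's sample data.
df_data = pd.DataFrame({
    "year": [2012, 2012],
    "lats": [81.862745, 81.862745],
    "lons": [-29.834254, -29.502762],
    "vals": [0.0, 0.1],
})

formats = {"lats": "{:10.5f}", "lons": "{:.3E}", "vals": "{:2.1f}"}
csv_text = to_csv_with_formats(df_data, formats, index=False, header=False)
print(csv_text)
```

Columns not listed in formats (year here) are written with pandas' default formatting.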

answered Oct 09 '22 08:10 by Nacho