
Python pandas: output dataframe to csv with integers

I have a pandas.DataFrame that I wish to export to a CSV file. However, pandas writes some of the values as floats instead of ints. I could not find how to change this behavior.

Building a data frame:

df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'], dtype=int)
x = pandas.Series([10,10,10], index=['a','b','d'], dtype=int)
y = pandas.Series([1,5,2,3], index=['a','b','c','d'], dtype=int)
z = pandas.Series([1,2,3,4], index=['a','b','c','d'], dtype=int)
df.loc['x']=x; df.loc['y']=y; df.loc['z']=z

View it:

>>> df
    a   b    c   d
x  10  10  NaN  10
y   1   5    2   3
z   1   2    3   4

Export it:

>>> df.to_csv('test.csv', sep='\t', na_rep='0', dtype=int)
>>> for l in open('test.csv'): print l.strip('\n')
        a       b       c       d
x       10.0    10.0    0       10.0
y       1       5       2       3
z       1       2       3       4

Why are the tens written with a trailing ".0"?

Sure, I could just stick this function into my pipeline to reconvert the whole CSV file, but it seems unnecessary:

def lines_as_integer(path):
    handle = open(path)
    yield handle.next()
    for line in handle:
        line = line.split()
        label = line[0]
        values = map(float, line[1:])
        values = map(int, values)
        yield label + '\t' + '\t'.join(map(str,values)) + '\n'

handle = open(path_table_int, 'w')
handle.writelines(lines_as_integer(path_table_float))
handle.close()

xApple asked Jun 13 '13 16:06


1 Answer

The answer I was looking for was a slight variation of what @Jeff proposed in his answer, so the credit goes to him. For reference, this is what solved my problem in the end:

import pandas
df = pandas.DataFrame(data, columns=['a','b','c','d'], index=['x','y','z'])
df = df.fillna(0)
df = df.astype(int)
df.to_csv('test.csv', sep='\t')
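For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same fillna/astype approach, assuming Python 3 and a reasonably recent pandas. The values are the ones from the question. The `rows`, `fixed` and `csv_text` names are only illustrative, and the frame is built from a dict of Series rather than row by row, which is my own shortcut and not how the question builds it. The missing `x`/`c` cell becomes NaN, which is what pushes the values to float in the first place.

```python
import pandas as pd

# Same values as in the question; row 'x' has no entry for column 'c'.
rows = {
    "x": pd.Series([10, 10, 10], index=["a", "b", "d"]),
    "y": pd.Series([1, 5, 2, 3], index=["a", "b", "c", "d"]),
    "z": pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"]),
}

# Aligning the Series on their labels leaves NaN in the x/c cell.
# NaN is a float, so after the transpose the values are stored as floats.
df = pd.DataFrame(rows).T[["a", "b", "c", "d"]]

# Replace the missing cell with 0 first, then cast everything to integers.
fixed = df.fillna(0).astype(int)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = fixed.to_csv(sep="\t")
print(csv_text)
```

Note that this writes the missing cell as a real 0. If missing cells should stay missing, the nullable integer dtype (`astype("Int64")`) may be worth a look instead of `fillna(0)`, though I have not used it in this pipeline.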
xApple answered Oct 12 '22 03:10