I have a pandas.DataFrame that I wish to export to a CSV file. However, pandas seems to write some of the values as float instead of int types. I could not find how to change this behavior.
Building a data frame:
    df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'], dtype=int)
    x = pandas.Series([10,10,10], index=['a','b','d'], dtype=int)
    y = pandas.Series([1,5,2,3], index=['a','b','c','d'], dtype=int)
    z = pandas.Series([1,2,3,4], index=['a','b','c','d'], dtype=int)
    df.loc['x']=x; df.loc['y']=y; df.loc['z']=z
View it:
    >>> df
        a   b    c   d
    x  10  10  NaN  10
    y   1   5    2   3
    z   1   2    3   4
Export it:
    >>> df.to_csv('test.csv', sep='\t', na_rep='0', dtype=int)
    >>> for l in open('test.csv'): print(l.strip('\n'))
        a       b       c       d
    x   10.0    10.0    0       10.0
    y   1       5       2       3
    z   1       2       3       4
Why do the tens have a dot zero?
Sure, I could just stick this function into my pipeline to reconvert the whole CSV file, but it seems unnecessary:
    def lines_as_integer(path):
        handle = open(path)
        yield next(handle)  # pass the header line through unchanged
        for line in handle:
            line = line.split()
            label = line[0]
            values = map(float, line[1:])  # parse "10.0" style strings
            values = map(int, values)      # then truncate to integers
            yield label + '\t' + '\t'.join(map(str, values)) + '\n'

    handle = open(path_table_int, 'w')
    handle.writelines(lines_as_integer(path_table_float))
    handle.close()
Using the pandas.DataFrame.to_csv() method, you can write a pandas DataFrame to a CSV file. By default, to_csv() exports the DataFrame with a comma delimiter and the row index as the first column.
Convert a column to int (integer): use the pandas DataFrame.astype() function, which you can apply to a specific column or to an entire DataFrame. To cast the data type to a 64-bit signed integer, you can use numpy.int64.
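As a rough illustration of the astype() behavior described above, here is a minimal sketch. The column names and values are made up for the example (loosely following the question's table). It also shows why a column holding NaN cannot be cast directly: the missing value has to be filled first.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "c" contains a missing value.
df = pd.DataFrame(
    {"a": [10.0, 1.0, 1.0], "c": [np.nan, 2.0, 3.0]},
    index=["x", "y", "z"],
)

# Cast a single column to 64-bit integers.
df["a"] = df["a"].astype(np.int64)

# NaN has no integer representation, so this cast is expected to fail.
try:
    df["c"].astype(np.int64)
    nan_cast_failed = False
except ValueError:
    nan_cast_failed = True

# Cast the entire DataFrame after replacing NaN with 0.
as_int = df.fillna(0).astype(np.int64)
print(as_int.dtypes)
```

Filling with 0 matches what the question does with na_rep='0', but it does make a missing value indistinguishable from a real zero.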
The pandas DataFrame.to_csv() function converts a DataFrame into CSV data. We can pass a file object to write the CSV data into a file; otherwise, the CSV data is returned as a string.
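The two modes of to_csv() mentioned above can be sketched as follows. The small DataFrame is invented for the example, and an in-memory io.StringIO buffer stands in for a real file object.

```python
import io
import pandas as pd

# Hypothetical integer-only data.
df = pd.DataFrame({"a": [10, 1, 1], "b": [10, 5, 2]}, index=["x", "y", "z"])

# No path or buffer given: the CSV text is returned as a string.
text = df.to_csv(sep="\t")

# A file-like object given: the CSV text is written into it and
# the call returns None.
buffer = io.StringIO()
result = df.to_csv(buffer, sep="\t")

print(text)
```

Getting the string back is handy for checking how values will be formatted before anything is written to disk.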
The answer I was looking for was a slight variation of what @Jeff proposed in his answer. The credit goes to him. This is what solved my problem in the end for reference:
    import pandas
    df = pandas.DataFrame(data, columns=['a','b','c','d'], index=['x','y','z'])
    df = df.fillna(0)
    df = df.astype(int)
    df.to_csv('test.csv', sep='\t')
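For anyone who wants to check the result without touching the file system, here is a small self-contained sketch of the same approach. The contents of `data` are not shown above, so the nested list below is an assumption reconstructed from the table in the question. The CSV is returned as a string instead of being written to test.csv.

```python
import numpy as np
import pandas as pd

# Assumed stand-in for `data`: the values from the table in the question,
# with NaN for the missing entry in row x, column c.
data = [
    [10, 10, np.nan, 10],
    [1, 5, 2, 3],
    [1, 2, 3, 4],
]
df = pd.DataFrame(data, columns=["a", "b", "c", "d"], index=["x", "y", "z"])

# Fill the missing value, then cast every column to int before exporting.
df = df.fillna(0)
df = df.astype(int)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(sep="\t")
print(csv_text)
```

Because every column is an integer dtype by the time to_csv() runs, none of the values are written with a trailing ".0".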