Setting a buffer of 0 in Pandas dataframe.to_csv

Tags: python, pandas, csv

Is there a way to set a buffer size of 0 when calling the pandas DataFrame.to_csv() method? I looked through the documentation and it does not appear to accept that as an argument. Am I overlooking something?

Edit: I am asking because I am outputting dataframes that range in size from several hundred to many thousands of rows (always with the same 7 columns), and a later process that examines the output file occasionally fails because the file has not finished being written.

I could of course introduce a delay (of 3-5 minutes), but I'd rather not arbitrarily slow down my code if I don't have to. I'd rather force the code to wait for the output to complete before moving on, and when writing files with open() it's nice to be able to set a buffer value of 0.

traggatmot asked Sep 26 '22 08:09


1 Answer

If I'm understanding your question correctly, you could implement the following. This snippet passes a StringIO instance as the first argument to to_csv, then calls seek(0):

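from io import StringIO  # Python 3; the Python 2 form was "import StringIO"

import pandas as pd

# Your code here; this placeholder frame stands in for the real data.
data = {"key1": "value1"}
dataframe = pd.DataFrame(data, index=[0])

buffer = StringIO()
dataframe.to_csv(buffer)    # write the CSV text into the in-memory buffer
buffer.seek(0)              # rewind so the buffer can be read from the start
output = buffer.getvalue()  # the complete CSV as a single string
buffer.close()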

You could then manipulate output however you choose.
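
Since the underlying problem is a later process reading the file before it is fully written, one option is to write to a temporary file, flush and fsync it, and only then rename it to the final name. Below is a minimal sketch of that idea, not a definitive implementation: write_csv_atomically is a hypothetical helper name (it is not part of pandas), and frame is assumed to be a DataFrame or any object whose to_csv accepts an open file handle. Note that Python 3 only permits buffering=0 in binary mode, so an explicit flush is the closest equivalent for a text file.

```python
import os
import tempfile


def write_csv_atomically(frame, path):
    """Write frame as CSV so that readers never see a half-written file.

    Hypothetical helper (not a pandas API). `frame` is assumed to be a
    pandas DataFrame, or any object whose to_csv() accepts a file handle.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the destination so the final
    # rename stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            frame.to_csv(handle)       # write through the open handle
            handle.flush()             # push Python's buffer to the OS
            os.fsync(handle.fileno())  # ask the OS to commit it to disk
        os.replace(tmp_path, path)     # move the finished file into place
    except BaseException:
        # Clean up the partial temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With the dataframe from the snippet above, the call would be write_csv_atomically(dataframe, "output.csv"), where the file name is only an example. The consuming process should then find either no file yet or a complete one (the rename is atomic on POSIX systems), so a fixed delay should not be needed.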

Daniel answered Sep 28 '22 05:09