Is there a way to set a buffer of '0' when using the Pandas dataframe.to_csv()? I looked through the documentation and it appears to not allow that as an argument. Am I overlooking something?
Edit: I am asking because I am outputting dataframes which range in size from several hundred to many thousands of rows (always with the same 7 columns), and a later process that eventually examines the file is occasionally failing because sometimes it isn't finished being written.
I could of course introduce a delay (of 3-5 minutes), but I'd rather not arbitrarily slow down my code if I don't have to. I'd rather force the code to wait until the output is complete before moving on, and when writing files with open() it's nice to be able to set a buffer value of '0'.
If I'm understanding your question correctly, you could implement the following. This snippet passes a StringIO instance as the first argument to to_csv, and then calls seek(0):
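from io import StringIO  # Python 3; Python 2 used "import StringIO" instead
#### your code here...assuming something like:
#### import pandas as pd
#### data = {"key1": "value1"}
#### dataframe = pd.DataFrame(data, index=[0])
buffer = StringIO()
dataframe.to_csv(buffer)    # write the CSV text into the in-memory buffer
buffer.seek(0)              # rewind, in case you read from the buffer directly
output = buffer.getvalue()  # the whole CSV as a single string
buffer.close()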
You could then manipulate output however you choose.
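If the goal is for the later process never to see a half-written file, one option is to write to a temporary file, flush it, and rename it into place only once it is complete. The following is a minimal sketch of that idea, not something pandas provides. The function name write_csv_atomically and the index=False choice are my own assumptions, and the rename is only guaranteed to be atomic on POSIX systems.

```python
import os
import tempfile


def write_csv_atomically(dataframe, final_path):
    """Write dataframe to final_path so readers never see a partial file.

    `dataframe` is assumed to be a pandas DataFrame; only its to_csv
    method is used here. The function name is hypothetical.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    # Keep the temporary file in the target directory so the final
    # rename stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" lets the CSV writer control line endings itself.
        with os.fdopen(fd, "w", newline="") as handle:
            dataframe.to_csv(handle, index=False)
            handle.flush()             # push Python's buffer to the OS
            os.fsync(handle.fileno())  # ask the OS to write it to disk
        # Rename only after the file is completely written and closed.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

You would call it as write_csv_atomically(dataframe, "out.csv"). The later process then either finds no out.csv yet or finds a complete one, so no fixed delay should be needed.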