
Best practices for Java IO when creating a large CSV file

Tags: java, io, csv

Hi, I need to create a few large CSV files; the number of entries could be on the order of 2 million. I was wondering how to do this efficiently, and a few questions come to mind:

  1. When we write a file via a BufferedWriter, how often should we flush? I think BufferedWriter maintains its own buffer and flushes it automatically once the buffer is full. If that is the case, why is there a flush method at all?

  2. The file I am going to create will be big. When I start writing, will the file be committed to disk automatically (before I call writer.close()), or does the whole file remain in main memory until I close the writer?

    • By committing I mean that no part of the already written portion is in main memory, i.e. it is ready for GC.
dpsdce asked Sep 27 '11 18:09





1 Answer

  1. The BufferedWriter implementation does a good job of flushing on its own: it writes its buffer out whenever the buffer fills up. In your case, you should never need to call flush.

    As for why there is a flush method, this is because sometimes you will want output written immediately rather than waiting for BufferedWriter's buffer to become full. BufferedWriter isn't just for files; it can also be used for writing to the console or a socket. For example, you may want to send some data over a network but not quite enough data to cause BufferedWriter to automatically flush. In order to send this data immediately, you would use flush.

  2. The data you have written to the BufferedWriter does not all stay in memory at the same time. It is written out in pieces (flushed) as BufferedWriter's buffer fills up. When you call close at the end, BufferedWriter does one final flush of whatever remains in its buffer and has not yet been written out, and then closes the file.
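
To illustrate both points, here is a minimal sketch of writing a large CSV without ever calling flush. The class name `CsvWriterSketch`, the method `writeRows`, and the `id,name,value` column layout are made up for the example and are not from the question; only the `BufferedWriter` behaviour is what the answer describes.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CsvWriterSketch {

    // Writes a header line plus rowCount data rows to target.
    // The column layout is hypothetical, chosen only for illustration.
    static void writeRows(Path target, int rowCount) throws IOException {
        // try-with-resources calls close() for us; close() flushes whatever
        // is still sitting in the buffer and then closes the file.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write("id,name,value");
            writer.newLine();
            for (int i = 0; i < rowCount; i++) {
                // No explicit flush(): the writer empties its buffer to the
                // underlying file each time the buffer fills up.
                writer.write(i + ",item-" + i + "," + (i * 2));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("large", ".csv");
        writeRows(target, 2_000_000);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Because each row is built and written inside the loop, only the current row and the writer's buffer need to be in memory at any moment, assuming the rows themselves are produced one at a time rather than collected in a list first.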

Jack Edmonds answered Sep 17 '22 15:09
