I have a big test file with 70 million lines of text. I have to read the file line by line.
I used two different approaches:
    InputStreamReader isr = new InputStreamReader(new FileInputStream(FilePath), "unicode");
    BufferedReader br = new BufferedReader(isr);
    while ((cur = br.readLine()) != null);
and
    LineIterator it = FileUtils.lineIterator(new File(FilePath), "unicode");
    while (it.hasNext())
        cur = it.nextLine();
Is there another approach that can make this task faster?
Java 8 added a new method, lines(), to the Files class, which can be used to read a file line by line. The beauty of this method is that it returns the lines of the file as a Stream<String> that is populated lazily as the stream is consumed.
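A minimal sketch of the Files.lines() approach, assuming Java 8 or later. The class name LazyLines and the method countNonEmpty are made up for illustration. The stream keeps the file open, so it is closed with try-with-resources. For the file in the question, which is read as "unicode" (normally an alias for UTF-16 in Java), StandardCharsets.UTF_16 would be passed instead of UTF_8.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical helper class, not part of any library.
class LazyLines {

    // Counts non-empty lines; lines are pulled from the file one at a time,
    // so the whole file is never held in memory.
    static long countNonEmpty(Path file, Charset charset) throws IOException {
        try (Stream<String> lines = Files.lines(file, charset)) {
            return lines.filter(line -> !line.isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch runs on its own.
        Path tmp = Files.createTempFile("lazy-lines", ".txt");
        try {
            Files.write(tmp, Arrays.asList("first", "", "third"), StandardCharsets.UTF_8);
            System.out.println(countNonEmpty(tmp, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Files.lines() reads through a BufferedReader internally, so it is mainly a convenience and should not be expected to beat a plain readLine() loop by much.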
We can use the readLine() method of java.io.BufferedReader to read a file line by line into a String. This method returns null when the end of the file is reached.
A line is considered terminated by a line feed ("\n"), a carriage return ("\r"), a carriage return followed immediately by a line feed, or the end of the file. A typical example reads Demo.txt through the FileReader class: the readLine() method of BufferedReader reads the file line by line, and each line is appended to a StringBuffer, followed by a line feed.
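The readLine() loop described above could look roughly like this. It is only a sketch: the class name ReadLineDemo and the method readAll are made up for illustration, and FileReader here uses the platform default charset.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical demo class, not part of any library.
class ReadLineDemo {

    // Reads the file line by line and joins the lines with '\n'.
    static String readAll(String path) throws IOException {
        StringBuffer sb = new StringBuffer();
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            String line;
            // readLine() strips the terminator and returns null at end of file.
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java ReadLineDemo <file>, for example Demo.txt");
            return;
        }
        System.out.print(readAll(args[0]));
    }
}
```

Collecting everything into one buffer only makes sense for small files. For a file with 70 million lines, each line should be processed inside the loop instead.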
1) I am sure there is no difference speedwise; both use FileInputStream internally, and both buffer their reads.
2) You can take measurements and see for yourself
3) Though there are no performance benefits, I like the Java 7 approach:
    try (BufferedReader br = Files.newBufferedReader(Paths.get("test.txt"), StandardCharsets.UTF_8)) {
        for (String line = null; (line = br.readLine()) != null;) {
            //
        }
    }
4) Scanner-based version
    try (Scanner sc = new Scanner(new File("test.txt"), "UTF-8")) {
        while (sc.hasNextLine()) {
            String line = sc.nextLine();
        }
        // note that Scanner suppresses exceptions
        if (sc.ioException() != null) {
            throw sc.ioException();
        }
    }
5) This may be faster than the rest
    try (SeekableByteChannel ch = Files.newByteChannel(Paths.get("test.txt"))) {
        ByteBuffer bb = ByteBuffer.allocateDirect(1000);
        for (;;) {
            StringBuilder line = new StringBuilder();
            int n = ch.read(bb);
            if (n == -1) {
                break; // end of file
            }
            // add chars to line
            // ...
        }
    }
It requires a bit of coding, but it can be considerably faster because of ByteBuffer.allocateDirect, which allows the OS to read bytes from the file into the ByteBuffer directly, without copying.
6) Parallel processing would definitely increase speed. Make a big byte buffer, run several tasks that read bytes from the file into that buffer in parallel, and when they are done, find the first end of line, make a String, find the next...
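Approach 5 leaves the "add chars to line" step open. One way to fill it in is sketched below, under some assumptions: the names ChannelLineReader, LineSplitter and forEachLine are made up for illustration, the 64 KiB buffer size is an arbitrary choice, and a CharsetDecoder is used so that multi-byte characters split across two reads are decoded correctly. Whether this actually beats BufferedReader depends on the machine and the file, so it should be measured as suggested in point 2; the main method prints the elapsed time for that purpose.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

// Hypothetical helper class, not part of any library.
class ChannelLineReader {

    // Splits decoded characters into lines on "\n", "\r" or "\r\n".
    private static final class LineSplitter {
        private final StringBuilder line = new StringBuilder();
        private final Consumer<String> action;
        private boolean skipLf; // true right after '\r', so a following '\n' is ignored

        LineSplitter(Consumer<String> action) {
            this.action = action;
        }

        void accept(CharBuffer chars) {
            while (chars.hasRemaining()) {
                char c = chars.get();
                if (c == '\n' && skipLf) {
                    skipLf = false;
                    continue;
                }
                skipLf = (c == '\r');
                if (c == '\n' || c == '\r') {
                    emit();
                } else {
                    line.append(c);
                }
            }
        }

        // Emits a last line that has no terminator, if there is one.
        void finish() {
            if (line.length() > 0) {
                emit();
            }
        }

        private void emit() {
            action.accept(line.toString());
            line.setLength(0);
        }
    }

    static void forEachLine(Path file, Charset charset, Consumer<String> action) throws IOException {
        CharsetDecoder decoder = charset.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocateDirect(64 * 1024);
        CharBuffer chars = CharBuffer.allocate(64 * 1024);
        LineSplitter splitter = new LineSplitter(action);

        try (SeekableByteChannel ch = Files.newByteChannel(file)) {
            boolean eof = false;
            while (!eof) {
                eof = ch.read(bytes) == -1;
                bytes.flip();
                CoderResult result;
                do {
                    result = decoder.decode(bytes, chars, eof);
                    if (result.isError()) {
                        result.throwException(); // malformed or unmappable input
                    }
                    chars.flip();
                    splitter.accept(chars);
                    chars.clear();
                } while (result.isOverflow());
                bytes.compact(); // keep bytes of a character that is not complete yet
            }
            decoder.flush(chars);
            chars.flip();
            splitter.accept(chars);
            splitter.finish();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java ChannelLineReader <file>");
            return;
        }
        long[] count = {0};
        long start = System.nanoTime();
        forEachLine(Paths.get(args[0]), StandardCharsets.UTF_8, line -> count[0]++);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(count[0] + " lines in " + millis + " ms");
    }
}
```

The sketch still creates one String per line, which is likely to dominate the cost for 70 million lines. If the lines only need to be scanned, working on the CharBuffer directly avoids that allocation.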