 

Reading Input Stream and splitting based on a delimiter

I have a scenario where I receive a large amount of data as an input stream. The data contains a delimiter, and I need to split the stream on it and process each piece. I want to do this completely in memory, if possible. Right now I am doing this with a Scanner, as shown in the code below:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.regex.Pattern;

public class ScannerTest {
    public static void main(String[] args) {
        File file = new File("E:\\Project\\Journalling\\docs\\readFile.txt");
        // try-with-resources closes the scanner and the underlying stream when the loop ends.
        // Pattern.quote makes the scanner treat the delimiter as literal text, not as a regex.
        try (Scanner scanner = new Scanner(new FileInputStream(file), "UTF-8")
                .useDelimiter(Pattern.quote("--AABBCCDDEEFFGGHHIIaabbccdd"))) {
            while (scanner.hasNext()) {
                String theString = scanner.next();
                System.out.println(theString);
                functionToProcessStreams(theString); // This will actually do the processing.
            }
        } catch (FileNotFoundException e) {
            // Without the file there is nothing to scan, so report the error and stop.
            e.printStackTrace();
        }
    }

    // Placeholder for the real processing logic, which is not shown in the question.
    private static void functionToProcessStreams(String chunk) {
    }
}

However, I am not sure whether this is the most efficient way to do it. Another option that comes to mind is to use the read(b, off, len) method on the InputStream and then process each byte array. For that, however, I would need to know the indexes of the delimiters, which might again mean reading the entire stream.

Please suggest if there is a better way to do this.

Anupam asked Jan 19 '26 02:01



1 Answer

Using Scanner with useDelimiter() is efficient: it compiles the delimiter into a regular expression and reads your input only once.
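As a rough illustration of that single pass, here is a minimal sketch that feeds an in-memory stream through a Scanner and hands each token to a callback as soon as it is found. The class name DelimiterDemo, the helper forEachToken, and the shortened delimiter "--AABB--" are made up for this example and are not part of the question's code. Pattern.quote is used on the assumption that the delimiter should be matched as literal text.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

public class DelimiterDemo {

    // Hypothetical helper: passes each delimited token to the handler as it is read,
    // so the tokens never have to be collected into one big structure.
    static void forEachToken(InputStream in, String delimiter, Consumer<String> handler) {
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            // Quote the delimiter so regex metacharacters in it are matched literally.
            scanner.useDelimiter(Pattern.quote(delimiter));
            while (scanner.hasNext()) {
                handler.accept(scanner.next());
            }
        }
    }

    public static void main(String[] args) {
        // Sample data standing in for the real stream.
        String data = "first--AABB--second--AABB--third";
        InputStream in = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        forEachToken(in, "--AABB--", System.out::println);
    }
}
```

Because the handler runs inside the loop, only the token currently being read needs to be held in memory, although a very large token will still make the Scanner's internal buffer grow to fit it.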

On a side note: even if it costs a bit of efficiency, it is always a good idea to write legible code. This will allow you to adapt your code faster, and you will make fewer mistakes. Premature optimization is the root of all evil.

ljgw answered Jan 21 '26 16:01



