Java 9 takeWhile and dropWhile to read and skip certain lines

Tags: java, java-9, java-stream

I have a text file that contains multiple reports. Each report starts with the literal "REPORT ID" followed by a specific value, e.g. ABCD. In the simple case, I want to extract the data of only those reports whose value is ABCD. In the more complex case, I want to extract the data of only those reports whose TAG1 value (2nd line) is 1000375351 and whose report value is ABCD.

I have done this the traditional way; my decideAndExtract(String line) function contains the required logic. But how can I use the Java 9 stream methods takeWhile and dropWhile to handle this efficiently?

try (Stream<String> lines = Files.lines(filePath)) {
    lines.forEach(this::decideAndExtract);
}

Sample text file data:

REPORT ID: ABCD    
TAG1: 1000375351 PR
DATA1: 7399910002 T
DATA2: 4754400002 B
DATA3     : 1000640
Some Lines Here    
REPORT ID: WXYZ    
TAG1: 1000375351 PR
DATA1: 7399910002 T
DATA2: 4754400002 B
DATA3     : 1000640
Some Lines Here    
REPORT ID: ABCD    
TAG1: 1000375351 PR
DATA1: 7399910002 T
DATA2: 4754400002 B
DATA3     : 1000640
Some Lines Here

asked Aug 02 '19 19:08

Tishy Tash

1 Answer

It seems to be a common anti-pattern to reach for Files.lines whenever a Stream over a file is needed, regardless of whether the task actually calls for processing individual lines.
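
For comparison, here is a rough sketch of what a line-based attempt with dropWhile and takeWhile, as asked in the question, might look like. The class name, method name and the shortened sample data are made up for illustration. Since a stream is traversed only once and neither operation restarts after it has finished, this yields the first matching report only; the later ABCD report is never reached.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LineBasedSketch {
    // Shortened version of the sample data from the question (assumption: kept small for brevity).
    static final List<String> SAMPLE = Arrays.asList(
        "REPORT ID: ABCD    ",
        "TAG1: 1000375351 PR",
        "DATA1: 7399910002 T",
        "Some Lines Here    ",
        "REPORT ID: WXYZ    ",
        "TAG1: 1000375351 PR",
        "REPORT ID: ABCD    ",
        "TAG1: 1000375351 PR",
        "Some Lines Here");

    // Returns the lines of the FIRST report with the given ID, header line excluded.
    static List<String> firstReport(List<String> lines, String id) {
        return lines.stream()
            .dropWhile(l -> !l.startsWith("REPORT ID: " + id)) // skip everything before the header
            .skip(1)                                           // skip the header itself
            .takeWhile(l -> !l.startsWith("REPORT ID"))        // stop at the next report header
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        firstReport(SAMPLE, "ABCD").forEach(System.out::println);
    }
}
```

Collecting every matching report this way would require re-reading the file or keeping state outside the stream, which is why the pattern-based approach is a better fit here.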

When you need pattern matching over a file, the first tool to consider should be Scanner:

Pattern p = Pattern.compile(
    "REPORT ID: ABCD\\s*\\R"
   +"TAG1\\s*:\\s*(.*?)\\R"
   +"DATA1\\s*:\\s*(.*?)\\R"
   +"DATA2\\s*:\\s*(.*?)\\R"
   +"DATA3\\s*:\\s*(.*?)\\R"); // you can keep this in a static final field

try(Scanner sc = new Scanner(filePath, StandardCharsets.UTF_8);
    Stream<MatchResult> st = sc.findAll(p)) {

    st.forEach(mr -> System.out.println("found tag1: " + mr.group(1)
        + ", data: "+String.join(", ", mr.group(2), mr.group(3), mr.group(4))));
}
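
To try this without a file, the same pattern can be run against the sample data held in a String. The following is a minimal, self-contained sketch; the class name ReportScanDemo, the extract method and the "|" output format are illustrative choices, not part of the original code.

```java
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ReportScanDemo {
    // Sample data from the question, joined with '\n' (trailing spaces kept on purpose).
    static final String SAMPLE = String.join("\n",
        "REPORT ID: ABCD    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here    ",
        "REPORT ID: WXYZ    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here    ",
        "REPORT ID: ABCD    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here");

    // Same pattern as above: matches only reports whose ID is ABCD.
    static final Pattern REPORT = Pattern.compile(
        "REPORT ID: ABCD\\s*\\R"
      + "TAG1\\s*:\\s*(.*?)\\R"
      + "DATA1\\s*:\\s*(.*?)\\R"
      + "DATA2\\s*:\\s*(.*?)\\R"
      + "DATA3\\s*:\\s*(.*?)\\R");

    // Returns one "tag1 | data1, data2, data3" string per matching report.
    static List<String> extract(String text) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findAll(REPORT)
                .map(mr -> mr.group(1) + " | "
                    + String.join(", ", mr.group(2), mr.group(3), mr.group(4)))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        extract(SAMPLE).forEach(System.out::println);
    }
}
```

With the sample data, both ABCD reports are found and the WXYZ report is skipped. Note that the pattern requires a line break after the DATA3 line, so a report whose DATA3 line is the very last line of the input, without a trailing newline, would not match.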

It's easy to adapt the pattern, e.g. use

Pattern p = Pattern.compile(
    "REPORT ID: ABCD\\s*\\R"
   +"TAG1: (1000375351 PR)\\R"
   +"DATA1\\s*:\\s*(.*?)\\R"
   +"DATA2\\s*:\\s*(.*?)\\R"
   +"DATA3\\s*:\\s*(.*?)\\R"); // you can keep this in a static final field

as the pattern to fulfill your more complex criteria.
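
If the report ID and TAG1 value are not fixed at compile time, the pattern could be assembled from parameters. This is only a sketch under that assumption; ReportPatterns, forReport and countMatches are hypothetical names. Pattern.quote ensures that characters such as "." or "(" in the inputs are matched literally.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class ReportPatterns {
    // Small sample: one ABCD report and one WXYZ report.
    static final String SAMPLE = String.join("\n",
        "REPORT ID: ABCD    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here    ",
        "REPORT ID: WXYZ    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here");

    // Builds a pattern matching only reports with exactly this ID and TAG1 value.
    static Pattern forReport(String reportId, String tag1) {
        return Pattern.compile(
            "REPORT ID: " + Pattern.quote(reportId) + "\\s*\\R"
          + "TAG1: (" + Pattern.quote(tag1) + ")\\R"
          + "DATA1\\s*:\\s*(.*?)\\R"
          + "DATA2\\s*:\\s*(.*?)\\R"
          + "DATA3\\s*:\\s*(.*?)\\R");
    }

    // Counts the reports in the text that match the given ID and TAG1 value.
    static long countMatches(String text, String reportId, String tag1) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findAll(forReport(reportId, tag1)).count();
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(SAMPLE, "ABCD", "1000375351 PR"));
    }
}
```

Because the TAG1 value is matched as a whole line, passing just "1000375351" without the " PR" suffix finds nothing in this sample.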

But you could also provide arbitrary filter conditions in the Stream:

Pattern p = Pattern.compile(
    "REPORT ID: (.*?)\\s*\\R"
   +"TAG1: (.*?)\\R"
   +"DATA1\\s*:\\s*(.*?)\\R"
   +"DATA2\\s*:\\s*(.*?)\\R"
   +"DATA3\\s*:\\s*(.*?)\\R"); // you can keep this in a static final field

try(Scanner sc = new Scanner(filePath, StandardCharsets.UTF_8);
    Stream<MatchResult> st = sc.findAll(p)) {

    st.filter(mr -> mr.group(1).equals("ABCD") && mr.group(2).equals("1000375351 PR"))
      .forEach(mr -> System.out.println(
          "found data: " + String.join(", ", mr.group(3), mr.group(4), mr.group(5))));
}

This allows for more complex conditions than the equals calls of the example. (Note that the group numbers changed for this example.)
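
As an illustration of such a condition, the following sketch accepts any report whose ID is in a given set and whose TAG1 value starts with a given prefix. The class ReportFilterDemo, the method dataOf and the choice of predicate are assumptions made for this example.

```java
import java.util.List;
import java.util.Scanner;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ReportFilterDemo {
    // Small sample: one ABCD report and one WXYZ report.
    static final String SAMPLE = String.join("\n",
        "REPORT ID: ABCD    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here    ",
        "REPORT ID: WXYZ    ", "TAG1: 1000375351 PR", "DATA1: 7399910002 T",
        "DATA2: 4754400002 B", "DATA3     : 1000640", "Some Lines Here");

    // Generic pattern: group 1 = report ID, group 2 = TAG1, groups 3-5 = DATA1-3.
    static final Pattern REPORT = Pattern.compile(
        "REPORT ID: (.*?)\\s*\\R"
      + "TAG1: (.*?)\\R"
      + "DATA1\\s*:\\s*(.*?)\\R"
      + "DATA2\\s*:\\s*(.*?)\\R"
      + "DATA3\\s*:\\s*(.*?)\\R");

    // Returns "data1, data2, data3" for every report passing both conditions.
    static List<String> dataOf(String text, Set<String> ids, String tagPrefix) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findAll(REPORT)
                .filter(mr -> ids.contains(mr.group(1))
                           && mr.group(2).startsWith(tagPrefix))
                .map(mr -> String.join(", ", mr.group(3), mr.group(4), mr.group(5)))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        dataOf(SAMPLE, Set.of("ABCD", "WXYZ"), "1000375351")
            .forEach(System.out::println);
    }
}
```

The trade-off compared to a specialized pattern is that every report is matched and captured first and only then filtered, which costs some work but keeps the selection logic in ordinary Java code.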

E.g., to support a variable order of the data items after the “REPORT ID”, you can use

Pattern p = Pattern.compile("REPORT ID: (.*?)\\s*\\R(((TAG1|DATA[1-3])\\s*:.*?\\R){4})");
Pattern nl = Pattern.compile("\\R"), sep = Pattern.compile("\\s*:\\s*");

try(Scanner sc = new Scanner(filePath, StandardCharsets.UTF_8);
    Stream<MatchResult> st = sc.findAll(p)) {

    st.filter(mr -> mr.group(1).equals("ABCD"))
      .map(mr -> nl.splitAsStream(mr.group(2))
          .map(s -> sep.split(s, 2))
          .collect(Collectors.toMap(a -> a[0], a -> a[1])))
      .filter(map -> "1000375351 PR".equals(map.get("TAG1")))
      .forEach(map -> System.out.println("found data: " + map));
}
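
One caveat with this variant: the {4} quantifier accepts any four of the keys, including repeats, and the two-argument Collectors.toMap throws an IllegalStateException on a duplicate key. If the input may be irregular, a more defensive version could keep the first value per key and discard reports that lack a required key. The sketch below assumes that policy; the names VariableOrderDemo, toMap and parse, as well as the second (malformed) sample report, are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class VariableOrderDemo {
    // First report: keys in a different order. Second report: DATA1 repeated, DATA3 missing.
    static final String SAMPLE = String.join("\n",
        "REPORT ID: ABCD", "DATA1: 7399910002 T", "TAG1: 1000375351 PR",
        "DATA3     : 1000640", "DATA2: 4754400002 B", "Some Lines Here",
        "REPORT ID: ABCD", "DATA1: 1", "DATA1: 2",
        "TAG1: 1000375351 PR", "DATA2: 3", "Some Lines Here");

    static final Pattern REPORT = Pattern.compile(
        "REPORT ID: (.*?)\\s*\\R(((TAG1|DATA[1-3])\\s*:.*?\\R){4})");
    static final Pattern NL = Pattern.compile("\\R");
    static final Pattern SEP = Pattern.compile("\\s*:\\s*");
    static final List<String> REQUIRED = List.of("TAG1", "DATA1", "DATA2", "DATA3");

    // Splits a block of "KEY: value" lines into a map, keeping the first value per key.
    static Map<String, String> toMap(String block) {
        return NL.splitAsStream(block)
            .map(s -> SEP.split(s, 2))
            .collect(Collectors.toMap(a -> a[0], a -> a[1],
                (first, second) -> first, LinkedHashMap::new));
    }

    // Returns the key/value maps of all complete reports with the given ID.
    static List<Map<String, String>> parse(String text, String id) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findAll(REPORT)
                .filter(mr -> mr.group(1).equals(id))
                .map(mr -> toMap(mr.group(2)))
                .filter(map -> map.keySet().containsAll(REQUIRED))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        parse(SAMPLE, "ABCD").forEach(System.out::println);
    }
}
```

Whether to keep the first value, keep the last, or fail fast on a duplicate depends on how much the input can be trusted; the merge function passed to toMap is the place to change that.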

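Scanner.findAll is available since Java 9, but if you have to support Java 8, you can use the findAll implementation of this answer. Note that the Scanner(Path, Charset) constructor used above was added in Java 10; on Java 9, pass the charset name as a String instead, i.e. new Scanner(filePath, "UTF-8").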
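
If the files are small enough to be read into memory, a simpler Java 8 alternative is to loop over a Matcher. To be clear, this is not the implementation referred to above, just a minimal sketch with hypothetical names (Java8FindAll, findAll); unlike Scanner.findAll, it loads the whole file at once and returns a List rather than a lazy Stream.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Java8FindAll {
    // Collects all matches of p in the given text.
    static List<MatchResult> findAll(CharSequence text, Pattern p) {
        List<MatchResult> results = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            // toMatchResult() takes a snapshot that stays valid after the next find()
            results.add(m.toMatchResult());
        }
        return results;
    }

    // Convenience overload: reads the whole file as UTF-8 first (assumes it fits in memory).
    static List<MatchResult> findAll(Path file, Pattern p) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return findAll(text, p);
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("REPORT ID: (\\w+)");
        for (MatchResult mr : findAll("REPORT ID: ABCD\nREPORT ID: WXYZ\n", p)) {
            System.out.println(mr.group(1));
        }
    }
}
```

The returned list can be processed with findAll(filePath, p).stream() in the same way as the Stream<MatchResult> in the examples above.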

answered Oct 11 '22 19:10

Holger
