I have some large text files on which I'm going to perform consecutive matching (just capturing, not replacing). I don't think it's a good idea to keep the whole file in memory, so I'd rather use a Reader.
What I know about the input is that if there's a match, it's not going to span more than 5 lines. So my idea was to have some sort of buffer that keeps just these 5 lines or so, do the first search, and continue. But for this to work it has to "know" where the regex match ended; e.g. if the match ends at line 2, the next search should start from there. Is it possible to do something like this in an efficient way?
You could use a Scanner and the findWithinHorizon method:
Scanner s = new Scanner(new File("thefile"));
String nextMatch = s.findWithinHorizon(yourPattern, 0);
From the API documentation for findWithinHorizon:
If horizon is 0, then the horizon is ignored and this method continues to search through the input looking for the specified pattern without bound. In this case it may buffer all of the input searching for the pattern.
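To capture consecutive matches, the same call can be repeated in a loop, since the scanner advances past each match and the next search starts where the previous one ended. The following is only a minimal sketch: the class name, the helper `findAll`, the sample text, and the `BEGIN.*?END` pattern are made up for illustration, and a StringReader stands in for the real file. It assumes the pattern never matches the empty string, otherwise the loop would not terminate.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class ConsecutiveMatches {
    // Hypothetical helper: collects every match in input order.
    // Each call to findWithinHorizon resumes after the previous match.
    static List<String> findAll(Readable input, Pattern pattern) {
        List<String> matches = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            String match;
            // Horizon 0 means "search without bound"; null means no further match.
            while ((match = scanner.findWithinHorizon(pattern, 0)) != null) {
                matches.add(match);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Illustrative input; in practice pass a FileReader or use new Scanner(file).
        String text = "BEGIN a\nb END\nnoise\nBEGIN c END\n";
        Pattern pattern = Pattern.compile("BEGIN.*?END", Pattern.DOTALL);
        for (String m : findAll(new StringReader(text), pattern)) {
            System.out.println("match: " + m);
        }
    }
}
```

Note that a non-zero horizon limits how far ahead of the current position the scanner looks, not how long a match may be, so it does not directly express "a match spans at most 5 lines". If no match lies within the horizon, the method returns null and the scanner does not advance.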
A side note: when matching across multiple lines, you might want to look at the constants Pattern.MULTILINE and Pattern.DOTALL.
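As a rough illustration of what these two flags change, here is a small sketch. The class name, the helper `finds`, and the sample input are invented for this example. DOTALL lets `.` match line terminators, and MULTILINE lets `^` and `$` match at line boundaries rather than only at the start and end of the whole input.

```java
import java.util.regex.Pattern;

class FlagDemo {
    // Hypothetical helper: true if the regex, compiled with the given flags,
    // matches somewhere in the input.
    static boolean finds(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "first\nsecond";

        // Without DOTALL, '.' does not match the newline between the words.
        System.out.println(finds("first.second", 0, input));                 // false
        System.out.println(finds("first.second", Pattern.DOTALL, input));    // true

        // Without MULTILINE, '^' only matches at the very start of the input.
        System.out.println(finds("^second$", 0, input));                     // false
        System.out.println(finds("^second$", Pattern.MULTILINE, input));     // true
    }
}
```

The flags can be combined with a bitwise OR, e.g. `Pattern.MULTILINE | Pattern.DOTALL`, when both behaviours are needed.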