Wondering about techniques for terminating long running regular expression matches (java matcher.find() method). Maybe subclassing Matcher and adding some logic to terminate after x number of iterations?
Basically I'm generating regular expressions using a genetic algorithm, so I don't have a lot of control over them. Then I test each one against some text to see if they match a certain target area of the text.
So since I'm sort of randomly generating these regular expressions, I get some crazy stuff going on, and it eats a ton of cpu and some find() calls take a while to terminate. I'd rather just kill them after a while, but not sure of best way to do that.
So if anyone has ideas, please let me know.
There is a solution here which would solve your problem. (That question is the same problem yours is.)
Essentially, its a CharSequence that can notice thread interrupts.
The code from that answer:
/**
* CharSequence that noticed thread interrupts -- as might be necessary
* to recover from a loose regex on unexpected challenging input.
*
* @author gojomo
*/
public class InterruptibleCharSequence implements CharSequence {
CharSequence inner;
// public long counter = 0;
public InterruptibleCharSequence(CharSequence inner) {
super();
this.inner = inner;
}
public char charAt(int index) {
if (Thread.interrupted()) { // clears flag if set
throw new RuntimeException(new InterruptedException());
}
// counter++;
return inner.charAt(index);
}
public int length() {
return inner.length();
}
public CharSequence subSequence(int start, int end) {
return new InterruptibleCharSequence(inner.subSequence(start, end));
}
@Override
public String toString() {
return inner.toString();
}
}
Wrap your string with this and you can interrupt the thread.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With