I am trying to take a logical match criterion like:
(("Foo" OR "Foo Bar" OR FooBar) AND ("test" OR "testA" OR "TestB")) OR TestZ
and apply this as a match against a file in Pig using
result = filter inputfields by text matches (some regex expression here);
The problem is I have no idea how to turn the logical expression above into a regular expression for the matches method.
I have fiddled around with various things and the closest I have come to is something like this:
((?=.*?\bFoo\b | \bFoo Bar\b))(?=.*?\bTestZ\b)
Any ideas? I also need to do this conversion programmatically if possible.
Some examples:
a - The quick brown Foo jumped over the lazy test (This should pass as it contains foo and test)
b - the was something going on in TestZ (This passes also as it contains testZ)
c - the quick brown Foo jumped over the lazy dog (This should fail as it contains Foo but not test,testA or TestB)
Thanks
Since you're using Pig, you don't actually need an involved regular expression. You can combine the boolean operators supplied by Pig with a couple of simple regular expressions, for example:
T = load 'matches.txt' as (str:chararray);
F = filter T by ((str matches '.*(Foo|Foo Bar|FooBar).*' and str matches '.*(test|testA|TestB).*') or str matches '.*TestZ.*');
dump F;
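For comparison, the same idea can be sketched in plain Java: keep one small pattern per OR-group and combine the results with && and ||. This is only an illustration; the class name BooleanFilterSketch and the method passes are made up for this example. It uses Matcher.find(), which searches anywhere in the string, so the surrounding .* that a whole-string matches needs is not required here. Like the Pig filter above, it uses no word boundaries.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: mirrors the filter above in plain Java.
public class BooleanFilterSketch {
    // One small pattern per OR-group of the logical expression.
    private static final Pattern FOO = Pattern.compile("Foo|Foo Bar|FooBar");
    private static final Pattern TEST = Pattern.compile("test|testA|TestB");
    private static final Pattern TESTZ = Pattern.compile("TestZ");

    // (FOO and TEST) or TESTZ, using ordinary boolean operators.
    public static boolean passes(String s) {
        return (FOO.matcher(s).find() && TEST.matcher(s).find())
                || TESTZ.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] data = {
            "The quick brown Foo jumped over the lazy test",
            "the was something going on in TestZ",
            "the quick brown Foo jumped over the lazy dog"
        };
        for (String s : data) {
            System.out.println(passes(s) + " : " + s);
        }
    }
}
```

With the three example lines from the question, this prints true, true and false respectively.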
You can use this regex for the matches method (backslashes are doubled as in a Java string literal):
^((?=.*\\bTestZ\\b)|(?=.*\\b(FooBar|Foo Bar|Foo)\\b)(?=.*\\b(testA|TestB|test)\\b)).*
"Foo" OR "Foo Bar" OR "FooBar" should be written as FooBar|Foo Bar|Foo, not Foo|Foo Bar|FooBar, so that the alternation tries the longer alternatives first rather than stopping at Foo in a string containing FooBar or Foo Bar.
The .* at the end of the regex is there because matches has to match the entire string.
Demo:
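public class Demo {
    public static void main(String[] args) {
        String[] data = { "The quick brown Foo jumped over the lazy test",
                "the was something going on in TestZ",
                "the quick brown Foo jumped over the lazy dog" };
        // Lookaheads: TestZ alone, or one word from each of the two groups.
        String regex = "^((?=.*\\bTestZ\\b)|(?=.*\\b(FooBar|Foo Bar|Foo)\\b)(?=.*\\b(testA|TestB|test)\\b)).*";
        for (String s : data) {
            // String.matches succeeds only if the whole string matches.
            System.out.println(s.matches(regex) + " : " + s);
        }
    }
}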
output:
true : The quick brown Foo jumped over the lazy test
true : the was something going on in TestZ
false : the quick brown Foo jumped over the lazy dog
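The question also asks about doing the conversion programmatically. Below is a minimal sketch of one way to do that, assuming the logical expression is already available as nested AND/OR calls (parsing the textual expression is not covered). The class LookaheadBuilderSketch and its methods term, and, or, full and example are hypothetical names invented for this illustration. Each term becomes a lookahead, AND is concatenation of lookaheads, and OR is an alternation group. It also assumes single-line input, since . does not match line breaks by default.

```java
import java.util.regex.Pattern;

// Hypothetical builder: turns nested AND/OR terms into one lookahead regex.
public class LookaheadBuilderSketch {
    // A single word or phrase: "somewhere ahead, as a whole word".
    public static String term(String word) {
        return "(?=.*\\b" + Pattern.quote(word) + "\\b)";
    }

    // AND: every lookahead must hold at the same position, so just concatenate.
    public static String and(String... parts) {
        return String.join("", parts);
    }

    // OR: any one alternative may hold; grouped so it composes safely.
    public static String or(String... parts) {
        return "(?:" + String.join("|", parts) + ")";
    }

    // Anchor at the start and consume the rest so a whole-string match works.
    public static String full(String expr) {
        return "^" + expr + ".*";
    }

    // The expression from the question, written as nested calls.
    public static String example() {
        return full(or(
                and(or(term("Foo"), term("Foo Bar"), term("FooBar")),
                    or(term("test"), term("testA"), term("TestB"))),
                term("TestZ")));
    }

    public static void main(String[] args) {
        String regex = example();
        String[] data = {
            "The quick brown Foo jumped over the lazy test",
            "the was something going on in TestZ",
            "the quick brown Foo jumped over the lazy dog"
        };
        for (String s : data) {
            System.out.println(s.matches(regex) + " : " + s);
        }
    }
}
```

Because every term is its own lookahead, the order of alternatives no longer matters in this form. Pattern.quote is used so that words containing regex metacharacters are treated literally. How the resulting string has to be escaped inside a Pig script is left open here.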