I have a big text:
"Big piece of text. This sentence includes 'regexp' word. And this sentence doesn't include that word"
I need to find a substring that starts with 'this' and ends with 'word' but doesn't include the word 'regexp'.
In this case, the string "this sentence doesn't include that word" is exactly what I want to receive.
How can I do this with regular expressions?
The negative lookahead (?!n) succeeds at any position that is not immediately followed by the pattern n. It is an assertion rather than a quantifier, so it consumes no characters.
With an ignore case option, the following should work:
\bthis\b(?:(?!\bregexp\b).)*?\bword\b
Example: http://www.rubular.com/r/g6tYcOy8IT
Explanation:
\bthis\b             # match the word 'this'; \b marks a word boundary
(?:                  # start group, repeated zero or more times, as few as possible
  (?!\bregexp\b)     #   fail if 'regexp' can be matched here (negative lookahead)
  .                  #   match any single character
)*?                  # end group
\bword\b             # match 'word'
The \b surrounding each word makes sure that you aren't matching on substrings, like matching the 'this' in 'thistle', or the 'word' in 'wordy'.
This works by checking at each character between your start word and your end word to make sure that the excluded word doesn't occur.
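As a quick check of the pattern above, here is a minimal Python sketch run against the sample text from the question. Python's `re` module is an assumption here (the answer itself links to a Rubular demo), and the variable names are only illustrative. The plain lazy pattern is included for contrast, to show what the lookahead prevents.

```python
import re

text = ("Big piece of text. This sentence includes 'regexp' word. "
        "And this sentence doesn't include that word")

# Pattern from the answer: before consuming each character, the negative
# lookahead checks that the word 'regexp' does not start at that position.
tempered = re.compile(r"\bthis\b(?:(?!\bregexp\b).)*?\bword\b", re.IGNORECASE)

# For contrast: the same pattern without the lookahead.
plain = re.compile(r"\bthis\b.*?\bword\b", re.IGNORECASE)

match = tempered.search(text)
result = match.group(0) if match else None
naive = plain.search(text).group(0)

print(result)  # this sentence doesn't include that word
print(naive)   # This sentence includes 'regexp' word
```

Note that `.` does not match a newline by default, so for multi-line input the `re.DOTALL` flag would probably be needed as well.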
Use lookahead assertions.
When you want to check if a string does not contain another substring, you can write:
/^(?!.*substring)/
You must also anchor the pattern to the beginning and the end of the line, for 'this' and 'word' respectively:
/^this(?!.*substring).*word$/
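The two patterns above can be tried out with a small Python sketch. The sample strings and variable names are made up for illustration; only the regexes come from the answer.

```python
import re

# Succeeds only when 'substring' appears nowhere after the start of the string.
lacks_substring = re.compile(r"^(?!.*substring)")

# Anchored form: starts with 'this', ends with 'word', never contains 'substring'.
anchored = re.compile(r"^this(?!.*substring).*word$")

samples = [
    "this is the last word",            # passes both checks
    "this substring is the last word",  # contains the excluded text
    "and this is the last word",        # does not start with 'this'
]

results = [(bool(lacks_substring.search(s)), bool(anchored.search(s)))
           for s in samples]
print(results)  # [(True, True), (False, False), (True, False)]
```

Because `.` stops at line breaks by default, these checks only look at the first line of a multi-line string unless a flag such as `re.DOTALL` is added.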
Another problem here is that you don't want to find arbitrary strings; you want to find sentences (if I understand your task correctly).
So the solution looks like this:
perl -e ' local $/; $_=<>; while($_ =~ /(.*?[.])/g) { $s=$1; print $s if $s =~ /^this(?!.*substring).*word[.]$/ };'
Example of usage:
$ cat 1.pl
local $/;
$_=<>;
while($_ =~ /(.*?[.])/g) {
    $s=$1;
    print $s if $s =~ /^\s*this(?!.*regexp).*word[.]/i;
};

$ cat 1.txt
This sentence has the "regexp" word.
This sentence doesn't have the word.
This sentence does have the "regexp" word again.

$ cat 1.txt | perl 1.pl
This sentence doesn't have the word.
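For readers not using Perl, the same sentence-by-sentence idea can be sketched in Python. This is only an approximation of the script above: the function name `matching_sentences` and its parameters are hypothetical, and splitting on periods is a deliberately naive notion of a sentence.

```python
import re

def matching_sentences(text, start="this", end="word", excluded="regexp"):
    """Return sentences that begin with `start`, end with `end` plus a period,
    and do not contain `excluded`. (Hypothetical helper, not from the answer.)"""
    # Naive sentence split: each run of non-period characters plus its period.
    sentences = re.findall(r"[^.]*[.]", text)
    check = re.compile(
        rf"^\s*{re.escape(start)}(?!.*{re.escape(excluded)}).*{re.escape(end)}[.]$",
        re.IGNORECASE | re.DOTALL,
    )
    return [s.strip() for s in sentences if check.search(s)]

sample = ('This sentence has the "regexp" word. '
          "This sentence doesn't have the word. "
          'This sentence does have the "regexp" word again.')

found = matching_sentences(sample)
print(found)  # ["This sentence doesn't have the word."]
```

Splitting first keeps the lookahead simple, since each check only has to scan a single sentence rather than the whole text.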