
Regex to match several strings but not specific ones

I'm using Perl to find lines that match certain criteria, and I'd like to skip any line that contains a specific string. For example, say I'm matching the string Mouse, but I want to skip the line if it also contains X123Y (or XYZ, as in the examples below). Either string can appear anywhere on the line.

Stackoverflow Mouse forum.       <--Match
Stackoverflow -Mouse- forum.     <--Match
Stackoverflow X123Y forum Mouse. <--Should not Match
Stackoverflow XYZ forum Mouse.   <--Should not Match

I hoped this would solve it, since it uses a negative lookahead, but it doesn't do the trick:

(?i)(\WMouse\W|(?!(X123Y|XYZ)).*$)

I'm doing something fundamentally wrong I suppose, but cannot see it now.

Any help?

Asked by Ali, Aug 14 '14 at 19:08



2 Answers

This regex should work for you:

^(?=.*?Mouse)(?:(?!(?:X123Y|XYZ)).)*$
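
To see how this pattern behaves, here is a minimal sketch run against the four sample lines from the question. It is written in Python only for illustration (the pattern syntax is the same in Perl), with the excluded strings spelled as in the question (X123Y, XYZ) and the question's case-insensitive flag added. The helper name `wanted` is made up for this example. The lookahead `(?=.*?Mouse)` requires Mouse somewhere on the line, and `(?:(?!X123Y|XYZ).)*$` consumes the line one character at a time, refusing to step onto any position where an excluded string starts.

```python
import re

# Lookahead requires "Mouse"; the tempered dot refuses to cross X123Y or XYZ.
PATTERN = re.compile(r"^(?=.*?Mouse)(?:(?!X123Y|XYZ).)*$", re.IGNORECASE)

# Sample lines taken from the question.
lines = [
    "Stackoverflow Mouse forum.",
    "Stackoverflow -Mouse- forum.",
    "Stackoverflow X123Y forum Mouse.",
    "Stackoverflow XYZ forum Mouse.",
]


def wanted(line):
    """Return True if the line contains Mouse and none of the excluded strings."""
    return PATTERN.search(line) is not None


for line in lines:
    print(wanted(line), line)
```

Only the first two lines should be reported as wanted.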


Answered by anubhava, Oct 16 '22 at 23:10


You can use the discard technique: match the patterns you don't want as plain alternatives, and capture only the content you do want.

For example, using this regex:

.*X123Y.*|.*XYZ.*|(.*Mouse.*)

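The unwanted alternatives are tried first and match without capturing anything, so capture group 1 is set only when the rightmost alternative matches. Keep the lines where group 1 is set and discard the others.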


The idea is to use:

discard patt 1 | discard patt 2 | discard patt n | (grab this pattern)
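
As a rough sketch of the idea, the snippet below applies the discard pattern to one line at a time and keeps the line only when capture group 1 took part in the match. It is written in Python purely for illustration (the pattern itself is unchanged and works the same way in Perl); the helper name `keep` and the case-insensitive flag are assumptions of this example, not part of the answer.

```python
import re

# Discard alternatives come first; only the last alternative captures.
PATTERN = re.compile(r".*X123Y.*|.*XYZ.*|(.*Mouse.*)", re.IGNORECASE)


def keep(line):
    """Return True only when the capturing (rightmost) alternative matched."""
    m = PATTERN.match(line)
    # group(1) is None when one of the discard alternatives matched instead.
    return m is not None and m.group(1) is not None


for line in [
    "Stackoverflow Mouse forum.",
    "Stackoverflow X123Y forum Mouse.",
    "Stackoverflow XYZ forum Mouse.",
]:
    print(keep(line), line)
```

The check on group 1 is the important part: the overall match succeeds for the unwanted lines too, so testing only whether the regex matched would keep them.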

Answered by Federico Piazza, Oct 17 '22 at 00:10