Regular expression negative lookahead

In my home directory I have a folder drupal-6.14 that contains the Drupal platform.

From this directory I use the following command:

find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf drupal-6.14.tar.gz

This command creates a gzipped tar archive of the drupal-6.14 folder, excluding every subfolder of drupal-6.14/sites/ except sites/all and sites/default, which are included.
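For readers who want to try the same filtering without a shell pipeline, here is a minimal Python sketch of what the command is described as doing. It is not the asker's method: the function names (list_files, make_archive, demo) and the sample file tree are hypothetical, and the dot in "6.14" is escaped here, which the original pattern does not do.

```python
import os
import re
import tarfile
import tempfile

# Same lookahead logic as the grep -P pattern; the dot in "6.14" is escaped
# so that it only matches a literal dot.
KEEP = re.compile(r'drupal-6\.14/(?!sites(?!/all|/default)).*')


def list_files(base, root):
    """Yield regular files under base/root as paths relative to base ('/' separated)."""
    for dirpath, _dirnames, filenames in os.walk(os.path.join(base, root)):
        for name in filenames:
            full = os.path.join(dirpath, name)
            yield os.path.relpath(full, base).replace(os.sep, '/')


def make_archive(base, root, archive_path):
    """Write the files that pass the KEEP filter into a gzipped tar archive."""
    # search() mimics grep, which looks for the pattern anywhere in the line.
    kept = sorted(p for p in list_files(base, root) if KEEP.search(p))
    with tarfile.open(archive_path, 'w:gz') as tar:
        for rel in kept:
            tar.add(os.path.join(base, rel), arcname=rel)
    return kept


def demo():
    """Build a tiny hypothetical tree in a temp dir, archive it, and report the result."""
    sample = (
        'index.php',
        'sites/all/README.txt',
        'sites/default/settings.php',
        'sites/example.com/settings.php',
    )
    with tempfile.TemporaryDirectory() as base:
        for rel in sample:
            full = os.path.join(base, 'drupal-6.14', *rel.split('/'))
            os.makedirs(os.path.dirname(full), exist_ok=True)
            open(full, 'w').close()
        archive = os.path.join(base, 'drupal-6.14.tar.gz')
        kept = make_archive(base, 'drupal-6.14', archive)
        with tarfile.open(archive) as tar:
            names = sorted(tar.getnames())
    return kept, names


if __name__ == '__main__':
    kept, names = demo()
    print(kept)
```

In this sketch the file under sites/example.com is left out of the archive, while index.php and the files under sites/all and sites/default are kept. Passing each path to tarfile directly also sidesteps the word-splitting that xargs applies to file names containing spaces.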

My question is about the regular expression:

grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*'

The expression works to exclude all the folders I want excluded, but I don't quite understand why.

A common task with regular expressions is to match all strings except those that contain subpattern x; in other words, to negate a subpattern.

I think I understand that the general strategy for these problems is to use negative lookaheads, but I have never understood, to a satisfactory level, how positive and negative lookaheads and lookbehinds work.

Over the years I have read many resources on them: the PHP and Python regex manuals, pages such as http://www.regular-expressions.info/lookaround.html, and so forth, but I have never really had a solid understanding of them.

Could someone explain how this works, and perhaps provide some similar examples?

-- Update One:

Regarding Andomar's response: can a double negative lookahead be expressed more succinctly as a single positive lookahead? That is, is

'drupal-6.14/(?!sites(?!/all|/default)).*'

equivalent to:

'drupal-6.14/(?=sites(?:/all|/default)).*'

?

-- Update Two:

As @andomar and @alan moore point out, a double negative lookahead cannot simply be swapped for a positive lookahead.
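A small Python sketch makes the difference concrete. The positive version requires the text after drupal-6.14/ to start with sites/all or sites/default, so it rejects everything outside sites, whereas the double negative only rejects sites paths that are not all or default. The sample paths are hypothetical, and the third pattern (EITHER) is my own illustration of one way to express the same condition with a positive lookahead inside an alternation; it does not come from the answers.

```python
import re

# Dots in "6.14" are escaped here; the patterns in the question leave them
# unescaped, which still matches because "." matches any character.
DOUBLE_NEGATIVE = re.compile(r'drupal-6\.14/(?!sites(?!/all|/default)).*')
POSITIVE = re.compile(r'drupal-6\.14/(?=sites(?:/all|/default)).*')

# Illustration only: "not followed by sites" OR "followed by sites/all or sites/default".
EITHER = re.compile(r'drupal-6\.14/(?:(?!sites)|(?=sites(?:/all|/default))).*')

# Hypothetical sample paths, shaped like the output of the find command.
PATHS = [
    'drupal-6.14/index.php',
    'drupal-6.14/sites/all/modules/README.txt',
    'drupal-6.14/sites/default/settings.php',
    'drupal-6.14/sites/example.com/settings.php',
]


def compare(paths):
    """Return (path, double_negative_matches, positive_matches, either_matches) per path."""
    return [
        (
            p,
            bool(DOUBLE_NEGATIVE.match(p)),
            bool(POSITIVE.match(p)),
            bool(EITHER.match(p)),
        )
        for p in paths
    ]


if __name__ == '__main__':
    for row in compare(PATHS):
        print(row)
```

The two patterns from the question disagree on drupal-6.14/index.php: the double negative keeps it and the positive lookahead drops it. They agree on the three paths under sites.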

asked Nov 17 '09 14:11

themesandmodules

1 Answer

A negative lookahead says: at this position, the following regex must not match.

Let's take a simplified example:

a(?!b(?!c))

a      Match: (?!b) succeeds
ac     Match: (?!b) succeeds
ab     No match: (?!b(?!c)) fails
abe    No match: (?!b(?!c)) fails
abc    Match: (?!b(?!c)) succeeds

The last example is a double negation: it allows b followed by c. The nested negative lookahead becomes a positive lookahead: the c should be present.

In each example, only the a is matched. The lookahead is only a condition, and does not add to the matched text.
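The five cases can be checked with a short Python sketch. The helper name check is hypothetical; it returns the matched text so that it is also visible that the lookahead consumes nothing.

```python
import re

PATTERN = re.compile(r'a(?!b(?!c))')


def check(text):
    """Return the text matched at the start of the string, or None if there is no match."""
    m = PATTERN.match(text)
    return m.group(0) if m else None


if __name__ == '__main__':
    for s in ('a', 'ac', 'ab', 'abe', 'abc'):
        print(s, '->', check(s))
```

For a, ac and abc the function returns 'a'; for ab and abe it returns None. In the matching cases the result is only the single character a, even though the lookahead inspected the characters after it.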

answered Sep 17 '22 15:09

Andomar

Tags: regex, lookahead, negative-lookahead
