
Match string, but only if not preceded by other string

Tags: regex, r

Say I have a vector of strings:

v = c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")

How do I write a regex that matches all strings resembling the phrase "low"?

grep("lo", v, ignore.case=T) # 1 2 3 4 5 6

This matches the first string too, which I don't want.

How do I match lo only if it is not preceded by the letter c?

Daniel Krizian asked Jul 31 '14 10:07



2 Answers

Negative Lookbehind (PCRE in R)

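With perl=TRUE, R's regex functions use the PCRE engine, which supports lookbehind (the default engine does not). Do this:

grep("(?<!c)lo", v, perl=TRUE, value=TRUE, ignore.case=TRUE)

The negative lookbehind (?<!c) asserts that what precedes the current position is not a c. Because ignore.case=TRUE applies to the whole pattern, an uppercase C is rejected as well.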
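To see the effect on the vector from the question, the lookbehind pattern can be compared against the plain one. This is a minimal sketch that assumes `v` is the vector defined in the question; the names `all_hits` and `kept` are illustrative only.

```r
# Sample vector from the question
v <- c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")

# Plain pattern: every element matches, including "SPX.Close"
all_hits <- grep("lo", v, ignore.case = TRUE, value = TRUE)
print(all_hits)

# Lookbehind pattern: "lo" directly preceded by "c" or "C" is rejected
# (ignore.case = TRUE makes the lookbehind case-insensitive too)
kept <- grep("(?<!c)lo", v, perl = TRUE, ignore.case = TRUE, value = TRUE)
print(kept)
# [1] "AAPL.Low" "Lo"       "LowPrice" "PriceLow" "low"
```

Leaving out perl = TRUE makes the call fail, because lookbehind is not valid syntax for R's default regex engine.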

Option 2: Check for Capital Letter, Turn On Case-Insensitivity Inline

Given your input, a more general option would be to assert that lo is not preceded by a capital letter:

grep("(?<![A-Z])(?i)lo", v, perl=TRUE, value=TRUE)

For this option, we use the inline modifier (?i) to turn on case-insensitivity, but only after we have checked that no capital letters precede our position.
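The position of (?i) matters here. The sketch below, which assumes `v` from the question and uses illustrative variable names, contrasts the pattern above with a variant where (?i) comes first, so that [A-Z] is matched case-insensitively and rejects any preceding letter.

```r
# Sample vector from the question
v <- c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")

# (?i) after the lookbehind: only a preceding CAPITAL letter blocks the match
late_flag <- grep("(?<![A-Z])(?i)lo", v, perl = TRUE, value = TRUE)
print(late_flag)
# [1] "AAPL.Low" "Lo"       "LowPrice" "PriceLow" "low"

# (?i) before the lookbehind: [A-Z] now also matches a-z,
# so "PriceLow" is rejected because "Lo" follows the letter "e"
early_flag <- grep("(?i)(?<![A-Z])lo", v, perl = TRUE, value = TRUE)
print(early_flag)
# [1] "AAPL.Low" "Lo"       "LowPrice" "low"

# Caveat: a lowercase "c" does not block the match in the first form
lower_c <- grepl("(?<![A-Z])(?i)lo", "close", perl = TRUE)
print(lower_c)
# [1] TRUE
```

If lowercase input such as "close" must also be excluded, the first option with (?<!c) and ignore.case=TRUE is probably the safer choice.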

zx81 answered Sep 29 '22 04:09


You can use a negative lookbehind:

grep("(?<!C)lo", v, ignore.case=TRUE, perl=TRUE)

That will make sure that lo isn't preceded by C (or by c, since the match is case-insensitive).
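Without value=TRUE, grep returns the positions of the matching elements rather than the strings. A small sketch, assuming `v` from the question (`pattern`, `idx` and `keep` are hypothetical names), of using those positions or a logical mask from grepl to subset the vector:

```r
# Sample vector from the question
v <- c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")
pattern <- "(?<!C)lo"

# Integer positions of the matching elements
idx <- grep(pattern, v, ignore.case = TRUE, perl = TRUE)
print(idx)
# [1] 2 3 4 5 6

# Subset by position, or by a logical mask from grepl()
print(v[idx])
keep <- grepl(pattern, v, ignore.case = TRUE, perl = TRUE)
print(v[keep])
```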

Mowday answered Sep 29 '22 04:09