Say I have a vector of strings:
v = c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")
How do I write a regex that matches all strings resembling the phrase "low"?
grep("lo", v, ignore.case=T) # 1 2 3 4 5 6
This matches the first string too, which I don't want.
How do I match lo only if it is not preceded by the letter c?
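Option 1: Negative Lookbehind
With perl=TRUE, R uses the PCRE engine, which supports lookbehind (the default engine does not). Do this, where subject is your vector v: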
grep("(?<!c)lo", subject, perl=TRUE, value=TRUE, ignore.case=TRUE);
The negative lookbehind (?<!c) asserts that what precedes the current position is not a c. Because ignore.case=TRUE applies to the whole pattern, an uppercase C is rejected as well.
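As a quick check, here is a minimal sketch that runs the lookbehind against the vector v from the question and compares it with the plain search. The variable names plain and filtered are only illustrative.

```r
# Vector from the question
v <- c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")

# Plain case-insensitive search: also hits "SPX.Close" (the "lo" in "Close")
plain <- grep("lo", v, ignore.case = TRUE, value = TRUE)

# Negative lookbehind: skip any "lo" that directly follows a "c" or "C"
# (ignore.case = TRUE makes the lookbehind case-insensitive too)
filtered <- grep("(?<!c)lo", v, perl = TRUE, ignore.case = TRUE, value = TRUE)

print(plain)
print(filtered)
```

Here filtered keeps every element except "SPX.Close", while plain returns all six.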
Option 2: Check for Capital Letter, Turn On Case-Insensitivity Inline
Given your input, a more general option would be to assert that lo is not preceded by a capital letter:
grep("(?<![A-Z])(?i)lo", subject, perl=TRUE, value=TRUE);
For this option, we use the inline modifier (?i) to turn on case-insensitivity, but only after we have checked that no capital letters precede our position.
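To see why the position of (?i) matters, the sketch below compares the pattern above with a variant that puts (?i) first. It reuses the vector v from the question; late and early are illustrative names, and the early variant is shown only for contrast.

```r
# Vector from the question
v <- c("SPX.Close", "AAPL.Low", "Lo", "LowPrice", "PriceLow", "low")

# (?i) after the lookbehind: [A-Z] is still case-sensitive, so only a
# capital letter directly before "lo" blocks the match
late <- grep("(?<![A-Z])(?i)lo", v, perl = TRUE, value = TRUE)

# (?i) before the lookbehind: [A-Z] now also matches lowercase letters,
# so any letter directly before "lo" blocks the match
early <- grep("(?i)(?<![A-Z])lo", v, perl = TRUE, value = TRUE)

print(late)
print(early)
```

With the modifier placed late, "PriceLow" still matches because the e before Lo is lowercase. With the modifier placed early, "PriceLow" is dropped along with "SPX.Close". Note that the late form would still match a lo preceded by a lowercase c, which does not occur in this input.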
You can use a negative lookbehind:
grep("(?<!C)lo", v, ignore.case=TRUE, perl=TRUE) # 2 3 4 5 6
That will make sure that lo isn't preceded by C (or c, since the match is case-insensitive).