I want to match a regular expression special character, \^$.?*|+()[{
. I tried:
x <- "a[b" grepl("[", x) ## Error: invalid regular expression '[', reason 'Missing ']''
(Equivalently stringr::str_detect(x, "[")
or stringi::stri_detect_regex(x, "[")
.)
Doubling the value to escape it doesn't work:
grepl("[[", x) ## Error: invalid regular expression '[[', reason 'Missing ']''
Neither does using a backslash:
grepl("\[", x) ## Error: '\[' is an unrecognized escape in character string starting ""\["
How do I match special characters?
Some special cases of this in questions that are old and well written enough for it to be cheeky to close as duplicates of this:
Escaped Periods In R Regular Expressions
How to escape a question mark in R?
escaping pipe ("|") in a regex
(?:...) A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
If you are having a string with special characters and want's to remove/replace them then you can use regex for that. Use this code: Regex. Replace(your String, @"[^0-9a-zA-Z]+", "")
R treats backslashes as escape values for character constants. (... and so do regular expressions. Hence the need for two backslashes when supplying a character argument for a pattern. The first one isn't actually a character, but rather it makes the second one into a character.) You can see how they are processed using cat
.
y <- "double quote: \", tab: \t, newline: \n, unicode point: \u20AC" print(y) ## [1] "double quote: \", tab: \t, newline: \n, unicode point: €" cat(y) ## double quote: ", tab: , newline: ## , unicode point: €
Further reading: Escaping a backslash with a backslash in R produces 2 backslashes in a string, not 1
To use special characters in a regular expression the simplest method is usually to escape them with a backslash, but as noted above, the backslash itself needs to be escaped.
grepl("\\[", "a[b") ## [1] TRUE
To match backslashes, you need to double escape, resulting in four backslashes.
grepl("\\\\", c("a\\b", "a\nb")) ## [1] TRUE FALSE
The rebus
package contains constants for each of the special characters to save you mistyping slashes.
library(rebus) OPEN_BRACKET ## [1] "\\[" BACKSLASH ## [1] "\\\\"
For more examples see:
?SpecialCharacters
Your problem can be solved this way:
library(rebus) grepl(OPEN_BRACKET, "a[b")
You can also wrap the special characters in square brackets to form a character class.
grepl("[?]", "a?b") ## [1] TRUE
Two of the special characters have special meaning inside character classes: \
and ^
.
Backslash still needs to be escaped even if it is inside a character class.
grepl("[\\\\]", c("a\\b", "a\nb")) ## [1] TRUE FALSE
Caret only needs to be escaped if it is directly after the opening square bracket.
grepl("[ ^]", "a^b") # matches spaces as well. ## [1] TRUE grepl("[\\^]", "a^b") ## [1] TRUE
rebus
also lets you form a character class.
char_class("?") ## <regex> [?]
If you want to match all punctuation, you can use the [:punct:]
character class.
grepl("[[:punct:]]", c("//", "[", "(", "{", "?", "^", "$")) ## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
stringi
maps this to the Unicode General Category for punctuation, so its behaviour is slightly different.
stri_detect_regex(c("//", "[", "(", "{", "?", "^", "$"), "[[:punct:]]") ## [1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE
You can also use the cross-platform syntax for accessing a UGC.
stri_detect_regex(c("//", "[", "(", "{", "?", "^", "$"), "\\p{P}") ## [1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE
Placing characters between \\Q
and \\E
makes the regular expression engine treat them literally rather than as regular expressions.
grepl("\\Q.\\E", "a.b") ## [1] TRUE
rebus
lets you write literal blocks of regular expressions.
literal(".") ## <regex> \Q.\E
Regular expressions are not always the answer. If you want to match a fixed string then you can do, for example:
grepl("[", "a[b", fixed = TRUE) stringr::str_detect("a[b", fixed("[")) stringi::stri_detect_fixed("a[b", "[")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With