What is the difference between \\s|* and \\s|[*] in a regular expression in R?
> gsub('\\s|*','','Aug 2013*')
[1] "Aug2013*"
> gsub('\\s|[*]','','Aug 2013*')
[1] "Aug2013"
What is the function of [ ] here?
The first expression is invalid as written, because * is a special character (a quantifier) and here it has nothing before it to repeat. If you want to use sub or gsub this way with special characters, you can set the fixed = TRUE parameter.
This takes the string representing the pattern being searched for as it is and ignores any special characters.
See Pattern Matching and Replacement in the R documentation.
x <- 'Aug 2013****'
gsub('*', '', x, fixed=TRUE)
#[1] "Aug 2013"
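To illustrate what fixed = TRUE changes, here is a minimal sketch using a different metacharacter. The variable names are made up for the example. It also shows the trade-off: with fixed = TRUE, regex escapes such as \\s stop working, so it cannot be combined with the whitespace pattern from the question.

```r
# With fixed = TRUE every character in the pattern is taken literally.
dots <- 'a.b.c'
literal_dot <- gsub('.', '', dots, fixed = TRUE)  # removes only the dots
regex_dot   <- gsub('.', '', dots)                # '.' matches any character

# Regex escapes are literal too: the input contains no backslash-s
# sequence, so nothing is replaced.
unchanged <- gsub('\\s', '', 'Aug 2013', fixed = TRUE)

print(literal_dot)  # "abc"
print(regex_dot)    # ""
print(unchanged)    # "Aug 2013"
```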
Your second expression just uses a character class [] for * to avoid escaping it, which is the same as:
x <- 'Aug 2013*'
gsub('\\s|\\*', '', x)
#[1] "Aug2013"
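As a quick check that the two spellings are interchangeable, the sketch below compares them on the same input. The third variant, which folds the whitespace class and the star into a single bracket expression, is an extra option not mentioned above. The variable names are only illustrative.

```r
x <- 'Aug 2013*'

with_class  <- gsub('\\s|[*]', '', x)       # star inside a character class
with_escape <- gsub('\\s|\\*', '', x)       # star escaped with a backslash
one_class   <- gsub('[[:space:]*]', '', x)  # whitespace and star in one class

print(with_class)                          # "Aug2013"
print(identical(with_class, with_escape))  # TRUE
print(identical(with_class, one_class))    # TRUE
```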
As for the explanation of your first expression: \\s|*
\s whitespace (\n, \r, \t, \f, and " ")
| OR
* a quantifier with nothing before it to repeat, so it does not match a literal '*'
And the second expression: \\s|[*]
\s whitespace (\n, \r, \t, \f, and " ")
| OR
[*] any character of: '*'
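To see which pieces of the input the second pattern actually picks out, one option is to extract the matches rather than replace them. This is only a sketch, and the name hits is arbitrary.

```r
x <- 'Aug 2013*'

# gregexpr finds every match; regmatches pulls out the matched text.
hits <- regmatches(x, gregexpr('\\s|[*]', x))[[1]]

print(hits)  # " " "*"
```

Each alternative contributes one match: the space comes from \s and the star from [*].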