Given:
test <- data.frame(Speed=c("2 Mbps", "10 Mbps"))
Why does this regex match both values:
grepl("[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE)
but this one fails to match them:
grepl("^[0-9]*Mbps$", test[,"Speed"], ignore.case=TRUE)
The ^ (beginning of line/string) anchor is causing the issue, but why?
The ^[0-9]*Mbps$ regex requires the string to start with zero or more digits, followed immediately by Mbps at the end. Since there is a space between the number and Mbps in each value, there is no match. To match the strings, allow optional whitespace: ^[0-9]*\\s*Mbps$.
test <- data.frame(Speed=c("2 Mbps", "10 Mbps"))
grepl("^[0-9]*\\s*Mbps$", test[,"Speed"], ignore.case=TRUE)
Output of the demo program:
[1] TRUE TRUE
[0-9]*Mbps$ matches just Mbps at the end of each item because [0-9]* can match an empty string due to the * quantifier.
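To see which part of each string the unanchored pattern actually consumes, one option is to extract the match with base R's regexpr() and regmatches(). This is only a small sketch; the vector name speed is a stand-in for test[,"Speed"].

```r
# Stand-in for test[,"Speed"] from the question
speed <- c("2 Mbps", "10 Mbps")

# regexpr() returns the start position of the first match in each element
m <- regexpr("[0-9]*Mbps$", speed, ignore.case = TRUE)

# regmatches() pulls out the matched substrings
matched <- regmatches(speed, m)

print(matched)        # the text that was matched in each element
print(as.integer(m))  # the position where each match starts
```

The match starts at the M in each string, not at the digits, which is why adding ^ makes the pattern fail.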
Because a space is missing in the regex; "^[0-9]* Mbps$" or "^[0-9]*\\s*Mbps$" would match the inputs.
"[0-9]*Mbps$" matches (not necessarily from the beginning of the string) "zero or more digit characters, followed by 'Mbps' and end of string". For these inputs it matches zero digits, because the character before 'Mbps' is a space.
"^[0-9]*Mbps$" doesn't match the inputs, because it requires the input to start with zero or more digits, then 'Mbps' (no space!), then end of string.
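The patterns discussed here can be compared side by side with a quick sketch. The extra input "Mbps" and the last pattern, which uses + to require at least one digit, are not from the question; they are added on the assumption that a speed without a number should be rejected.

```r
# The two values from the question plus a hypothetical digit-less input
speed <- c("2 Mbps", "10 Mbps", "Mbps")

patterns <- c(
  unanchored      = "[0-9]*Mbps$",        # can match "Mbps" alone, anywhere before the end
  anchored        = "^[0-9]*Mbps$",       # no room for the space
  with_space      = "^[0-9]*\\s*Mbps$",   # optional whitespace between digits and unit
  digits_required = "^[0-9]+\\s*Mbps$"    # assumption: at least one digit is wanted
)

# One column per pattern, one row per input
results <- sapply(patterns, grepl, x = speed, ignore.case = TRUE)
rownames(results) <- speed
print(results)
```

Only the last pattern rejects the bare "Mbps" while still accepting both original values.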