stringr::str_starts returns TRUE when it shouldn't

Tags: r, stringr

I am trying to detect whether a string starts with either of the provided strings (separated by |).

name = "KKSWAP"
stringr::str_starts(name, "RTT|SWAP")

returns TRUE, but

str_starts(name, "SWAP|RTT")

returns FALSE

This behaviour seems wrong, as KKSWAP doesn't start with "RTT" or "SWAP". I would expect this to be false in both above cases.

Brandon asked Nov 02 '25 12:11


2 Answers

The reason can be found in the source code of the function:

function (string, pattern, negate = FALSE) 
{
    switch(type(pattern), empty = , bound = stop("boundary() patterns are not supported."), 
        fixed = stri_startswith_fixed(string, pattern, negate = negate, 
            opts_fixed = opts(pattern)), coll = stri_startswith_coll(string, 
            pattern, negate = negate, opts_collator = opts(pattern)), 
        regex = {
            pattern2 <- paste0("^", pattern)
            attributes(pattern2) <- attributes(pattern)
            str_detect(string, pattern2, negate)
        })
}

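You can see that it pastes '^' in front of the pattern, so in your example it looks for '^RTT|SWAP'. Alternation (|) has the lowest precedence in a regex, so the anchor applies only to 'RTT', and the unanchored 'SWAP' is found anywhere in the string. With the order reversed, the pattern becomes '^SWAP|RTT', which matches neither at the start ('SWAP') nor anywhere ('RTT'), hence FALSE.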
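The precedence issue can be reproduced without stringr. A minimal sketch using base R's grepl(), with the patterns written out by hand the way the source above builds them; the grouped pattern is an addition for comparison and is not taken from the function:

```r
name <- "KKSWAP"

# What the pasted "^" produces: the anchor binds to "RTT" only,
# so the pattern reads as (^RTT) OR (SWAP anywhere).
grepl("^RTT|SWAP", name)
# [1] TRUE

# Reversed order: (^SWAP) OR (RTT anywhere); neither matches.
grepl("^SWAP|RTT", name)
# [1] FALSE

# Grouping the alternatives makes the anchor apply to both of them.
grepl("^(RTT|SWAP)", name)
# [1] FALSE
grepl("^(RTT|SWAP)", "SWAPKK")
# [1] TRUE
```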

If you want to test more than one pattern, you should use a vector:

name <- "KKSWAP"
stringr::str_starts(name, c("RTT","SWAP"))
# [1] FALSE FALSE

If you want just one answer, you can combine it with any():

name <- "KKSWAP"
any(stringr::str_starts(name, c("RTT","SWAP")))
# [1] FALSE

The advantage of stringr::str_starts() is the vectorisation of the pattern argument, but if you don't need it, grepl('^RTT|^SWAP', name), as suggested by TTS, is a good base R alternative.
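If the prefixes are already stored in a vector, the anchored pattern for grepl() can be built rather than typed out. A small sketch; starts_with_any() is a hypothetical helper name, and it assumes the prefixes contain no regex metacharacters, since they are pasted into the pattern unescaped:

```r
# Hypothetical helper: TRUE where x starts with any of the prefixes.
# Assumes the prefixes are plain text without regex metacharacters.
starts_with_any <- function(x, prefixes) {
  # Build e.g. "^(RTT|SWAP)" so the anchor covers every alternative.
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  grepl(pattern, x)
}

starts_with_any(c("KKSWAP", "SWAPKK", "RTT1"), c("RTT", "SWAP"))
# [1] FALSE  TRUE  TRUE
```

Grouping the alternatives once is equivalent to repeating the anchor in front of each one, as in '^RTT|^SWAP'.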

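Alternatively, the base function startsWith(), suggested by jpsmith, is also vectorized over its prefix argument. Note that it compares the prefix literally rather than as a regex, so the second call below returns FALSE because name does not begin with the literal text "RTT|SWAP", not because | acts as alternation: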

startsWith(name, c("RTT","SWAP"))
# [1] FALSE FALSE

startsWith(name, "RTT|SWAP")
# [1] FALSE
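
To check how startsWith() treats |, here is a quick sketch. As far as I know it compares the prefix literally and does not interpret regular expressions, so | is an ordinary character there; the test strings other than name are made up purely for illustration:

```r
name <- "KKSWAP"

# "SWAPKK" starts with "SWAP", but not with the literal text "RTT|SWAP".
startsWith("SWAPKK", "RTT|SWAP")
# [1] FALSE

# A string that really begins with the characters R, T, T, |, S, W, A, P.
startsWith("RTT|SWAP_x", "RTT|SWAP")
# [1] TRUE

# For "starts with any of these", wrap the vectorised call in any().
any(startsWith("SWAPKK", c("RTT", "SWAP")))
# [1] TRUE
any(startsWith(name, c("RTT", "SWAP")))
# [1] FALSE
```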

Salix answered Nov 05 '25 03:11


I'm not familiar with the stringr version, but the base R version startsWith returns your desired result. If you don't have to use stringr, this may be a solution:

startsWith(name, "RTT|SWAP")
startsWith(name, "SWAP|RTT")
startsWith(name, "KK")

# > startsWith(name, "RTT|SWAP")
# [1] FALSE
# > startsWith(name, "SWAP|RTT")
# [1] FALSE
# > startsWith(name, "KK")
# [1] TRUE

jpsmith answered Nov 05 '25 01:11


