I'm using the gsub function in R to return occurrences of my pattern (reference numbers) on a list of text.  This works great unless no match is found, in which case I get the entire string back, instead of an empty string.  Consider the example:
data <- list("a sentence with citation (Ref. 12)",
             "another sentence without reference")
sapply(data, function(x) gsub(".*(Ref. (\\d+)).*", "\\1", x))
Returns:
[1] "Ref. 12"                            "another sentence without reference"
But I'd like to get
[1] "Ref. 12"                            ""
Thanks!
I'd probably go a different route, since the sapply doesn't seem necessary to me as these functions are vectorized already:
fun <- function(x){
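    ind <- grepl(".*(Ref. (\\d+)).*", x)  # TRUE where the pattern matches
    x <- gsub(".*(Ref. (\\d+)).*", "\\1", x)
    x[!ind] <- ""  # logical index still works when nothing matches; x[-ind] with an empty ind would blank nothing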
    x
}
fun(data)
According to the documentation, this is a feature of gsub: if there are no matches to the supplied pattern, it returns the input string unchanged.
Here, I use the function grepl first to return a logical vector of the presence/absence of the pattern in the given string:
ifelse(grepl(".*(Ref. (\\d+)).*", data), 
      gsub(".*(Ref. (\\d+)).*", "\\1", data), 
      "")
Embedding this in a function:
mygsub <- function(x){
     ans <- ifelse(grepl(".*(Ref. (\\d+)).*", x), 
              gsub(".*(Ref. (\\d+)).*", "\\1", x), 
              "")
     return(ans)
}
mygsub(data)
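As an alternative to substituting and then blanking the non-matches, the match can be extracted directly with regexpr and regmatches from base R. This is only a sketch: the helper name extract_ref and the simplified pattern "Ref\\. \\d+" (with the dot escaped) are my own assumptions, not from the question, and it assumes the input contains no NA values.

```r
# Hypothetical helper: returns the first "Ref. <number>" found in each element,
# or "" where the pattern does not occur.
extract_ref <- function(x, pattern = "Ref\\. \\d+") {
  x <- as.character(x)               # accept a list of strings or a character vector
  out <- character(length(x))        # pre-filled with ""
  m <- regexpr(pattern, x)           # match position, -1 where there is no match
  out[m != -1] <- regmatches(x, m)   # regmatches returns only the matched pieces
  out
}

data <- list("a sentence with citation (Ref. 12)",
             "another sentence without reference")
extract_ref(data)
```

Because the result vector starts out filled with empty strings, elements without a match need no special handling, and the pattern only has to be written once.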