I'm using the gsub function in R to extract occurrences of my pattern (reference numbers) from a list of strings. This works well unless no match is found, in which case I get the entire string back instead of an empty string. Consider the example:
data <- list("a sentence with citation (Ref. 12)",
"another sentence without reference")
sapply(data, function(x) gsub(".*(Ref. (\\d+)).*", "\\1", x))
Returns:
[1] "Ref. 12" "another sentence without reference"
But I'd like to get
[1] "Ref. 12" ""
Thanks!
I'd probably go a different route, since the sapply doesn't seem necessary to me: these functions are already vectorized.
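fun <- function(x){
  # grepl gives a logical mask, which stays correct even when nothing matches
  # (with grep, an empty index makes x[-ind] select nothing, so nothing is blanked)
  ind <- grepl(".*(Ref. (\\d+)).*", x)
  x <- gsub(".*(Ref. (\\d+)).*", "\\1", x)
  x[!ind] <- ""
  x
}
fun(data)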
According to the documentation, this is a feature of gsub: if there are no matches to the supplied pattern, it returns the input string unchanged.
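A quick way to see this behaviour, as a minimal sketch (the variable names are only for illustration, and the strings are the ones from the question):

```r
pattern <- ".*(Ref. (\\d+)).*"
inputs <- c("a sentence with citation (Ref. 12)",
            "another sentence without reference")

# gsub only rewrites elements where the pattern matches;
# the second element has no match and comes back unchanged
out <- gsub(pattern, "\\1", inputs)
print(out)
```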
Here, I use the function grepl first to return a logical vector indicating the presence or absence of the pattern in each string:
ifelse(grepl(".*(Ref. (\\d+)).*", data),
       gsub(".*(Ref. (\\d+)).*", "\\1", data),
       "")
Embedding this in a function:
mygsub <- function(x){
  ans <- ifelse(grepl(".*(Ref. (\\d+)).*", x),
                gsub(".*(Ref. (\\d+)).*", "\\1", x),
                "")
  return(ans)
}
mygsub(data)
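If the pattern might change, the same grepl/gsub idea can take it as an argument. The following is only a sketch: the name extract_or_empty and its pattern parameter are hypothetical and do not come from the answers above. It also flattens list input with unlist, on the assumption that each element is a single string.

```r
# Hypothetical helper: blank out elements that do not match `pattern`,
# otherwise keep the first capture group.
extract_or_empty <- function(x, pattern) {
  x <- unlist(x)  # assumes each list element is a single string
  ifelse(grepl(pattern, x), gsub(pattern, "\\1", x), "")
}

data <- list("a sentence with citation (Ref. 12)",
             "another sentence without reference")
res <- extract_or_empty(data, ".*(Ref. (\\d+)).*")
print(res)
```

Because the mask comes from grepl, input in which no element matches still yields empty strings rather than the original text.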