Given input vector (iv)
iv <- c(.10,.15,"hello","."," . ",". ")
I'm using:
out <- sub(regexp,NA,iv)
I want an output vector like this:
.10,.15,"hello",NA,NA,NA
but I don't know how to form the regexp to get what I need. Thanks in advance.
What you're looking for is negative lookahead in regular expressions. You want to check for a dot (.) that is not followed by a digit (0-9) and replace the elements containing one with NA. If this logic is what you want, then it can be implemented in one line as follows:
gsub("\\.(?![0-9])", NA, iv, perl=TRUE)
# [1] "0.1" "0.15" "hello" NA NA NA
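Logic: search for a dot that is not followed by a digit. Because the replacement is NA, every element containing such a dot becomes NA as a whole, rather than having only the dot substituted.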
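One caveat with the lookahead: it matches any dot that is not followed by a digit, so an element such as "end." would also become NA. If only elements consisting of a lone dot with optional surrounding whitespace should be replaced, an anchored pattern is one alternative. The following is a minimal sketch, not part of the original answer; the extra element "end." and the names lookahead and anchored are hypothetical, added only for illustration.

```r
# iv as in the question, plus a hypothetical element "end." to show the difference
iv <- c(.10, .15, "hello", ".", " . ", ". ", "end.")

# Lookahead version from above: any dot not followed by a digit
# turns the whole element into NA, including "end."
lookahead <- gsub("\\.(?![0-9])", NA, iv, perl = TRUE)

# Anchored alternative: only elements that are a single dot,
# optionally padded with whitespace, are replaced
anchored <- iv
anchored[grepl("^[[:space:]]*\\.[[:space:]]*$", iv)] <- NA

print(lookahead)  # last element is NA
print(anchored)   # last element is still "end."
```

Which version is appropriate depends on whether strings that merely contain a trailing dot should count as missing.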
If you want to replace the values with NA, then you will want to use one of the assignment operators.
A simple approach:
iv[gsub(" ", "", iv)=="."] <- NA
Quick explanation:
If the strings to replace were all the same (i.e., "."), then you could simply call
iv[iv == "."] <- NA
However, in order to catch all the extra spaces, you can either search for the myriad "." combinations, making sure to exclude .10, .15, etc., or instead remove all the spaces, which leaves you with the simpler situation where you can use ==.
Incidentally, if you want to search for a period in a regex in R, you need to escape the period for the regex (\.) and then escape that backslash for the R string ("\\.").
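A short sketch of the escaping rules, using made-up test strings ("abc" and "a.c" are not from the question, and the variable names are hypothetical):

```r
# An unescaped "." is a regex wildcard and matches any character
wildcard <- grepl(".", "abc")               # TRUE

# "\\." in R source is the regex \. and matches only a literal period
escaped_miss <- grepl("\\.", "abc")         # FALSE
escaped_hit  <- grepl("\\.", "a.c")         # TRUE

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed
fixed_miss <- grepl(".", "abc", fixed = TRUE)   # FALSE

# A bracket expression is another way to write a literal period
bracket_hit <- grepl("[.]", "a.c")          # TRUE
```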
Edit: Note that the line above does not permanently remove the spaces from iv. Take a look at gsub(" ", "", iv)=="." on its own: it returns a logical vector (TRUE/FALSE), which in turn is used to index iv. Other than the NA values, iv remains unchanged.
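To see this step by step, the comparison can be stored in a variable before it is used as an index. This is only an illustrative sketch; is_dot is a hypothetical name that does not appear in the answer.

```r
iv <- c(.10, .15, "hello", ".", " . ", ". ")

# Spaces are stripped only inside the comparison; iv itself is not modified
is_dot <- gsub(" ", "", iv) == "."
print(is_dot)   # FALSE FALSE FALSE TRUE TRUE TRUE

# The logical vector selects the positions to overwrite
iv[is_dot] <- NA
print(iv)       # first three elements untouched, last three are NA
```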
EDIT #2: If you want the changes to be saved to a different vector, you can use the following:
out <- iv
out[gsub(" ", "", iv)=="."] <- NA