
Regex to replace a period (.) when appearing alone or with spaces, but not a number with a decimal point, in R

Tags: regex, r

Given input vector (iv)

iv <- c(.10,.15,"hello","."," . ",". ")

I'm using:

out <- sub(regexp,NA,iv)

I want an output vector like this:

.10,.15,"hello",NA,NA,NA

But I don't know how to form the regexp to get what I need. Thanks in advance.

user1895891 asked Feb 24 '13 08:02


2 Answers

What you're looking for is a negative lookahead in regular expressions. You want to match a . that is not followed by a digit (0-9) and turn every element containing such a match into NA. If this logic is what you want, then it can be implemented in one line as follows:

gsub("\\.(?![0-9])", NA, iv, perl = TRUE)
# [1] "0.1"   "0.15"  "hello" NA      NA      NA

Logic: search for a dot that is not followed by a digit; because the replacement is NA, every element with such a match becomes NA.
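
One caveat with the lookahead: it fires on any period that is not followed by a digit, so an element such as "hello." or "1." would also become NA. If only elements made up of a lone period (with optional surrounding whitespace) should be replaced, an anchored pattern is one possible alternative. A minimal sketch, assuming the iv from the question plus two extra elements added purely for illustration; the names `loose`, `pattern` and `strict` are hypothetical and not from the original post:

```r
# Input from the question, plus "hello." and "1." (illustrative additions).
iv <- c(.10, .15, "hello", ".", " . ", ". ", "hello.", "1.")

# Lookahead version: any "." not followed by a digit turns the element into NA.
loose <- gsub("\\.(?![0-9])", NA, iv, perl = TRUE)

# Anchored version: the whole element must be
# optional whitespace, exactly one ".", optional whitespace.
pattern <- "^[[:space:]]*\\.[[:space:]]*$"
strict <- iv
strict[grepl(pattern, iv)] <- NA

print(loose)
print(strict)
```

With this input the two versions should agree on the original six elements and differ only on the two added ones, which the anchored pattern leaves untouched.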

Arun answered Sep 19 '22 14:09


If you want to replace the values with NA, then you will want to use some form of the assignment operator.

A simple approach:

 iv[gsub(" ", "", iv)=="."] <- NA

Quick explanation:

If the strings to replace were all the same (i.e., "."), then you could simply call iv[iv=="."] <- NA.

However, in order to catch all the extra spaces, you can either search for the myriad "." combinations making sure to exclude the .10, .15 etc, or instead you can remove all the spaces and then you have the simpler situation where you can use ==.

Incidentally, if you want to match a literal period with a regex in R, you need to escape the period for the regex engine (\.) and then escape that backslash for the R string parser (\\.).
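
To make the escaping point concrete, here is a small sketch comparing an unescaped period with a few ways of matching it literally. The vector `x` is a made-up example, not part of the question:

```r
# Hypothetical test strings: one with a real period, one without.
x <- c("a.b", "axb")

grepl("a.b", x)                # unescaped "." is a wildcard: both elements match
grepl("a\\.b", x)              # escaped period: only "a.b" matches
grepl("a[.]b", x)              # a character class also makes the period literal
grepl("a.b", x, fixed = TRUE)  # fixed = TRUE skips regex interpretation entirely

nchar("\\.")                   # the R string "\\." holds two characters: \ and .
```

Which form to use is mostly a matter of taste; `fixed = TRUE` avoids escaping altogether when no other regex features are needed.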


Edit: Note that the line above does not permanently remove the spaces from iv. Take a look at gsub(" ", "", iv)==".": it returns a logical vector (TRUE/FALSE), which in turn is used to index iv. Apart from the elements set to NA, iv remains unchanged.
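
The intermediate values can be inspected step by step. A minimal sketch, assuming the iv from the question; the names `stripped` and `mask` are introduced here only for illustration:

```r
iv <- c(.10, .15, "hello", ".", " . ", ". ")

# Temporary copy with the spaces removed; iv itself is not modified here.
stripped <- gsub(" ", "", iv, fixed = TRUE)

# Logical vector: TRUE where only a period remains after stripping.
mask <- stripped == "."

print(stripped)
print(mask)

# Only the flagged elements are overwritten; the rest keep their original text.
iv[mask] <- NA
print(iv)
```

Note that this only strips plain spaces; tabs or other whitespace would need a broader pattern.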

EDIT #2: If you want the changes to be saved to a different vector, you can use the following:

 out <- iv
 out[gsub(" ", "", iv)=="."] <- NA

Ricardo Saporta answered Sep 21 '22 14:09