
Remove numbers from string in R

Tags: r, gsub

I'm trying to remove all the numbers except 67 from a string using the function gsub.

For example:

txt <- "A function 147832 for 67cleaning 67 data 6 7"

Desired output:

txt <- "A function for 67cleaning 67 data"

I've tried txt = gsub("[[:digit:]]", "", txt), but that removes all the numbers, including 67.

velvetrock asked Nov 17 '15 11:11


2 Answers

It's not super elegant, but you can do it in three steps:

 tmp <- gsub("67", "XX", "A function 147832 for 67cleaning 67 data 6 7")
 tmp <- gsub("\\d+", "", tmp)
 tmp <- gsub("XX", "67", tmp)
 tmp
 #"A function  for 67cleaning 67 data  "

First substitute every instance of 67 with a marker (say, XX), then delete all the remaining digits, and finally substitute 67 back in for the marker.
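
The output of the three steps still has a doubled space and trailing spaces where the numbers used to be, so it does not quite match the desired string. A minimal sketch of the same idea with a whitespace cleanup at the end might look like this. The function name remove_numbers_except and the "<KEEP>" marker are made up for illustration, and it assumes the marker never occurs in the input:

```r
# Hypothetical helper: remove every run of digits except the literal `keep`.
# Assumes `marker` contains no digits and does not appear in `x`.
remove_numbers_except <- function(x, keep = "67", marker = "<KEEP>") {
  x <- gsub(keep, marker, x, fixed = TRUE)  # 1. protect the value to keep
  x <- gsub("[[:digit:]]+", "", x)          # 2. drop all remaining digits
  x <- gsub(marker, keep, x, fixed = TRUE)  # 3. restore the protected value
  x <- gsub("[[:space:]]+", " ", x)         # collapse runs of whitespace
  trimws(x)                                 # strip leading/trailing whitespace
}

txt <- "A function 147832 for 67cleaning 67 data 6 7"
remove_numbers_except(txt)
# [1] "A function for 67cleaning 67 data"
```

Like the original three steps, this keeps 67 wherever it appears, including inside a longer number such as 1670, where only the surrounding digits are removed.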

stas g answered Sep 28 '22 09:09


You could split the string into tokens and keep only the ones you want:

# split the string into whitespace-separated tokens
x = unlist(strsplit(txt, split = '\\s+'))
# keep the tokens that match any of the patterns, then join them back together
paste0(x[Reduce(`|`, lapply(c('[A-Za-z]', '67'), grepl, x))], collapse = ' ')

#[1] "A function for 67cleaning 67 data"
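
For readers who find the one-liner dense, here is a sketch of the same token filter written out step by step. The variable names are mine, not part of the answer. It also shows a caveat: tokens are kept or dropped whole, so digits inside a kept token survive.

```r
txt <- "A function 147832 for 67cleaning 67 data 6 7"

tokens     <- unlist(strsplit(txt, split = "\\s+"))  # whitespace-separated tokens
has_letter <- grepl("[A-Za-z]", tokens)              # token contains a letter
has_67     <- grepl("67", tokens, fixed = TRUE)      # token contains the literal 67
result     <- paste(tokens[has_letter | has_67], collapse = " ")
result
# [1] "A function for 67cleaning 67 data"

# Caveat: whole tokens are kept, so other digits can slip through.
tokens2 <- unlist(strsplit("abc123 1670 5", split = "\\s+"))
kept2   <- tokens2[grepl("[A-Za-z]", tokens2) | grepl("67", tokens2, fixed = TRUE)]
kept2
# [1] "abc123" "1670"
```

If the input can contain mixed tokens like these, a substitution-based approach that removes digits inside tokens is probably the safer choice.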
Veerendra Gadekar answered Sep 28 '22 07:09