
Replacing the specific values in columns of data frame using gsub in R

Tags: regex, r, gsub

I have a data.frame as follows:

> df
ID      Value
A_001   DEL-1:7:35-8_1 
A_002   INS-4l:5_74:d
B_023   0 
C_891   2
D_787   8
E_865   DEL-3:65:1s:b

I would like to replace all the values in the column Value that start with DEL or INS with nothing. That is, I would like to get the following output:

> df
ID      Value
A_001   
A_002   
B_023   0 
C_891   2
D_787   8
E_865   

I tried to achieve this with gsub in R using the following code, but it didn't work:

gsub(pattern="(^([DEL|INS]*)",replacement="",df)

Could anyone guide me on how to achieve the desired output?

Thanks in advance.

Carol asked Aug 17 '15 13:08


2 Answers

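Remove the character class, keep the alternation in a plain group, and add .* after the group so that the rest of the string is matched too. Each value needs only one replacement, so sub alone does the job. Apply it to the column rather than to the whole data frame: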

df$Value <- sub("^(DEL|INS).*", "", df$Value)

Inside a character class, each character is treated separately, not as part of a whole string. So [DEL] matches a single character from the given list: D, E, or L.
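
The difference can be sketched on the question's sample data. This is a minimal illustration, not the asker's actual setup: how df was built (data.frame() with stringsAsFactors = FALSE) is an assumption, and the question's pattern is used here in a balanced form, since the original has an unmatched opening parenthesis.

```r
# Sample data rebuilt from the question; the construction itself is an assumption.
df <- data.frame(
  ID = c("A_001", "A_002", "B_023", "C_891", "D_787", "E_865"),
  Value = c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "2", "8", "DEL-3:65:1s:b"),
  stringsAsFactors = FALSE
)

# Character class: [DEL|INS]* matches a run of the single characters
# D, E, L, |, I, N, S, so only the leading letters are stripped.
class_result <- sub("^[DEL|INS]*", "", df$Value)

# Group with alternation: matches the literal prefix DEL or INS,
# and .* consumes the rest of the string.
group_result <- sub("^(DEL|INS).*", "", df$Value)

print(class_result)
print(group_result)
```

With the character class, the first value becomes "-1:7:35-8_1" because matching stops at the first character outside the list. With the group, the whole value is replaced by an empty string, which is the output the question asks for.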

Avinash Raj answered Oct 08 '22 22:10


If the first character is not a digit:

df$Value <- gsub("^\\D.*", "", df$Value)

Or, if every value to delete contains a '-':

df$Value <- gsub(".*-.*", "", df$Value)
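
Both patterns are broader than a match on the DEL/INS prefix, so they may blank values that should be kept. A small sketch of where they differ, using made-up values ("-5", "NA_flag" and "3-4" are hypothetical and do not appear in the question):

```r
# Hypothetical values for illustration only; the last three are not from the question.
x <- c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "-5", "NA_flag", "3-4")

# Blank anything whose first character is not a digit.
non_digit_start <- sub("^\\D.*", "", x)

# Blank anything that contains a hyphen anywhere.
has_hyphen <- sub(".*-.*", "", x)

# Blank only values that start with DEL or INS.
prefix_only <- sub("^(DEL|INS).*", "", x)

print(data.frame(x, non_digit_start, has_hyphen, prefix_only))
```

On the question's own data all three give the same result. The differences only matter if the column can hold values such as negative numbers or other text.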

Shenglin Chen answered Oct 08 '22 23:10