Removing only numbers but keep the words like "3D" in R?

Tags: r, tm

I have been doing text mining in R recently, but I am having trouble with data preprocessing. I have a string like the one below:

"I want to buy 3D printer, but it costs 3000 dollars."

I want to keep the word "3D" but remove "3000", so the result should look like this:

"I want to buy 3D printer, but it costs dollars."

I use corpus <- tm_map(corpus, removeNumbers), but this removes every digit in the text, so the result contains the term "D printer" where it should be "3D printer".

Is there any way to fix this problem? Thanks!

John Chou asked Sep 26 '22 00:09


1 Answer

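We can use gsub. The pattern below matches a number that starts with 3, has at least two digits, and is followed by whitespace, so "3D" is left alone:

str1 <- "I want to buy 3D printer, but it costs 3000 dollars."
gsub('3\\d+\\s', '', str1)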

If this needs to be general, match any run of digits that starts at a word boundary and is followed by whitespace:

gsub('\\b\\d+\\s', '', str1)
#[1] "I want to buy 3D printer, but it costs dollars."
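
The pattern above only matches a number that is followed by whitespace, so a number at the end of the string or directly before punctuation would be kept. One possible variation, as a minimal sketch, requires a word boundary on both sides of the digits instead. The helper name remove_standalone_numbers is hypothetical and not part of tm, and perl = TRUE is an assumption made here so that \\b and \\d follow PCRE rules.

```r
# Hypothetical helper (not part of tm): drop digit runs that form a whole
# token, e.g. "3000", while keeping mixed tokens such as "3D" or "1080p".
remove_standalone_numbers <- function(x) {
  # \\b on both sides means the digits must not touch a letter or "_";
  # \\s* also consumes any whitespace that follows the number.
  gsub("\\b\\d+\\b\\s*", "", x, perl = TRUE)
}

remove_standalone_numbers("I want to buy 3D printer, but it costs 3000 dollars.")
#[1] "I want to buy 3D printer, but it costs dollars."

# With tm loaded, a custom function can be wrapped in content_transformer()
# and used in place of removeNumbers:
# corpus <- tm_map(corpus, content_transformer(remove_standalone_numbers))
```

This sketch treats decimals such as "3.5" as two separate numbers and leaves the "." behind, and a number removed directly before punctuation leaves the preceding space in place, so the pattern may need adjusting for such text.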
akrun answered Nov 15 '22 06:11