I use the gsub function in R to remove unwanted characters from numbers. I want to remove from each string every character that is not a digit, '.', or '-'. My problem is that my regular expression does not remove some non-numeric characters, such as 'd', '+', and '<'.
Below are my regular expression, the gsub call, and its output. How can I change the regular expression to get the desired output?
Current output:
gsub(pattern = '[^(-?(\\d*\\.)?\\d+)]', replacement = '', x = c('1.2<', '>4.5', '3+.2', '-1d0', '2aadddab2','1.3h'))
[1] "1.2<" ">4.5" "3+.2" "-1d0" "2ddd2" "1.3"
Desired output:
[1] "1.2" "4.5" "3.2" "-10" "22" "1.3"
Thank you.
To remove non-alphanumeric characters, use str_replace_all() from the stringr package. The pattern [^[:alnum:]] matches every character that is not a letter or a digit, so replacing those matches with an empty string removes them.
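A minimal sketch of that pattern, using base R's gsub() so that no package is required; the input vector is made up for illustration. With stringr installed, str_replace_all(x, "[^[:alnum:]]", "") should be the equivalent call.

```r
# Illustrative input (not from the original post)
x <- c("a-b_c!", "R 4.3.1")

# Remove every character that is not a letter or a digit
alnum_only <- gsub("[^[:alnum:]]", "", x)
print(alnum_only)
# [1] "abc"  "R431"
```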
To remove all non-numeric characters from a string, the replace() function can be used. replace() searches a string for a specific value or a regular expression and returns a new string in which the matches have been replaced.
How do you remove one or more characters from a string in R? You can use either the base R function gsub() or str_replace() from the stringr package. Note that str_replace() replaces only the first match in each string; use str_replace_all() to remove every occurrence.
To remove a dot followed by a number at the end of a string, use gsub(). It searches each element of the vector for that pattern at the end of the string, and passing an empty string ("") as the replacement removes the match.
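The trailing dot-and-number removal described above could look like the sketch below. The input vector is invented for illustration, and the pattern assumes that "number" means one or more digits.

```r
# Illustrative input (not from the original post)
x <- c("file.1", "sample.23", "v1.2.3", "plain")

# "\\." is a literal dot, "[0-9]+" is one or more digits, "$" anchors at the end
trimmed <- gsub("\\.[0-9]+$", "", x)
print(trimmed)
# [1] "file"   "sample" "v1.2"   "plain"
```

Only the final ".digits" group is removed, so "v1.2.3" becomes "v1.2" and strings without such a suffix are left unchanged.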
Simply use
gsub("[^0-9.-]", "", x)
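As a quick check, here is that pattern applied to the vector from the question. A likely reason the original pattern fails is that everything between [ and ] is a character class, so the parentheses and quantifiers lose their special meaning. The sequence (-? then reads as a range from '(' to '?', which in ASCII includes '+', '<' and '>', and the d of \\d appears to be treated as a literal character. This explanation is an interpretation of the observed output, not something stated in the post.

```r
# Input vector from the question
x <- c('1.2<', '>4.5', '3+.2', '-1d0', '2aadddab2', '1.3h')

# Negated class: remove everything except digits, '.', and '-'.
# The '-' is placed last in the class so it is a literal hyphen, not a range.
cleaned <- gsub("[^0-9.-]", "", x)
print(cleaned)
# [1] "1.2" "4.5" "3.2" "-10" "22"  "1.3"
```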
If a string may contain more than one '-' or '.', you can handle that with a second regular expression.
If you struggle with it, open a new question.
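One possible shape for such a second pass is sketched below. The helper name tidy_number is hypothetical, and the rules it applies (keep a '-' only at the very start, keep only the first '.') are assumptions about what "dealing with that" should mean; adjust them to your data.

```r
# Hypothetical helper (not from the original answer): tidy strings that
# already contain only digits, '.' and '-'.
tidy_number <- function(s) {
  # Keep a '-' only at the very start; for any other '-' group 1 is empty
  s <- gsub("(^-)|-", "\\1", s, perl = TRUE)
  # Keep everything up to and including the first '.'; drop any later '.'
  gsub("^([^.]*\\.)|\\.", "\\1", s, perl = TRUE)
}

tidied <- tidy_number(c("-1-0", "1.2.3", "--5", "22"))
print(tidied)
# [1] "-10"  "1.23" "-5"   "22"
```

Both calls rely on the same idea: the part to keep is captured in group 1 and written back, while the other alternative matches the characters to drop and leaves group 1 empty.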
(Replace '.' with ',' in the pattern if needed, for example when the decimal separator is a comma.)
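For the comma case, a sketch under the assumption that ',' is the decimal separator in your data; the input vector is made up for illustration.

```r
# Illustrative input with decimal commas (not from the original post)
x <- c('1,2<', '>4,5', '-1d0')

# Same idea as before, but ',' is kept instead of '.'
cleaned <- gsub("[^0-9,-]", "", x)
print(cleaned)
# [1] "1,2" "4,5" "-10"

# Optional: swap the comma for a dot so as.numeric() can parse the values
numbers <- as.numeric(sub(",", ".", cleaned, fixed = TRUE))
print(numbers)
```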