I am doing some text analysis and want to keep only alphanumeric characters, but for some reason I am having trouble removing some pesky characters that I don't consider alphanumeric. Here's an example of what I am dealing with:
letters <- "ՄĄՄdasdas"
letters <- gsub("[^[:alnum:]]", "",letters)
letters
> "ՄĄՄdasdas"
What am I doing wrong here?
@konvas shows you how to use gsub correctly in this situation. The problem with your attempt is that those non-ASCII characters are considered alphabetic characters in your locale. Another option is to use iconv:
iconv(letters, to='ASCII', sub='')
Try gsub("[^A-Za-z0-9]", "", letters)
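To compare the suggestions above side by side, here is a minimal sketch. The variable names x and ascii_only are my own (the question reuses letters, which shadows R's built-in constant), and the sample string is written with \u escapes only so that it does not depend on the file's encoding. The iconv result can vary with your platform and native encoding, so treat that part as illustrative.

```r
# Same sample string as in the question, written with Unicode escapes.
# `x` is used instead of `letters` to avoid shadowing the built-in constant.
x <- "\u0544\u0104\u0544dasdas"

# [:alnum:] follows the locale / Unicode notion of a letter, so the
# non-ASCII letters are treated as alphanumeric and are kept.
gsub("[^[:alnum:]]", "", x)

# An explicit ASCII range keeps only a-z, A-Z and 0-9.
ascii_only <- gsub("[^A-Za-z0-9]", "", x)
ascii_only

# iconv() with sub = "" drops whatever cannot be represented in ASCII.
# Unlike the gsub() call above, it leaves ASCII punctuation and spaces alone.
iconv(x, to = "ASCII", sub = "")
```

Which one to use depends on what "alphanumeric" should mean: the explicit range also strips spaces and punctuation, while iconv only removes the non-ASCII characters.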