I was attempting to replace what I thought was a standard dash using gsub. The code I was testing was:
gsub("-", "ABC", "reported – estimate")
This does nothing, though. I copied and pasted the dash into http://unicodelookup.com/#–/1 and it seems to be an en dash. That site provides the hex, decimal, and other codes for an en dash, and I've been trying to use them to replace the en dash, but without luck. Suggestions?
(As a bonus, it would be helpful to know whether there is a function to identify special characters.)
I'm not sure if SO's code formatting will change the dash format so here is the dash I'm using (–).
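The pattern "-" is an ASCII hyphen-minus (U+002D), which is a different character from the en dash (U+2013), so it never matches. You can replace the en dash by specifying it directly in the regex pattern.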
gsub("–", "ABC", "reported – estimate")
You can match all hyphens, en- and em-dashes with
gsub("[-–—]", "ABC", "reported – estimate — more - text")
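If pasting the literal dash is unreliable (editors and websites sometimes substitute characters), the same patterns can be written with Unicode escapes instead. This is a minimal sketch, assuming the strings are handled as UTF-8; U+2013 is the en dash and U+2014 is the em dash, and the variable names are only illustrative:

```r
# Input built from escapes so the script itself stays pure ASCII.
x <- "reported \u2013 estimate \u2014 more - text"

# Replace only the en dash (U+2013).
en_only <- gsub("\u2013", "ABC", x)

# Replace hyphen-minus, en dash and em dash. The hyphen comes first in the
# character class so it is read literally rather than as a range operator.
all_dashes <- gsub("[-\u2013\u2014]", "ABC", x)

print(en_only)
print(all_dashes)
```

Writing the escapes also makes it obvious which dash is meant when the code is read later, since the three characters look nearly identical in many fonts.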
To check whether a string contains non-ASCII characters, use
> s <- "plus ça change, plus c'est la même chose"
> gsub("[[:ascii:]]+", "", s, perl=TRUE)
[1] "çê"
You will get either an empty string (if the string consists only of ASCII characters) or, as here, the non-ASCII ("special") characters that remain.
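For the bonus question, one way to see exactly which characters are special is to list their code points. This is only a sketch using base R's utf8ToInt and intToUtf8; it assumes the input is a single valid UTF-8 string, and the variable names are illustrative:

```r
s <- "reported \u2013 estimate"

# Code point of every character in the string.
codes <- utf8ToInt(s)

# Anything above 127 lies outside the ASCII range.
special <- unique(codes[codes > 127])

# Show each non-ASCII character next to its U+XXXX code point.
report <- data.frame(
  char      = intToUtf8(special, multiple = TRUE),
  codepoint = sprintf("U+%04X", special),
  stringsAsFactors = FALSE
)
print(report)
```

Here the report has a single row with code point U+2013, which is the value to look up or to use in a "\u2013" escape. For scanning a whole character vector, tools::showNonASCII() may also be worth a look.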