I have Unicode newline characters in a string that I need to remove. These characters can be a carriage return (\U000D), a newline (\U000A), a line separator, or a paragraph separator.
I am able to remove the carriage return and newline characters by using the following.
gsub("\\s", "", x)
As I said, this works fine for those two characters, but I am not able to remove the line separator (\U2028) or paragraph separator (\U2029) characters.
Is there another way to do this?
You can switch on PCRE using perl=TRUE and use the handy escape sequence \R, which matches any Unicode newline sequence, including the line separator and paragraph separator:
> x <- 'foo\U000D\U000A bar\U2029 baz\U2028\U2029'
> x
## [1] "foo\r\n bar\u2029 baz\u2028\u2029"
> gsub('\\R', '', x, perl=TRUE)
## [1] "foo bar baz"
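If this comes up in more than one place, the call can be wrapped in a small helper. The following is only a sketch: the function name strip_newlines and its replacement argument are made up for illustration and are not part of base R. Note that \R also matches vertical tab, form feed and next line (U+0085), so the second call shows a narrower pattern, assuming only the four characters from the question should be removed.

```r
# Hypothetical helper; the name and the `replacement` argument are illustrative.
strip_newlines <- function(x, replacement = "") {
  # \R is a PCRE escape, so perl = TRUE is required.
  # It treats "\r\n" as a single newline sequence.
  gsub("\\R", replacement, x, perl = TRUE)
}

x <- "foo\U000D\U000A bar\U2029 baz\U2028\U2029"

strip_newlines(x)
## [1] "foo bar baz"

# Narrower alternative: only CR, LF, U+2028 and U+2029.
gsub("[\\r\\n\\x{2028}\\x{2029}]", "", x, perl = TRUE)
```

Passing replacement = " " instead of the default keeps words from running together when a newline was the only separator between them.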