I'm not sure I'm using the right terminology here, but I need the print or deparse methods to use C notation (e.g. "\x05" instead of "\005") when escaping bytes outside the regular character set.
x <- "This is a \x05 symbol"
print(x)
[1] "This is a \005 symbol"
Is there a native way to accomplish this?
I need this for generating BSON: http://bsonspec.org/#/specification. All of the examples there explicitly use \x05 notation.
Hacking into the internals of print seems like a bad idea. Instead, I think you should do the string escaping yourself, and then use cat to print the string without any extra escaping.
You can use encodeString to do the initial escaping, gregexpr to identify octal escapes such as \005, strtoi to convert strings representing octal numbers to those numbers, sprintf to print numbers in hexadecimal, and regmatches to operate on the matched parts. The whole process would look something like this:
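inputString <- "This is a \005 symbol. \x13 is \\x13."
# Escape non-printable characters as R would when printing (octal, e.g. \005)
x <- encodeString(inputString)
# Locate three-digit octal escapes in the encoded string
m <- gregexpr("\\\\[0-3][0-7][0-7]", x)
# Drop the leading backslash and parse the digits as base-8 integers
charcodes <- strtoi(substring(regmatches(x, m)[[1]], 2, 4), 8)
# Replace each octal escape with its hexadecimal equivalent
regmatches(x, m) <- list(sprintf("\\x%02x", charcodes))
cat(x, "\n")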
Note that this approach converts octal escapes like \005 to hexadecimal escapes like \x05, but other escape sequences like \t or \a are not affected by it. You might need more code to deal with those as well, but the above should contain all the ingredients you need.
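If you also want \t, \a and friends written in hexadecimal, one option is to skip encodeString altogether and build the escaped string from the raw bytes. The following is only a minimal sketch: to_hex_escapes is a hypothetical helper name (not a base R function), it assumes you want every byte outside printable ASCII written as \xNN (so a multi-byte UTF-8 character comes out as several \xNN escapes), and it doubles literal backslashes but leaves double quotes alone.

```r
# Hypothetical helper (not part of base R): escape every byte outside
# printable ASCII as \xNN, working on the raw bytes of the string.
to_hex_escapes <- function(s) {
  bytes <- as.integer(charToRaw(s))
  out <- vapply(bytes, function(b) {
    if (b == 0x5c) {
      "\\\\"                      # a literal backslash becomes two backslashes
    } else if (b >= 0x20 && b <= 0x7e) {
      rawToChar(as.raw(b))        # printable ASCII is kept as-is
    } else {
      sprintf("\\x%02x", b)       # everything else becomes \xNN
    }
  }, character(1))
  paste(out, collapse = "")
}

cat(to_hex_escapes("Tab\there, bell\a, byte \005, literal \\x13"), "\n")
```

Because this never goes through R's own escaping, there are no octal escapes to find and replace afterwards, at the cost of handling backslashes (and, if you need it, quotes) yourself.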
Note that the BSON specification you refer to almost certainly means raw bytes. As long as your string contains a character with code 5, which you can write as "\x05" in your input, and you write that string to the desired output in binary mode, it shouldn't matter at all how R prints that string to you. After all, octal \005 and hexadecimal \x05 are just two representations of the same byte you'll write.
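To illustrate that point, here is a small sketch that looks at the bytes directly and writes them out in binary mode. The temporary file is just a stand-in for wherever your BSON output is supposed to go, and this is not a BSON encoder, only a demonstration that the byte is the same regardless of how it was spelled in the source.

```r
x <- "This is a \x05 symbol"

# Both spellings denote the same one-byte string
identical("\x05", "\005")

# charToRaw() exposes the underlying bytes; raw vectors print in hex
bytes <- charToRaw(x)
print(bytes)

# Write the bytes unchanged to a file opened in binary mode
path <- tempfile(fileext = ".bin")   # stand-in output location
con <- file(path, "wb")
writeBin(bytes, con)
close(con)

# Read them back to confirm nothing was escaped or altered
readback <- readBin(path, what = "raw", n = file.size(path))
```

The raw vector prints its elements in hexadecimal, so the control character shows up as 05 no matter which escape notation print would have chosen for the string.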