The goal is to make configuration and code readable after it has been exported from an application that stores this data base64-encoded and gzipped.
Example of a string with code:
"H4sIAAAAAAAAAIWSS0vEMBSF9/0VIYvubHUnNGlhfIDCwOCMuCyhTeOVTBLzGPTfmzY60yKju+Tc8N1z7o2RQYBqmTESuGthaDuHXJpWTRknzsZfowK0DrSi+Ki4x4qrTPShB8fPu/uIaN3VGVsGB4s49BcnrDKGjsJlwaF5P0sMtxY/swLadBeN/6jda9eBjrxfwrytQvcMjLgI3zLI999FJEuYSGmHpNdp9Gk7xWyQXkilRbL2NXnGdS18twuTvQfsqJkqHU6x0n7KlY5MLX2UjYOyxZqacBFIeDZyxdGettusYiwn+h7X/QadBnadY7oNVaGDS8eoXciZMAyTlckNxh+Vyid//4Qv+y3JeLwIAAA=="
Decoded and gunzip-ped in a Linux shell with the command:
echo "$1" | base64 -d | gunzip -c
Which results in:
plugin_applies_if_config<split>plugin_config=<?xml version="1.0" encoding="UTF-8"?>
<BusinessRule>
<BusinessPlugin BusinessRulePluginID="JavaScriptBusinessConditionWithBinds">
<Parameters>
<Parameter ID="Binds" Type="java.lang.String"><?xml version="1.0" encoding="UTF-8"?>
<BindMap/>
</Parameter>
<Parameter ID="ErrorMessages" Type="java.lang.String"></Parameter>
<Parameter ID="JavaScript" Type="java.lang.String">return false;</Parameter>
</Parameters>
</BusinessPlugin>
</BusinessRule>
<split>
Task accomplished. ...almost.
As I have several hundred of these strings, I want to run the same steps as in the Linux shell from a script. And because I only know some R, I tried using R. I successfully extracted the strings from the XML document that was exported from the application and turned them into a data frame with columns id, name and code.
The following is a simplified example where I try to reproduce the Linux commands step by step.
encoded = "H4sIAAAAAAAAAIWSS0vEMBSF9/0VIYvubHUnNGlhfIDCwOCMuCyhTeOVTBLzGPTfmzY60yKju+Tc8N1z7o2RQYBqmTESuGthaDutBhDERcHXJpWTRknzsZfowK0DrSi+Ki4x4qrTPShB8fPu/uIaN3VGVsGB4s49BcnrDKGjsJlwaF5P0sMtxY/swLadBeN/6jda9eBjrxfwrytQvcMjLgI3zLI999FJEuYSGmHpNdp9Gk7xWyQXkilRbL2NXnGdS18twuTvQfsqJkqHU6x0n7KlY5MLX2UjYOyxZqacBFIeDZyxdGettusYiwn+h7X/QadBnadY7oNVaGDS8eoXciZMAyTlckNxh+Vyid//4Qv+y3JeLwIAAA=="
decoded = base64enc::base64decode(what=encoded)
# decoded = openssl::base64_decode(encoded)
# decoded = jsonlite::base64_dec(encoded)
# 3 times the same result
str(decoded)
# a vector of type raw. Maybe I need to convert it to a string?
paste(decoded, collapse = "")
Doesn't look like the base64 decoded data in the Linux shell, but let's try to unzip...
decompressed <-
  tryCatch({
    memDecompress(from = paste(decoded, collapse = ""),
                  type = "gzip",
                  asChar = TRUE)
  },
  error = function(cond) {
    message(cond)
    return(NA)
  })
# fails with "internal error -3 in memDecompress(2)"
(decompressed)
Clearly the input for 'gzip' is not what it expects. It must be some sort of binary string.
But how do I get there? What am I doing wrong? Thanks for your advice!
To decode with base64, you need to use the --decode flag. With an encoded string, you can pipe an echo command into base64, just as you would to encode it. Provided the encoding was not corrupted, the output should be your original data.
Base64 is an encoding; the strings you've posted are encoded. You can decode the base64 values into bytes (just a sequence of bits). From there, you need to know what these bytes represent and what encoding they were originally in if you wish to convert them back to a legible format.
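To make the "what do these bytes represent" step concrete, here is a small sketch in R. It decodes only the first four characters of the string from the question ("H4sI") and checks for the two-byte gzip signature 1f 8b. It assumes the base64enc package is installed; the variable names are made up for illustration. It also shows why paste() on a raw vector is not a way to get the binary data back: the result is a text string of hex digits.

```r
# Assumes the base64enc package is installed (the same package used in the question).
# "H4sI" is the first four characters of the encoded string in the question.
prefix <- base64enc::base64decode(what = "H4sI")

# The result is a raw vector (bytes), not text.
print(prefix)

# gzip data starts with the two magic bytes 0x1f 0x8b.
has_gzip_magic <- length(prefix) >= 2 &&
  identical(prefix[1:2], as.raw(c(0x1f, 0x8b)))
print(has_gzip_magic)

# paste() turns each byte into its two-digit hex text, so this is the
# character string "1f8b08", not the three original bytes.
as_hex_text <- paste(prefix, collapse = "")
print(as_hex_text)
```

So the raw vector itself is what a decompression function should receive, rather than a pasted string.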
You can use the expression Base64.encode(GZip.decompress($content)) in a mapper to decompress your zipped file and encode it into Base64 format.
The memDecompress function was improved in R version 4.0.0 to work properly. You should now be able to do
memDecompress(base64enc::base64decode(what=encoded), "gzip", asChar=TRUE)
Previous versions were troublesome because they ignored standard headers. Here's a workaround for older versions of R. Basically, we create a raw stream of bytes and then use gzcon to decompress them:
con <- rawConnection(base64enc::base64decode(what=encoded))
readLines(gzcon(con))
close(con)
You will get a warning about an "incomplete final line", but that's just because there is no newline at the end of the data. The data seems fine otherwise.
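Since the question mentions several hundred strings in a data frame with columns id, name and code, the following is a minimal sketch of applying the decode step to every row. It assumes R 4.0.0 or later and the base64enc package. The helper names (decode_gz_b64, safe_decode, make_demo) and the demo data are hypothetical. The demo strings are built with memCompress() as a stand-in for the real export; that produces zlib-format data, which memDecompress() with type "gzip" also accepts.

```r
# Assumes R >= 4.0.0 and the base64enc package; helper names are hypothetical.

# Decode one base64 string and decompress the resulting bytes to text.
decode_gz_b64 <- function(encoded) {
  raw_bytes <- base64enc::base64decode(what = encoded)
  memDecompress(raw_bytes, type = "gzip", asChar = TRUE)
}

# Same as above, but return NA instead of stopping on a bad row.
safe_decode <- function(encoded) {
  tryCatch(decode_gz_b64(encoded), error = function(cond) NA_character_)
}

# Demo data only: compress and base64-encode a piece of text.
make_demo <- function(text) {
  base64enc::base64encode(memCompress(charToRaw(text), type = "gzip"))
}

rules <- data.frame(
  id = 1:3,
  name = c("rule_a", "rule_b", "broken"),
  code = c(make_demo("return false;"),
           make_demo("return true;"),
           "bm90IGd6aXA="),  # base64 of the text "not gzip", not compressed data
  stringsAsFactors = FALSE
)

# One decoded string per row; rows that fail to decompress become NA.
rules$decoded <- vapply(rules$code, safe_decode, character(1), USE.NAMES = FALSE)
print(rules[, c("id", "name", "decoded")])
```

Wrapping the call in tryCatch() keeps one corrupted string from aborting the whole run, and the NA rows are easy to inspect afterwards with is.na(rules$decoded).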