I have an ArrayBuffer returned by reading memory with Frida. I convert the ArrayBuffer to a string and then back to an ArrayBuffer using TextDecoder and TextEncoder, but the result is altered in the process: the ArrayBuffer length after decoding and re-encoding always comes out larger. Is some character being decoded in a way that expands it?
How can I decode an ArrayBuffer to a String, then back to an ArrayBuffer without losing integrity?
Example code:
var arrayBuff = Memory.readByteArray(pointer,2000); //Get a 2,000 byte ArrayBuffer
console.log(arrayBuff.byteLength); //Always returns 2,000
var textDecoder = new TextDecoder("utf-8");
var textEncoder = new TextEncoder("utf-8");
//Decode and encode same data without making any changes
var decoded = textDecoder.decode(arrayBuff);
var encoded = textEncoder.encode(decoded);
console.log(encoded.byteLength); //Varies, but is always greater than 2,000
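TextDecoder and TextEncoder are designed to work with text. Any byte sequence that is not valid UTF-8 is decoded to the replacement character U+FFFD, which takes three bytes when re-encoded, so arbitrary binary data grows in the round trip.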
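As a small illustration of where the extra bytes come from, here is a sketch that does not need Frida at all. The three sample bytes are made up for the demonstration; it assumes a runtime where TextDecoder and TextEncoder are available as globals.

```javascript
// 0x41 is "A"; 0xff and 0x80 are not valid UTF-8 on their own (sample data).
const original = new Uint8Array([0x41, 0xff, 0x80]);

// Each invalid byte is decoded to U+FFFD, the replacement character.
const decoded = new TextDecoder("utf-8").decode(original);

// U+FFFD is encoded as three bytes (EF BF BD), so the data grows.
const encoded = new TextEncoder().encode(decoded);

console.log(original.byteLength); // 3
console.log(decoded.length);      // 3
console.log(encoded.byteLength);  // 7
```

The original byte values 0xff and 0x80 cannot be recovered from the decoded string, which is why this round trip is lossy for binary data.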
To convert an arbitrary byte sequence into a string and back, it's best to treat each byte as a single character.
var arrayBuff = Memory.readByteArray(pointer,2000); //Get a 2,000 byte ArrayBuffer
console.log(arrayBuff.byteLength); //Always returns 2,000
//Decode and encode same data without making any changes
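//Map each byte to the character with the same code (0-255)
var decoded = String.fromCharCode(...new Uint8Array(arrayBuff));
//Map each character back to its code to recover the original bytes
var encoded = Uint8Array.from([...decoded].map(ch => ch.charCodeAt(0))).buffer;
console.log(encoded.byteLength); //2,000, same as the input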
The decoded string will have exactly the same length as the input buffer and can be easily manipulated with regular expressions, string methods, etc. But beware that Unicode characters that occupy two or more bytes in memory (e.g. "π") won't be recognizable anymore: each one becomes a sequence of characters, one per byte, with each character's code equal to that byte's value.
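To make that caveat concrete, the sketch below runs "π" through the byte-per-character mapping. The helper names bytesToBinaryString and binaryStringToBytes are hypothetical, not part of Frida or any library, and the chunked loop is just one way to avoid passing a very large buffer as function arguments in a single call.

```javascript
// Hypothetical helper: one byte becomes one character with code 0-255.
function bytesToBinaryString(bytes) {
  let out = "";
  const chunkSize = 0x2000; // keep the argument count per call modest
  for (let i = 0; i < bytes.length; i += chunkSize) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return out;
}

// Hypothetical helper: the inverse mapping, one character back to one byte.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}

const piBytes = new TextEncoder().encode("π"); // UTF-8 bytes CF 80
const asString = bytesToBinaryString(piBytes);
console.log(asString.length);              // 2
console.log(asString === "\u00cf\u0080");  // true: two characters, not "π"

// The bytes survive the round trip, so real text can still be recovered later.
const roundTripped = binaryStringToBytes(asString);
console.log(new TextDecoder("utf-8").decode(roundTripped)); // π
```

If a region of the buffer is known to hold UTF-8 text, decoding just that slice of bytes with TextDecoder, as in the last line, gives back readable characters.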