
Creating a Blob or a File from JavaScript binary string changes the number of bytes?

I have been playing with a few JS encryption libraries (CryptoJS, SJCL) and discovered problems related to the Blob/File APIs and JavaScript "binary strings".

I realized that the encryption isn't even really relevant, so here's a much simplified scenario. Simply read a file in using readAsBinaryString and then create a Blob:

>>> reader.result
"GIF89a����ÿÿÿÿÿÿ!þCreated with GIMP�,�������D�;"
>>> reader.result.length
56
>>> typeof reader.result
"string"
>>> blob = new Blob([reader.result], {type: "image/gif"})
Blob { size=64, type="image/gif", constructor=function(), more...}

I have created a JSFiddle that will basically do the above: it simply reads any arbitrary file, creates a blob from it, and outputs the length vs size: http://jsfiddle.net/6L82t/1/

It appears that, when creating the Blob from the "binary (javascript) string", something with character encoding ends up munging the result.

If a non-binary file is used, you will see that the lengths of the Blob and the original binary string are identical.

So something happens when creating a Blob/File from a non-plaintext JavaScript string, and I need whatever that is to not happen. I think it may have something to do with the fact that JS strings are UTF-16?

There's a (maybe) related thread here: HTML5 File API read as text and binary

Do I perhaps need to take the decrypted results (UTF-16) and "convert" them to UTF-8 before putting them in a Blob/File?

Working with someone in #html5 on Freenode, we determined that if you read an ArrayBuffer directly and then create the blob from that by first using a Uint8Array, the bytes work out just fine. You can see a fiddle that essentially does that here: http://jsfiddle.net/GH7pS/4/
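For reference, a minimal sketch of the ArrayBuffer route described above, assuming a browser that provides FileReader and Blob. The function names `blobFromArrayBuffer` and `readFileAsBlob` are made up for illustration and are not taken from the linked fiddle.

```javascript
// Wrap raw bytes in a Blob. The bytes of a Uint8Array are copied as-is,
// so no character-encoding step is involved.
function blobFromArrayBuffer(buffer, type) {
  return new Blob([new Uint8Array(buffer)], { type: type });
}

// Browser-only: read a File as an ArrayBuffer and pass the resulting Blob
// to a callback. (Hypothetical helper; FileReader is not available in Node.)
function readFileAsBlob(file, callback) {
  const reader = new FileReader();
  reader.onload = function () {
    callback(blobFromArrayBuffer(reader.result, file.type));
  };
  reader.readAsArrayBuffer(file);
}
```

With this approach the Blob's size should match the file's byte count, because the string stage is skipped entirely.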

The issue is that, at least in my scenario, I am going to end up with a binary string, and I would like to figure out how to convert that directly into a Blob so that I can then use HTML5's download attribute to let the user click to download the blob.
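The download step itself could look roughly like the sketch below, assuming a browser that supports the `download` attribute on links and `URL.createObjectURL`. `downloadBlob` is a hypothetical helper name; its `doc` and `urlApi` parameters default to the browser globals and exist only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: offer a Blob to the user as a file download.
function downloadBlob(blob, filename, doc = document, urlApi = URL) {
  const url = urlApi.createObjectURL(blob);
  const link = doc.createElement("a");
  link.href = url;
  link.download = filename; // suggested file name for the saved file
  doc.body.appendChild(link);
  link.click();
  doc.body.removeChild(link);
  // Release the object URL on a later tick, after the click has been dispatched.
  setTimeout(function () {
    urlApi.revokeObjectURL(url);
  }, 0);
  return url;
}
```

In a page this would be called as `downloadBlob(blob, "image.gif")`, typically from a click handler.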

Thanks!

Erik Jacobs asked May 21 '14 22:05




1 Answer

> It appears that, when creating the Blob from the "binary (javascript) string", something with character encoding ends up munging the result.

Yes. That post you read explains well how a "binary string" is constituted.

The Blob constructor, in contrast, does the following:

  1. Let s be the result of converting [the string] to a sequence of Unicode characters using the algorithm for doing so in WebIDL.
  2. Encode s as UTF-8 and append the resulting bytes to [the blob].
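To see why these two steps change the size, here is a small sketch that counts the bytes UTF-8 encoding produces for a string. It uses the standard `TextEncoder`, which always encodes to UTF-8, as a stand-in for what the Blob constructor does with string parts; `utf8ByteLength` is a hypothetical helper name.

```javascript
// Count the bytes a string occupies once it is encoded as UTF-8.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

// Characters with codes below 0x80 take one byte each;
// characters with codes 0x80-0xFF take two bytes each.
console.log(utf8ByteLength("GIF89a"));   // 6
console.log(utf8ByteLength("\xff\xfe")); // 4
```

By the same arithmetic, a 56-character binary string turning into a 64-byte Blob is consistent with eight of its characters having codes of 0x80 or above.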

> We determined that if you read an ArrayBuffer directly and then create the blob from that by first using a Uint8Array, the bytes work out just fine.

Yes, that's how it is supposed to work. Just do the encryption on a Typed Array where you deal with the bytes individually, not on some string.

> The issue is, at least in my scenario, I am going to end up with a binary string

Again: try not to. Binary strings are deprecated.

> I would like to figure out how to directly convert a binary string into a Blob. Do I need to possibly take the decrypted results (UTF-16) and "convert" them to UTF-8 before putting them in a Blob/File?

No, it's better not to attempt any string conversions. Instead, construct a Uint8Array holding the bytes that you want to get from the binary string.

This should do it (untested):

// Copy each character code of the binary string (0-255) into one byte.
var bytes = new Uint8Array(str.length);
for (var i = 0; i < str.length; i++)
    bytes[i] = str.charCodeAt(i);
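Wrapped up as reusable functions, the loop above might look like the following sketch. The names `binaryStringToBytes` and `binaryStringToBlob` are hypothetical, and the code assumes every character of the input has a code in the range 0-255, as a binary string should.

```javascript
// Convert a binary string (one character per byte) into a Uint8Array.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    // Keep only the low 8 bits; Uint8Array assignment would truncate
    // the same way, the mask just makes that explicit.
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// Build a Blob from the bytes rather than from the string itself,
// so the Blob constructor never UTF-8 encodes anything.
function binaryStringToBlob(str, type) {
  return new Blob([binaryStringToBytes(str)], { type: type });
}
```

For example, `binaryStringToBlob(reader.result, "image/gif")` should yield a Blob whose size equals `reader.result.length`.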
Bergi answered Oct 26 '22 12:10
