What is a WordArray?

I've been looking at crypto-js and its encoder converts to and from a WordArray. I looked up the documentation and couldn't find any explanation of what a WordArray might be.

To the best of my knowledge, there isn't even a typed array in JavaScript named WordArray, and neither is there a DataView on any of the typed arrays by that name.

I know what a WORD is in the Visual C++ parlance, but I am not sure what it means here.

Strange, all the threads (here, here and here) I found on crypto-js are using the word WordArray without anyone really asking what it is.

Could someone really tell me? Is it a Uint16Array? Or just another fancy word for a regular byte array (Uint8Array or an untyped Array of integral number values)?

Water Cooler v2 asked Oct 23 '19 13:10




1 Answer

The class is defined in core.js within the CryptoJS library:

/**
 * An array of 32-bit words.
 *
 * @property {Array} words The array of 32-bit words.
 * @property {number} sigBytes The number of significant bytes in this word array.
 */
var WordArray = C_lib.WordArray = Base.extend({

The byte values stored in it are packed starting at the most significant bits of each word, i.e. in big-endian order (I've checked this against the source code).

For instance, if you put the value "he" into it as UTF-8 (or Latin1 or ASCII), you get a one-element words array containing the value 0x68650000, with sigBytes set to 2. This is because UTF-8 encodes to 8-bit bytes, and those two bytes occupy the topmost 16 bits of the first word.
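To make that layout concrete, here is a minimal sketch in plain JavaScript that packs bytes into 32-bit words the way described above. `packBytes` is a hypothetical helper written for illustration; it is not part of CryptoJS, and it only mimics the shape of the `words` and `sigBytes` properties.

```javascript
// Hypothetical helper (not a CryptoJS API): packs bytes into 32-bit words,
// most significant byte first, mirroring the layout described above.
function packBytes(bytes) {
  const words = [];
  for (let i = 0; i < bytes.length; i++) {
    // i >>> 2 selects the word; 24 - (i % 4) * 8 is the bit offset of the
    // byte slot inside that word (24, 16, 8, 0).
    words[i >>> 2] |= bytes[i] << (24 - (i % 4) * 8);
  }
  return { words, sigBytes: bytes.length };
}

// "he" encoded as UTF-8 is the two bytes 0x68 0x65.
const he = packBytes(new TextEncoder().encode("he"));
console.log((he.words[0] >>> 0).toString(16).padStart(8, "0")); // 68650000
console.log(he.sigBytes); // 2
```

The `>>> 0` is only there for printing: JavaScript's bitwise operators produce signed 32-bit integers, so a word whose top bit is set would otherwise print as a negative number.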


Generally, (symmetric) cryptographic algorithms are specified to operate on bits. However, implementations are usually optimized to work on 32-bit or 64-bit words, because those are the most efficient units on 32-bit or 64-bit machines such as x86 or x64. So any library in any language will internally convert to words before the operations are performed.

Usually, though, libraries define their operations in terms of bytes rather than words. CryptoJS is a bit special in the sense that it operates on a buffer of words. That's kind of logical, since JavaScript originally didn't have byte arrays. It also skips a step, as you would otherwise have to convert from UTF-8 to bytes, and then to words again within the algorithm implementation.
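If you do need plain bytes, the reverse conversion is just as mechanical. The following is a sketch, assuming you have the `words` array and `sigBytes` count of a WordArray in hand; `wordsToBytes` is a hypothetical helper name, not a CryptoJS function.

```javascript
// Hypothetical helper (not a CryptoJS API): unpacks the first sigBytes bytes
// from an array of 32-bit words, most significant byte first.
function wordsToBytes(words, sigBytes) {
  const bytes = new Uint8Array(sigBytes);
  for (let i = 0; i < sigBytes; i++) {
    // Shift the wanted byte down to the lowest 8 bits, then mask it off.
    bytes[i] = (words[i >>> 2] >>> (24 - (i % 4) * 8)) & 0xff;
  }
  return bytes;
}

const bytes = wordsToBytes([0x68650000], 2);
console.log(new TextDecoder().decode(bytes)); // he
```

Note that `sigBytes` matters here: without it you could not tell the two padding zero bytes in 0x68650000 apart from real data.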

CryptoJS also has a 64-bit word array, undoubtedly for algorithms such as SHA-512 that are optimized for 64-bit operation.
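JavaScript numbers cannot hold a full 64-bit integer exactly, so a 64-bit word is typically modelled as a pair of 32-bit halves. The sketch below illustrates that idea with a 64-bit addition; the `{ high, low }` object shape and the `add64` function are assumptions made for this example, not code taken from CryptoJS.

```javascript
// Illustrative only: a 64-bit word modelled as two unsigned 32-bit halves.
// add64 is a hypothetical helper, not a CryptoJS API.
function add64(a, b) {
  // Add the low halves modulo 2^32.
  const low = (a.low + b.low) >>> 0;
  // If the sum wrapped around, it is smaller than either operand.
  const carry = low < (a.low >>> 0) ? 1 : 0;
  // Add the high halves plus the carry, again modulo 2^32.
  const high = (a.high + b.high + carry) >>> 0;
  return { high, low };
}

// 0x00000000ffffffff + 1 = 0x0000000100000000
console.log(add64({ high: 0, low: 0xffffffff }, { high: 0, low: 1 }));
// high is 1, low is 0
```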

Maarten Bodewes answered Sep 28 '22 03:09