
Buffer comparison in Node.js

I'm new to Node.js. There is no built-in Buffer comparison, so I should use modules like buffertools for this feature.

But I see a pretty strange behaviour when I compare Buffer objects in pure Node.

    > var b1 = new Buffer([170]);
    > var b2 = new Buffer([171]);
    > b1
    <Buffer aa>
    > b2
    <Buffer ab>
    > b1 < b2
    false
    > b1 > b2
    false
    > b1 == b2
    false

and

    > var b1 = new Buffer([10]);
    > var b2 = new Buffer([14]);
    > b1
    <Buffer 0a>
    > b2
    <Buffer 0e>
    > b1 > b2
    false
    > b1 < b2
    true
    > b1 == b2
    false

What actually happens under the hood?

Sergey asked Dec 09 '12 18:12



2 Answers

That's how the comparison operators work on objects:

    var a = {}, b = {};
    a === b; //false
    a == b; //false
    a > b; //false
    a < b; //false

    var c = { valueOf : function () { return 0; } };
    var d = { valueOf : function () { return 1; } };
    c === d; //false
    c == d; //false
    c > d; //false
    c < d; //true

Under the hood

(sort of)

Part 1 : Equality

This is the easiest part. Both abstract equality (==, spec) and strict equality (===, spec) check whether you're referring to the same object (sort of comparing references). In this case, you are obviously not, so the answer is false (== spec step 10, === spec step 7).

Therefore, in both cases:

    b1 == b2 //false
    b1 === b2 //false
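To make the reference comparison concrete, here is a minimal sketch. It assumes a current Node.js, where `Buffer.from` stands in for the deprecated `new Buffer(...)` constructor; the variable names are only illustrative. It contrasts the operators with `Buffer#equals`, which compares contents rather than references.

```javascript
// Buffer.from is used here because `new Buffer(...)` is deprecated in current Node.js.
const a = Buffer.from([170]);
const b = Buffer.from([170]); // same bytes, but a different object
const alias = a;              // a second name for the same object

const sameReference = a === alias; // true: both names refer to one object
const looseEqual = a == b;         // false: == also compares references for two objects
const strictEqual = a === b;       // false: different objects
const sameBytes = a.equals(b);     // true: Buffer#equals compares the bytes

console.log({ sameReference, looseEqual, strictEqual, sameBytes });
```

So two buffers with identical bytes still fail `==` and `===`; only a content comparison reports them as equal.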

Part 2: The Comparison strikes back

Here comes the interesting part. Let's look at how the relational operators (< and >) are defined. Let's follow the call chain in the two cases.

    x = b1 //<Buffer aa>
    y = b2 //<Buffer ab>

    //11.8.5 The Abstract Relational Comparison Algorithm (http://es5.github.com/#x11.8.5)
    Let px be the result of calling ToPrimitive(x, hint Number).
    Let py be the result of calling ToPrimitive(y, hint Number).

    //9.1 ToPrimitive (http://es5.github.com/#x9.1)
    InputType is Object, therefore we call the internal [[DefaultValue]] method with hint Number.

    //8.12.8 [[DefaultValue]] (hint) http://es5.github.com/#x8.12.8
    With hint Number, valueOf is tried first. A buffer's valueOf returns the buffer itself, which is not a primitive, so it is skipped.
    We then try and fetch the object's toString method. If it's defined, call it.
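The call chain above can be observed with a plain object. This is only a sketch: `probe` and `calls` are hypothetical names, and the object merely imitates a buffer by returning itself from `valueOf`, so the engine falls through to `toString`.

```javascript
const calls = [];

// A stand-in object that records which conversion methods the engine calls.
const probe = {
  valueOf() {
    calls.push('valueOf');
    return this; // not a primitive, so the engine ignores this result
  },
  toString() {
    calls.push('toString');
    return 'a'; // a primitive, so this becomes the value used by <
  }
};

const result = probe < 'b'; // compares 'a' < 'b' as strings

// A real buffer behaves the same way: its inherited valueOf returns the buffer itself.
const buf = Buffer.from([170]);
const valueOfReturnsSelf = buf.valueOf() === buf;

console.log(calls, result, valueOfReturnsSelf);
```

With hint Number, `valueOf` is consulted before `toString`, which is why an object like `c` in the earlier snippet can control the comparison by returning a number.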

And here we've reached the climax: What's a buffer's toString method? The answer lies deep inside node.js internals. If you want, have at it. What we can easily do is find out by experimentation:

    > b1.toString()
    '�'
    > b2.toString()
    '�'

Okay, that wasn't helpful. You'll notice that in the Abstract Relational Comparison Algorithm (what a big fancy name for <), there's a step for dealing with strings. When both operands are strings, it compares them character by character using their numeric values, the char codes. Let's look at those:

    > b1.toString().charCodeAt(0)
    65533
    > b2.toString().charCodeAt(0)
    65533

65533 is an important number. It's the sum of two squares: 142^2 + 213^2. It also happens to be the Unicode Replacement Character, a character signifying "I have no idea what happened". That's why its hexadecimal equivalent is FFFD.

Obviously, 65533 === 65533, so:

    b1 < b2 //is
    b1.toString().charCodeAt(0) < b2.toString().charCodeAt(0) //is
    65533 < 65533 //false
    b1 > b2 //following same logic as above, false

And that's that.
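The reasoning above can be checked directly. A minimal sketch, assuming a current Node.js (hence `Buffer.from` instead of `new Buffer`) and reusing the byte values from the question; `c1` and `c2` are names introduced here for the second test case.

```javascript
// First case from the question: bytes 170 and 171.
const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

const s1 = b1.toString(); // decoded as UTF-8 by default
const s2 = b2.toString();

const sameString = s1 === s2;          // true: both decode to U+FFFD
const codeUnit = s1.charCodeAt(0);     // 65533
const buffersLess = b1 < b2;           // false, same as s1 < s2
const stringsLess = s1 < s2;           // false

// Second case from the question: bytes 10 and 14 are valid single-byte characters.
const c1 = Buffer.from([10]);
const c2 = Buffer.from([14]);
const secondCaseLess = c1 < c2;        // true, because '\n' < '\x0e'

console.log({ sameString, codeUnit, buffersLess, stringsLess, secondCaseLess });
```

This also accounts for the second transcript in the question: bytes 10 and 14 decode to distinct characters, so the string comparison gives a meaningful order there.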

Dude, what the hell?

Okay, this must've been confusing since my efforts of explanation haven't been well thought through. To recap, here's what happened:

  1. You created a buffer. Benjamin Gruenbaum helped me recreate your test case by doing:

    var b1 = new Buffer([170]), b2 = new Buffer([171]);

  2. When outputting to console, the values are turned into their hex equivalent (see Buffer#inspect):

    170..toString(16) === 'aa'

    171..toString(16) === 'ab'

  3. However, converting a buffer to a string decodes its bytes as UTF-8 by default, not as hex, and a lone 0xAA or 0xAB byte is not a valid UTF-8 sequence (again, you're free to delve into the implementation nitty gritty, I won't (oh the irony)). Therefore, when converted to a string, they were represented with the Unicode replacement character.

  4. Since they're different objects, any equality operator will return false.

  5. However, due to the way less-than and greater-than work, they were turned into strings (and then to numbers) for comparison. In light of point #3, that's the same value; therefore, they cannot be less-than or greater-than each other, leading to false.
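Points 2 and 3 of the recap can be seen side by side by decoding the same byte with different encodings. This is a sketch that assumes the standard `'hex'` and `'latin1'` encoding names accepted by `Buffer#toString`; the variable names are illustrative.

```javascript
const b1 = Buffer.from([170]);

const asUtf8 = b1.toString();           // default encoding is 'utf8'
const asHex = b1.toString('hex');       // what the console display is based on
const asLatin1 = b1.toString('latin1'); // one byte maps to one character

const utf8Code = asUtf8.charCodeAt(0);     // 65533, the replacement character
const latin1Code = asLatin1.charCodeAt(0); // 170, the original byte value

console.log({ asHex, utf8Code, latin1Code });
```

Only the default UTF-8 decoding loses the byte value, and that is the decoding the relational operators end up using.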

Finally, just to put a smile on your face:

    b1 <= b2 //true
    b1 >= b2 //true
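Why both come out true: the spec evaluates `x <= y` as "not (y < x)" rather than "x < y or x == y". Since neither buffer is less than the other, both results are true even though the buffers are not equal. A small sketch, again assuming `Buffer.from` on a current Node.js:

```javascript
const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

const lessOrEqual = b1 <= b2;    // true
const notGreater = !(b2 < b1);   // true: this is how <= is evaluated
const greaterOrEqual = b1 >= b2; // true
const notLess = !(b1 < b2);      // true: this is how >= is evaluated
const equal = b1 == b2;          // false: still two different objects

console.log({ lessOrEqual, notGreater, greaterOrEqual, notLess, equal });
```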
Zirak answered Oct 04 '22 12:10


There's already an accepted answer, but I thought I might as well chime in with a remark since I don't find the accepted answer particularly clear or helpful. It's even incorrect, if only because it answers questions that the OP didn't ask. So let's boil that down:

    > var b1 = new Buffer([170]);
    > var b2 = new Buffer([171]);
    > b1 < b2
    > b1 > b2
    > b1 == b2

All that is asked for is: "how do I perform equivalence and less than / greater than comparison (a.k.a. (total) ordering) on buffers".

The answer is:

  • either do it manually by stepping through all the bytes of both buffers and comparing the corresponding bytes, e.g. b1[ idx ] === b2[ idx ],

  • or use Buffer.compare( b1, b2 ) which gives you one of -1, 0, or +1, depending on whether the first buffer would sort before, exactly like, or after the second (sorting a list d that contains buffers is then as easy as d.sort( Buffer.compare )).

Observe I use === in my first example; my frequent comments on this site concerning the abuse of == in JavaScript should make it abundantly clear why that is so.
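Both options can be sketched as follows. `compareBytes` is a hypothetical helper written for this illustration, not part of Node.js; it assumes that a buffer which is a prefix of another should sort first. `Buffer.compare` and `Array.prototype.sort` are the built-ins mentioned above.

```javascript
// Hypothetical helper: manual byte-by-byte comparison returning -1, 0, or 1.
function compareBytes(x, y) {
  const shared = Math.min(x.length, y.length);
  for (let idx = 0; idx < shared; idx++) {
    if (x[idx] !== y[idx]) {
      return x[idx] < y[idx] ? -1 : 1;
    }
  }
  // All shared bytes match: the shorter buffer sorts first (assumption of this sketch).
  return Math.sign(x.length - y.length);
}

const b1 = Buffer.from([170]);
const b2 = Buffer.from([171]);

const manual = compareBytes(b1, b2);    // -1
const builtin = Buffer.compare(b1, b2); // -1
const same = Buffer.compare(b1, Buffer.from([170])); // 0

// Sorting a list of buffers with the built-in comparator.
const d = [Buffer.from([3]), Buffer.from([1, 2]), Buffer.from([1])];
d.sort(Buffer.compare);
const sortedHex = d.map((buf) => buf.toString('hex')); // ['01', '0102', '03']

console.log({ manual, builtin, same, sortedHex });
```

In practice `Buffer.compare` is the simpler choice; the manual loop is mainly useful for seeing what a byte-wise ordering means.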

flow answered Oct 04 '22 12:10