Preserving the floating point & addition of a bitwise operation in javascript

Question

I am trying to understand the way to add, subtract, divide, and multiply by operating on the bits.

I need to optimize my JavaScript program because many calculations run after an event fires.

Using the code below as a reference, I understand that carry holds the result of ANDing n1 and n2, and that the XOR sets sum to the bits that differ between n1 and n2.

Here is my question: what does shifting (n1 & n2) left by 1 do? What is the goal of doing this? With the XOR it is obvious that nothing else needs to be done with those bits, because their values are already correct in sum. I can't picture what the AND-and-shift operation accomplishes.

function add(n1,n2)
{
        var carry, sum;

        // Find out which bits will result in a carry.
        // Those bits will affect the bits directly to
        // the left, so we shall shift one bit.
        carry = (n1 & n2) << 1;

        // In digital electronics, an XOR gate is also known
        // as a quarter adder.  Basically an addition is performed
        // on each individual bit, and the carry is discarded.
        //
        // All I'm doing here is applying the same concept.
        sum = n1 ^ n2;

        // If any bits match in position, then perform the
        // addition on the current sum and the results of
        // the carry.
        if (sum & carry)
        {
                return add(sum, carry);
        }

        // Return the sum.
        else
        {
                return sum ^ carry;
        }
}

The code above works as expected, but it does not return floating point values. I need the total returned with its fractional part intact.

Does anyone have a function that I can use with the above that will help me with floating point values? Or a website with a clear explanation of what I am looking for? I've been searching for the last day or so and cannot find anything to look over.

I got the code above from this resource. http://www.dreamincode.net/code/snippet3015.htm

Thanks ahead of time!

After thinking about it: a left shift by one position is a multiplication by 2.

By ANDing like this: carry = (n1 & n2) << 1; the carry var holds the bits that are set in both n1 and n2, shifted one position to the left. So, if n1 is 4 and n2 is 4, they both hold the same value. ANDing the two gives 4, and left shifting by one position multiplies 4 x 2 = 8, so carry would now equal 8.

The same steps with n1 = 8 and n2 = 8:

1.) carry = 00001000 (8) & 00001000 (8)

2.) carry now holds the single value 00001000 (8)

A left shift multiplies 8 x 2 = 16, the same as 8 + 8 = 16

3.) carry = carry << 1 shifts all bits over one position

4.) carry now holds the single value 00010000 (16)
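To make the role of the shift easier to see, here is a minimal sketch that repeats the XOR and AND-shift steps in a loop and records each round. The name addWithTrace is hypothetical (it is not from the linked snippet), and like the original it only works on 32-bit integers.

```javascript
// Hypothetical helper: an iterative form of add() that records every round.
function addWithTrace(n1, n2) {
  var steps = [];
  var sum = n1;
  var carry = n2;
  while (carry !== 0) {
    // XOR adds each bit column on its own and drops any carry.
    var partial = sum ^ carry;
    // AND finds the columns where both bits are 1; each of those
    // produces a carry that belongs one column to the left, hence << 1.
    var nextCarry = (sum & carry) << 1;
    steps.push({ partial: partial, carry: nextCarry });
    sum = partial;
    carry = nextCarry;
  }
  return { result: sum, steps: steps };
}

var traced = addWithTrace(8, 8);
console.log(traced.result); // 16
console.log(traced.steps);  // round 1: partial 0, carry 16; round 2: partial 16, carry 0
```

With 8 and 8 the XOR alone gives 0, because the only set bit collides. The shifted AND is what carries that collision into the 16s column.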

I still cannot find anything on working with floating point values. If anyone has anything, please post a link.

carlosfigueira · Accepted Answer

It doesn't work because the code assumes that floating point numbers are represented as integers, which they aren't. Floating point numbers are represented using the IEEE 754 standard, which breaks a number into three parts: a sign bit, a group of bits representing an exponent, and another group representing a number between 1 (inclusive) and 2 (exclusive), the mantissa. The value is calculated as

(sign is set ? -1 : 1) * mantissa * 2^(exponent - bias)

Here ^ means exponentiation, and the bias depends on the precision of the floating point number (1023 for double precision). So the algorithm you use for adding two numbers assumes that the bits represent an integer, which is not the case for floating point numbers. Operations such as bitwise-AND and bitwise-OR also don't give the results that you'd expect in an integer world.
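As a rough illustration of the three parts, the sketch below pulls the fields out of a double with a DataView and rebuilds the value from them. The helper names decodeDouble and rebuildNormal are made up for this example, and it only handles normal numbers (not zero, subnormals, Infinity or NaN).

```javascript
// Hypothetical helper: split a double into its IEEE 754 fields.
function decodeDouble(x) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);            // DataView writes big-endian by default
  var hi = view.getUint32(0);       // sign, 11 exponent bits, top 20 fraction bits
  var lo = view.getUint32(4);       // low 32 fraction bits
  return {
    sign: hi >>> 31,
    exponent: (hi >>> 20) & 0x7ff,  // still biased by 1023
    // 52-bit fraction; fits exactly in a double, so plain arithmetic is safe
    fraction: (hi & 0xfffff) * 4294967296 + lo
  };
}

// Hypothetical helper: apply the value formula to normal numbers only.
function rebuildNormal(parts) {
  var mantissa = 1 + parts.fraction / Math.pow(2, 52); // in [1, 2)
  var value = mantissa * Math.pow(2, parts.exponent - 1023);
  return parts.sign ? -value : value;
}

var parts = decodeDouble(2.3);
console.log(parts.sign, parts.exponent); // 0 1024
console.log(rebuildNormal(parts));       // 2.3
```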

For example, in double precision the number 2.3 is represented (in hex) as 4002666666666666, while the number 5.3 is represented as 4015333333333333. OR-ing those two bit patterns gives 4017777777777777, which represents (roughly) 5.866666.

There are some good pointers on this format. I found the pages at http://www.psc.edu/general/software/packages/ieee/ieee.php, http://babbage.cs.qc.edu/IEEE-754/ and http://www.binaryconvert.com/convert_double.html fairly good for understanding it.
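Those hex values can be checked in an engine that supports BigInt and the DataView 64-bit accessors. This is only a sketch; doubleToBits and bitsToDouble are names invented for the example.

```javascript
// Hypothetical helpers: view a double as a 64-bit pattern and back.
function doubleToBits(x) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  return view.getBigUint64(0);
}

function bitsToDouble(bits) {
  var view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, bits);
  return view.getFloat64(0);
}

var a = doubleToBits(2.3);
var b = doubleToBits(5.3);
var ored = a | b; // BigInt OR works on the raw 64-bit patterns

console.log(a.toString(16));     // 4002666666666666
console.log(b.toString(16));     // 4015333333333333
console.log(ored.toString(16));  // 4017777777777777
console.log(bitsToDouble(ored)); // roughly 5.8667, not any kind of sum of 2.3 and 5.3
```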

Now, if you still want to implement the bitwise addition for those numbers, you can. But you'll have to break each number down into its parts, then align the numbers to the same exponent (otherwise you won't be able to add them), perform the addition on the mantissas, and finally normalize the result back to the IEEE 754 format. But, as @LukeGT said, you'll likely not get better performance than the JS engine you're running. And JavaScript's bitwise operators don't work on the floating point representation at all: they first convert the operands to 32-bit integers, then perform the operation, which will make your results incorrect as well.
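A quick way to see the integer conversion is to feed fractional values to the bitwise operators directly. The sketch below uses a condensed copy of the add() from the question, shortened here only so the example runs on its own.

```javascript
// Condensed copy of the question's add(), same logic.
function add(n1, n2) {
  var carry = (n1 & n2) << 1;
  var sum = n1 ^ n2;
  return (sum & carry) ? add(sum, carry) : sum ^ carry;
}

// Bitwise operators convert their operands to 32-bit integers first,
// so the fractional part is gone before any bit is touched.
console.log(2.3 | 0);        // 2
console.log(5.3 | 0);        // 5
console.log(add(2.3, 5.3));  // 7
console.log(2.3 + 5.3);      // approximately 7.6, using the engine's own addition
```

So add(2.3, 5.3) is really add(2, 5), which is why the fractional part never comes back.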

Tags:

javascript

floating-point

bit-manipulation
