Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Implementing XorShift the same in Java and Python

I would like to implement an XorShift PRNG in both Java, Python and JavaScript. The different implementations must generate the exact same sequences given the same seed. So far, I've have not been able to do this.

My implementation in Java

have the following implementation of an XorShift PRNG in Java (where x is a long field):

public long randomLong() {
    x ^= (x << 21);
    x ^= (x >>> 35);
    x ^= (x << 4);
    return x;
}

If I seed x to 1, the first four calls to randomLong() will generate:

35651601
1130297953386881
-9204155794254196429
144132848981442561

My implementation in Python

I have tried both with and without numpy. Below is the version that uses numpy.

def randomLong(self):
    self.x ^= np.left_shift(self.x, 21)
    self.x ^= np.right_shift(self.x, 35)
    self.x ^= np.left_shift(self.x, 4)
    return self.x

With the same seed, the Python function will generate:

35651601
1130297953386881
-9204155787274874573 # different
143006948545953793   # different

My JavaScript implementation

I've not attempted one yet, since JavaScript's only number type seems to be doubles based on IEEE 754, which opens up a different can of worms.

What I think the cause is

Java and Python have different number types. Java has 32 and 64-bit integers, while Python has funky big int types.

It seems that the shift operators have different semantics. For example, in Java there is both logical and arithmetic shift, while in Python there is only one type of shift (logical?).

Questions

I would be happy with an answer that lets me write a PRNG in these three languages, and one that is fast. It does not have to be very good. I have considered porting C libs implementations to the other languages, although it is not very good.

  • Can I fix my above implementations so they work?
  • Should I switch to another PRNG function that is easier to implement across prog.langs?

I have read the SO where someone suggested using the java.util.Random class for Python. I don't want this, since I'm also going to need the function in JavaScript, and I don't know that this packages exists there.

like image 755
Pimin Konstantin Kefaloukos Avatar asked Jan 07 '16 15:01

Pimin Konstantin Kefaloukos


3 Answers

I would be happy with an answer that lets me write a PRNG in these three languages, and one that is fast. It does not have to be very good.

You could implement a 32-bit linear congruential generator in 3 languages.

Python:

seed = 0
for i in range(10):
    seed = (seed * 1664525 + 1013904223) & 0xFFFFFFFF
    print(seed)

Java:

int seed = 0;
for (int i = 0; i < 10; i++) {
    seed = seed * 1664525 + 1013904223;
    System.out.println(seed & 0xFFFFFFFFL);
}

JavaScript:

var seed = 0;
for (var i = 0; i < 10; i++) {
    // The intermediate result fits in 52 bits, so no overflow
    seed = (seed * 1664525 + 1013904223) | 0;
    console.log(seed >>> 0);
}

Output:

1013904223
1196435762
3519870697
2868466484
1649599747
2670642822
1476291629
2748932008
2180890343
2498801434

Note that in all 3 languages, each iteration prints an unsigned 32-bit integer.

like image 191
Nayuki Avatar answered Nov 07 '22 01:11

Nayuki


The tricky part is in the logical right shift. The easiest to do in Python if you have access to NumPy, is to store your x as a uint64 value, so that arithmetic and logical right shifting are the exact same operation, and cast the output value to an int64 before returning, e.g.:

import numpy as np

class XorShiftRng(object):
    def __init__(self, x):
        self.x = np.uint64(x)

    def random_long(self):
        self.x ^= self.x << np.uint64(21)
        self.x ^= self.x >> np.uint64(35)
        self.x ^= self.x << np.uint64(4)
        return np.int64(self.x)

Those ugly casts of the shift values are required to prevent NumPy from issuing weird casting errors. In any case, this produces the exact same result as your Java version:

>>> rng = XorShiftRng(1)
>>> for _ in range(4):
...     print(rng.random_long())
...
35651601
1130297953386881
-9204155794254196429
144132848981442561
like image 3
Jaime Avatar answered Nov 07 '22 02:11

Jaime


The difference in results between Java and python is due to a difference in how the languages have implemented integers. A java long is a 64 bit signed integer, having the sign in the leftmost bit. Python is... well, different.

Presumably python encodes integers with varying bit length depending of the magnitude of the number

>>> n = 10
>>> n.bit_length()
4

>>> n = 1000
>>> n.bit_length()
10

>>> n = -4
>>> n.bit_length()
3

And negative integers are (presumably) encoded as sign and magnitude, though the sign does not seem to be set in any of the bits. The sign would normally be in the leftmost bit, but not here. I guess this has to do with pythons varying bit length for numbers.

>>> bin(-4)
'-0b100'

where -4 in 64 bit 2's complement would be:

0b1111111111111111111111111111111111111111111111111111111111111100

This makes a huge difference in the algorithm, since shifting 0b100 left or right yields quite different results than shifting 0b1111111111111111111111111111111111111111111111111111111111111100.

Luckily there's a way of tricking python, but this involves switching between the two representations yourself.

First some bit masks is needed:

word_size = 64
sign_mask = 1<<(word_size-1)
word_mask = sign_mask | (sign_mask - 1)

Now to force python into 2's complement, all one need is a logical 'and' with the word mask

>>> bin(4 & word_mask)
'0b100'

>>> bin(-4 & word_mask)
'0b1111111111111111111111111111111111111111111111111111111111111100'

which is what you need for the algorithm to work. Except you need to convert the numbers back when returning values, since

>>> -4 & word_mask
18446744073709551612L

So the number needs to be converted from 2's complement to signed magnitude:

>>> number = -4 & word_mask
>>> bin(~(number^word_mask))
'-0b100'

But this only works for negative integers:

>>> number = 4 & word_mask
>>> bin(~(number^word_mask))
'-0b1111111111111111111111111111111111111111111111111111111111111100'

Since positive integers should be returned as is, this would be better:

>>> number = -4 & word_mask
>>> bin(~(number^word_mask) if (number&sign_mask) else number)
'-0b100'

>>> number = 4 & word_mask
>>> bin(~(number^word_mask) if (number&sign_mask) else number)
'0b100'

So I've implemented the algorithm like this:

class XORShift:
    def __init__(self, seed=1, word_length=64):
        self.sign_mask = (1 << (word_length-1))
        self.word_mask = self.sign_mask | (self.sign_mask -1)
        self.next = self._to2scomplement(seed)

    def _to2scomplement(self, number):
        return number & self.word_mask

    def _from2scomplement(self, number):
        return ~(number^self.word_mask) if (number & self.sign_mask) else number

    def seed(self, seed):
        self.next = self._to2scomplement(seed)

    def random(self):
        self.next ^= (self.next << 21) & self.word_mask
        self.next ^= (self.next >> 35) & self.word_mask
        self.next ^= (self.next <<  4) & self.word_mask
        return self._from2scomplement(self.next)

And seeding it with 1, the algorithm returns as its 4 first numbers:

>>> prng = XORShift(1)
>>> for _ in range(4):
>>>     print prng.random()

35651601
1130297953386881
-9204155794254196429
144132848981442561

Of course you get this for free by using numpy.int64, but this is less fun as it hides the cause difference.

I have not been able to implement the same algorithm in JavaScript. It seems that JavaScript uses 32 bit unsigned integers and the shifting 35 positions right, the number wraps around. I have not investigated it further.

like image 1
Andreas Ryge Avatar answered Nov 07 '22 00:11

Andreas Ryge