Xorshift1024* jump not commutative?

Tags:

I've been porting Sebastiano Vigna's xorshift1024* PRNG to be compatible with the standard C++11 uniform random number generator contract and noticed some strange behavior with the jump() function he provides.

According to Vigna, a call to jump() should be equivalent to 2^512 calls to next(). Therefore a series of calls to jump() and next() should be commutative. For example, assuming the generator starts in some known state,

jump();
next();

should leave the generator in the same state as

next();
jump();

since both should be equivalent to

for (bigint i = 0; i < (bigint(1) << 512) + 1; ++i)
    next();

assuming bigint is some integer type with an extremely large maximum value (and assuming you are a very, very, very patient person).

Unfortunately, this doesn't work with the reference implementation Vigna provides (which I will include at the end for posterity; in case the implementation linked above changes or is taken down in the future). When testing the first two options using the following test code:

memset(s, 0xFF, sizeof(s));
p = 0;

// jump() and/or next() calls...

std::cout << p << ';';
for (int i = 0; i < 16; ++i)
    std::cout << ' ' << s[i];

calling jump() before next() outputs:

1; 9726214034378009495 13187905351877324975 10033047168458208082 990371716258730972 965585206446988056 74622805968655940 11468976784638207029 3005795712504439672 6792676950637600526 9275830639065898170 6762742930827334073 16862800599087838815 13481924545051381634 16436948992084179560 6906520316916502096 12790717607058950780

while calling next() first results in:

1; 13187905351877324975 10033047168458208082 990371716258730972 965585206446988056 74622805968655940 11468976784638207029 3005795712504439672 6792676950637600526 9275830639065898170 6762742930827334073 16862800599087838815 13481924545051381634 16436948992084179560 6906520316916502096 12790717607058950780 9726214034378009495

Clearly either my understanding of what jump() is doing is wrong, or there's a bug in the jump() function, or the jump polynomial data is wrong. Vigna claims that such a jump function can be calculated for any stride of the period, but doesn't elaborate on how to calculate it (including in his paper on xorshift* generators). How can I calculate the correct jump data to verify that there's not a typo somewhere in it?

Xorshift1024* reference implementation; http://xorshift.di.unimi.it/xorshift1024star.c

/*  Written in 2014-2015 by Sebastiano Vigna ([email protected])

To the extent possible under law, the author has dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.

See <http://creativecommons.org/publicdomain/zero/1.0/>. */

#include <stdint.h>
#include <string.h>

/* This is a fast, top-quality generator. If 1024 bits of state are too
   much, try a xorshift128+ generator.

   The state must be seeded so that it is not everywhere zero. If you have
   a 64-bit seed, we suggest to seed a splitmix64 generator and use its
   output to fill s. */

uint64_t s[16]; 
int p;

uint64_t next(void) {
    const uint64_t s0 = s[p];
    uint64_t s1 = s[p = (p + 1) & 15];
    s1 ^= s1 << 31; // a
    s[p] = s1 ^ s0 ^ (s1 >> 11) ^ (s0 >> 30); // b,c
    return s[p] * UINT64_C(1181783497276652981);
}


/* This is the jump function for the generator. It is equivalent
   to 2^512 calls to next(); it can be used to generate 2^512
   non-overlapping subsequences for parallel computations. */

void jump() {
    static const uint64_t JUMP[] = { 0x84242f96eca9c41dULL,
        0xa3c65b8776f96855ULL, 0x5b34a39f070b5837ULL, 0x4489affce4f31a1eULL,
        0x2ffeeb0a48316f40ULL, 0xdc2d9891fe68c022ULL, 0x3659132bb12fea70ULL,
        0xaac17d8efa43cab8ULL, 0xc4cb815590989b13ULL, 0x5ee975283d71c93bULL,
        0x691548c86c1bd540ULL, 0x7910c41d10a1e6a5ULL, 0x0b5fc64563b3e2a8ULL,
        0x047f7684e9fc949dULL, 0xb99181f2d8f685caULL, 0x284600e3f30e38c3ULL
    };

    uint64_t t[16] = { 0 };
    for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
        for(int b = 0; b < 64; b++) {
            if (JUMP[i] & 1ULL << b)
                for(int j = 0; j < 16; j++)
                    t[j] ^= s[(j + p) & 15];
            next();
        }

    memcpy(s, t, sizeof t);
}

765

asked Jan 03 '16 08:01

bcrist

2 Answers

OK, I'm sorry but sometimes this happens (I'm the author).

Originally the function had two memcpy(). Then I realised then a circular copy was needed. But I replaced just the first memcpy(). Stupid, stupid, stupid. All files on the site have been fixed. The arXiv copy is undergoing update. See http://xorshift.di.unimi.it/xorshift1024star.c

Incidentally: I didn't "publish" anything wrong in the scientific sense, as the jump() function is not part of the ACM Trans. Math. Soft. paper—it just has been added few weeks ago on the site and on the arXiv/WWW version. The fast publication path of the web and arXiv means that, sometimes, one distributes unpolished papers. I can only thank the reporter for reporting this bug (OK, technically StackOverflow is not reporting bugs, but I got an email, too).

Unfortunately, the unit test I had did not consider the case p ≠ 0. My main concern was that the correctness of the computed polynomial. The function, as noted above, is correct when p = 0.

As for the computation: to each generator corresponds a characteristic polynomial P(x). The jump polynomial for k is just x^k mod P(x). I use fermat to compute such powers, and then I have some scripts generating the C code.

Of course I can't test 2^512, but since my generation code works perfectly from 2 to 2^30 (the range you can easily test), I'm confident it works at 2^512, too. It's just fermat computing x^{2^512} instead of x^{2^30}. But independent verifications are more than welcome.

I have code working only for powers of the form x^{2^t}. This is what I need to compute useful jump functions. Computing polynomials modulo P(x) is not difficult, so one could conceivably have a completely generic jump function for any value, but frankly I find this totally overkill.

If anybody is interested in getting other jump polynomials, I can provide the scripts. They will be part, as it happens for all other code, of the next xorshift distribution, but I need to complete the documentation before giving them out.

For the record, the characteristic polynomial of xorshift1024* is x^1024 + x^974 + x^973 + x^972 + x^971 + x^966 + x^965 + x^964 + x^963 + x^960 + x^958 + x^957 + x^956 + x^955 + x^950 + x^949 + x^948 + x^947 + x^942 + x^941 + x^940 + x^939 + x^934 + x^933 + x^932 + x^931 + x^926 + x^925 + x^923 + x^922 + x^920 + x^917 + x^916 + x^915 + x^908 + x^906 + x^904 + x^902 + x^890 + x^886 + x^873 + x^870 + x^857 + x^856 + x^846 + x^845 + x^844 + x^843 + x^841 + x^840 + x^837 + x^835 + x^830 + x^828 + x^825 + x^824 + x^820 + x^816 + x^814 + x^813 + x^811 + x^810 + x^803 + x^798 + x^797 + x^790 + x^788 + x^787 + x^786 + x^783 + x^774 + x^772 + x^771 + x^770 + x^769 + x^768 + x^767 + x^765 + x^760 + x^758 + x^753 + x^749 + x^747 + x^746 + x^743 + x^741 + x^740 + x^738 + x^737 + x^736 + x^735 + x^728 + x^726 + x^723 + x^722 + x^721 + x^720 + x^718 + x^716 + x^715 + x^714 + x^710 + x^709 + x^707 + x^694 + x^687 + x^686 + x^685 + x^684 + x^679 + x^678 + x^677 + x^674 + x^670 + x^669 + x^667 + x^666 + x^665 + x^663 + x^658 + x^655 + x^651 + x^639 + x^638 + x^635 + x^634 + x^632 + x^630 + x^623 + x^621 + x^618 + x^617 + x^616 + x^615 + x^614 + x^613 + x^609 + x^606 + x^604 + x^601 + x^600 + x^598 + x^597 + x^596 + x^594 + x^593 + x^592 + x^590 + x^589 + x^588 + x^584 + x^583 + x^582 + x^581 + x^579 + x^577 + x^575 + x^573 + x^572 + x^571 + x^569 + x^567 + x^565 + x^564 + x^563 + x^561 + x^559 + x^557 + x^556 + x^553 + x^552 + x^550 + x^544 + x^543 + x^542 + x^541 + x^537 + x^534 + x^532 + x^530 + x^528 + x^526 + x^523 + x^521 + x^520 + x^518 + x^516 + x^515 + x^512 + x^511 + x^510 + x^508 + x^507 + x^506 + x^505 + x^504 + x^502 + x^501 + x^499 + x^497 + x^494 + x^493 + x^492 + x^491 + x^490 + x^487 + x^485 + x^483 + x^482 + x^480 + x^479 + x^477 + x^476 + x^475 + x^473 + x^469 + x^468 + x^465 + x^463 + x^461 + x^460 + x^459 + x^458 + x^455 + x^453 + x^451 + x^448 + x^447 + x^446 + x^445 + x^443 + x^438 + x^437 + x^431 + x^430 + x^429 + x^428 + x^423 + x^417 + x^416 + x^415 + x^414 + x^412 + x^410 + x^409 + x^408 + x^400 + x^398 + x^396 + x^395 + x^391 + x^390 + x^386 + x^385 + x^381 + x^380 + x^378 + x^375 + x^373 + x^372 + x^369 + x^368 + x^365 + x^360 + x^358 + x^357 + x^354 + x^350 + x^348 + x^346 + x^345 + x^344 + x^343 + x^342 + x^340 + x^338 + x^337 + x^336 + x^335 + x^333 + x^332 + x^325 + x^323 + x^318 + x^315 + x^313 + x^309 + x^308 + x^305 + x^303 + x^302 + x^300 + x^294 + x^290 + x^281 + x^279 + x^276 + x^275 + x^273 + x^272 + x^267 + x^263 + x^262 + x^261 + x^260 + x^258 + x^257 + x^256 + x^249 + x^248 + x^243 + x^242 + x^240 + x^238 + x^236 + x^233 + x^232 + x^230 + x^228 + x^225 + x^216 + x^214 + x^212 + x^210 + x^208 + x^206 + x^205 + x^200 + x^197 + x^196 + x^184 + x^180 + x^176 + x^175 + x^174 + x^173 + x^168 + x^167 + x^166 + x^157 + x^155 + x^153 + x^152 + x^151 + x^150 + x^144 + x^143 + x^136 + x^135 + x^125 + x^121 + x^111 + x^109 + x^107 + x^105 + x^92 + x^90 + x^79 + x^78 + x^77 + x^76 + x^60 + 1

answered Nov 16 '22 01:11

seba

tldr: I'm pretty sure there's a bug in the original code:
The memcpy in jump() must consider the p rotation too.
The author didn't test nearly as much as appropriate before publishing a paper...

Long version:

One next() call changes only one of the 16 s array elements, the one with index p. p starts at 0, gets increased each next() call, and after 15 it becomes 0 again. Let's call s[p] the "current" array element. Another (slower) possibility for implementing next() would be that the current element is always the first one, there is no p, and instead of incrementing p the whole s array is rotated (ie. the first element moves to the last position and the previous second element becomes the first).

Independent of the current p value, 16 calls to next() should result in the same p value as before, ie. the whole cycle is done and the current element is the same position as before the 16 calls. jump() should do 2^512 next(), 2^512 is a multiple of 16, so with one jump, the p value before and after it should be the same.

You probably noticed already that your two different results are only rotated one time, ie. one solution is "9726214034378009495 somethingelse" and one is "somethingelse 9726214034378009495"

...because you did one next() before/after the jump() and jump() can't handle p other than 0.

If you'd test it with 16 next() (or 32 or 0 or ...) before/after jump() instead of one, the two results are equal. The reason is, within jump, while for the s array the current element / p is handled as it is in next(), the t array is semantically rotated so that the current element is always the first one (t[j] ^= s[(j + p) & 15];). Then, right before the function terminates, memcpy(s, t, sizeof t); copies the new values from t back to s without considering the rotation at all. Just replace the memcpy with a proper loop including the p offset, then it should be fine.

(Well, but that doesn't mean jump() is really the same as 2^512 next(). But at least it could be.)

answered Nov 16 '22 02:11

deviantfan

Related questions
                            
                                Interdependent initialization with commas?
                            
                                Time complexity of iterating through a C++ unordered_map [duplicate]
                            
                                Convert char* to uint8_t
                            
                                Test that two std::vectors are equal using CATCH C++ unit test framework
                            
                                Call a C++ function from Swift
                            
                                Behavior when dereferencing the .end() of a vector of strings
                            
                                Is there a penalty to using char variables in CUDA kernels?
                            
                                Why does std::vector::reserve call the copy constructor?
                            
                                When do I use std::random_device?
                            
                                How implicit conversion works for non-type template parameters?
                            
                                Are global variables extern by default or is it equivalent to declaring variable with extern in global?
                            
                                Which way is better to get lower 32 bits of a 64 bits integer
                            
                                Why does < instead of << in stream output still compile?
                            
                                Installation of R-package "BH" not possible
                            
                                How does condition_variable::wait_for() deal with spurious wakeups?
                            
                                Minimal working example for emscripten webworker
                            
                                Building Visual Studio 2015 C++ project (v140) by MSBuild Failed - v140 cannot be found
                            
                                Is there a NULL equivalent for pairs in C++?
                            
                                What is the functional difference between a const pointer (not a pointer to const) and a reference?
                            
                                Reading binary file into vector<char> reading less than full file

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With