Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is uninitialized local variable the fastest random number generator?

I know the uninitialized local variable is undefined behaviour(UB), and also the value may have trap representations which may affect further operation, but sometimes I want to use the random number only for visual representation and will not further use them in other part of program, for example, set something with random color in a visual effect, for example:

void updateEffect(){
    for(int i=0;i<1000;i++){
        int r;
        int g;
        int b;
        star[i].setColor(r%255,g%255,b%255);
        bool isVisible;
        star[i].setVisible(isVisible);
    }
}

is it that faster than

void updateEffect(){
    for(int i=0;i<1000;i++){
        star[i].setColor(rand()%255,rand()%255,rand()%255);
        star[i].setVisible(rand()%2==0?true:false);
    }
}

and also faster than other random number generator?

like image 909
ggrr Avatar asked Jul 31 '15 06:07

ggrr


People also ask

What is uninitialized local variable?

uninitialized local variable 'name' used. The local variable name has been used, that is, read from, before it has been assigned a value. In C and C++, local variables are not initialized by default. Uninitialized variables can contain any value, and their use leads to undefined behavior.

Why are uninitialized variables bad?

It is undefined behavior and it's much much worse than "the variable has some garbage value". Furthermore, on your system could work as you expect, on other systems or with other compiler flags it can behave in unexpected ways.

What happens when you try to use an uninitialized variable?

An uninitialized variable is a variable that has not been given a value by the program (generally through initialization or assignment). Using the value stored in an uninitialized variable will result in undefined behavior.

What does it mean when a variable is uninitialized in C?

Description. The code uses a variable that has not been initialized, leading to unpredictable or unintended results. Extended Description. In some languages such as C and C++, stack variables are not initialized by default.


3 Answers

As others have noted, this is Undefined Behavior (UB).

In practice, it will (probably) actually (kind of) work. Reading from an uninitialized register on x86[-64] architectures will indeed produce garbage results, and probably won't do anything bad (as opposed to e.g. Itanium, where registers can be flagged as invalid, so that reads propagate errors like NaN).

There are two main problems though:

  1. It won't be particularly random. In this case, you're reading from the stack, so you'll get whatever was there previously. Which might be effectively random, completely structured, the password you entered ten minutes ago, or your grandmother's cookie recipe.

  2. It's Bad (capital 'B') practice to let things like this creep into your code. Technically, the compiler could insert reformat_hdd(); every time you read an undefined variable. It won't, but you shouldn't do it anyway. Don't do unsafe things. The fewer exceptions you make, the safer you are from accidental mistakes all the time.

The more pressing issue with UB is that it makes your entire program's behavior undefined. Modern compilers can use this to elide huge swaths of your code or even go back in time. Playing with UB is like a Victorian engineer dismantling a live nuclear reactor. There's a zillion things to go wrong, and you probably won't know half of the underlying principles or implemented technology. It might be okay, but you still shouldn't let it happen. Look at the other nice answers for details.

Also, I'd fire you.

like image 175
imallett Avatar answered Oct 22 '22 02:10

imallett


Let me say this clearly: we do not invoke undefined behavior in our programs. It is never ever a good idea, period. There are rare exceptions to this rule; for example, if you are a library implementer implementing offsetof. If your case falls under such an exception you likely know this already. In this case we know using uninitialized automatic variables is undefined behavior.

Compilers have become very aggressive with optimizations around undefined behavior and we can find many cases where undefined behavior has lead to security flaws. The most infamous case is probably the Linux kernel null pointer check removal which I mention in my answer to C++ compilation bug? where a compiler optimization around undefined behavior turned a finite loop into an infinite one.

We can read CERT's Dangerous Optimizations and the Loss of Causality (video) which says, amongst other things:

Increasingly, compiler writers are taking advantage of undefined behaviors in the C and C++ programming languages to improve optimizations.

Frequently, these optimizations are interfering with the ability of developers to perform cause-effect analysis on their source code, that is, analyzing the dependence of downstream results on prior results.

Consequently, these optimizations are eliminating causality in software and are increasing the probability of software faults, defects, and vulnerabilities.

Specifically with respect to indeterminate values, the C standard defect report 451: Instability of uninitialized automatic variables makes for some interesting reading. It has not been resolved yet but introduces the concept of wobbly values which means the indeterminatness of a value may propagate through the program and can have different indeterminate values at different points in the program.

I don't know of any examples where this happens but at this point we can't rule it out.

Real examples, not the result you expect

You are unlikely to get random values. A compiler could optimize the away the loop altogether. For example, with this simplified case:

void updateEffect(int  arr[20]){
    for(int i=0;i<20;i++){
        int r ;    
        arr[i] = r ;
    }
}

clang optimizes it away (see it live):

updateEffect(int*):                     # @updateEffect(int*)
    retq

or perhaps get all zeros, as with this modified case:

void updateEffect(int  arr[20]){
    for(int i=0;i<20;i++){
        int r ;    
        arr[i] = r%255 ;
    }
}

see it live:

updateEffect(int*):                     # @updateEffect(int*)
    xorps   %xmm0, %xmm0
    movups  %xmm0, 64(%rdi)
    movups  %xmm0, 48(%rdi)
    movups  %xmm0, 32(%rdi)
    movups  %xmm0, 16(%rdi)
    movups  %xmm0, (%rdi)
    retq

Both of these cases are perfectly acceptable forms of undefined behavior.

Note, if we are on an Itanium we could end up with a trap value:

[...]if the register happens to hold a special not-a-thing value, reading the register traps except for a few instructions[...]

Other important notes

It is interesting to note the variance between gcc and clang noted in the UB Canaries project over how willing they are to take advantage of undefined behavior with respect to uninitialized memory. The article notes (emphasis mine):

Of course we need to be completely clear with ourselves that any such expectation has nothing to do with the language standard and everything to do with what a particular compiler happens to do, either because the providers of that compiler are unwilling to exploit that UB or just because they have not gotten around to exploiting it yet. When no real guarantee from the compiler provider exists, we like to say that as-yet unexploited UBs are time bombs: they’re waiting to go off next month or next year when the compiler gets a bit more aggressive.

As Matthieu M. points out What Every C Programmer Should Know About Undefined Behavior #2/3 is also relevant to this question. It says amongst other things (emphasis mine):

The important and scary thing to realize is that just about any optimization based on undefined behavior can start being triggered on buggy code at any time in the future. Inlining, loop unrolling, memory promotion and other optimizations will keep getting better, and a significant part of their reason for existing is to expose secondary optimizations like the ones above.

To me, this is deeply dissatisfying, partially because the compiler inevitably ends up getting blamed, but also because it means that huge bodies of C code are land mines just waiting to explode.

For completeness sake I should probably mention that implementations can choose to make undefined behavior well defined, for example gcc allows type punning through unions while in C++ this seems like undefined behavior. If this is the case the implementation should document it and this will usually not be portable.

like image 215
Shafik Yaghmour Avatar answered Oct 22 '22 01:10

Shafik Yaghmour


No, it's terrible.

The behaviour of using an uninitialised variable is undefined in both C and C++, and it's very unlikely that such a scheme would have desirable statistical properties.

If you want a "quick and dirty" random number generator, then rand() is your best bet. In its implementation, all it does is a multiplication, an addition, and a modulus.

The fastest generator I know of requires you to use a uint32_t as the type of the pseudo-random variable I, and use

I = 1664525 * I + 1013904223

to generate successive values. You can choose any initial value of I (called the seed) that takes your fancy. Obviously you can code that inline. The standard-guaranteed wraparound of an unsigned type acts as the modulus. (The numeric constants are hand-picked by that remarkable scientific programmer Donald Knuth.)

like image 166
Bathsheba Avatar answered Oct 22 '22 02:10

Bathsheba