Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating an even random range in Netezza

Netezza says that its random() function generates a float between 0.0 and 0.99999... I'm trying to generate random ints within a range (e.g., between 5 and 10). The internet told me to use an equation like this to scale a fraction to a number in a range:

select f,count(*) from (
    select CAST(5 + random() * (10-5) as INT) as f
    from table_of_numbers
    where number between 1 and 5000
) x group by 1 order by 1

However, when I use that code, the extreme values are under-represented in the sample:

F    COUNT
5    486    <---
6    992
7    1057
8    1000
9    937
10   528    <---
Does anyone know how I can fix this?

Thanks!

like image 277
Chris Avatar asked Feb 17 '23 14:02

Chris


2 Answers

The expression random()*(10-5) produces numbers from 0 to 4.99999. However, you have a range of 6 values (5, 6, 7, 8, 9, and 10). So your expression is splitting 5 values across 6 buckets.

Your code is finding that the first and last are half full. Apparently, the cast() operation is rounding the values rather than truncating them (I don't think this is ANSI SQL, but it would explain your observed results). This masks the problem.

Try this:

select CAST(4.5 + random() * (10-5+1) as INT) 

In other databases, something like this should work:

select CAST(5 + random() * (10-5+1) as INT) 
like image 141
Gordon Linoff Avatar answered Feb 19 '23 04:02

Gordon Linoff


This slightly modified query produced an even distribution of random numbers from 5 to 10 (inclusively) on a big table (~70m rows) on my environment:

select f, count(*) from (
    select 5 + floor(random() * 5.99)::int as f
    from some_big_table
) x group by 1 order by 1;

F   COUNT
5   11659920
6   11663534
7   11665070
8   11668845
9   11665256
10  11549193

The key here is to tell the system to round the random number down to the nearest integer value.

like image 36
Oleg Avatar answered Feb 19 '23 04:02

Oleg