Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange random behavior in where clause

Tags:

sql

oracle

plsql

I have a table like this:

     Id | GroupId | Category
     ------------------------
     1  | 101     | A
     2  | 101     | B
     3  | 101     | C
     4  | 103     | B
     5  | 103     | D
     6  | 103     | A
     ........................

I need select one of the GroupId randomly. For this I have used the following PL/SQL code block:

declare v_group_count number;
  v_group_id number;
begin 
  select count(distinct GroupId) into v_group_count from MyTable;
  SELECT GroupId into v_group_id  FROM
  (
    SELECT GroupId, ROWNUM RN FROM 
    (SELECT DISTINCT GroupId FROM MyTable)
  )
  WHERE RN=Round(dbms_random.value(1, v_group_count));
end;

Because I rounded random value then it will be an integer value and the WHERE RN=Round(dbms_random.value(1, v_group_count)) condition must return one row always. Generally it gives me one row as expected. But strangely sometimes it gives me no rows and sometimes it returns two rows. That's why it gives error in this section:

SELECT GroupId into v_group_id

Anyone knows the reason of that behaviour?

like image 234
Farid Imranov Avatar asked Nov 15 '16 12:11

Farid Imranov


2 Answers

round(dbms_random.value(1, v_group_count)) is being executed for every row, so every row might be selected or not.


P.s.

ROUND is a bad choice.

The probability of getting any of the edge values (e.g. 1 and 10) is half the probability of getting any other value (e.g. 2 to 9).
It is 0.0555... (1/18) Vs. 0.111... (1/9)

[  1,1.5) --> 1
[1.5,2.5) --> 2
.
.
.
[8.5,9.5) --> 9
[9.5, 10) --> 10

select          n,count(*)
from           (select          round(dbms_random.value(1, 10)) as n
                from            dual
                connect by      level <= 100000
                )
group by        n
order by        n
;
 

    N   COUNT(*)
    1   5488
    2   11239
    3   11236
    4   10981
    5   11205
    6   11114
    7   11211
    8   11048
    9   10959
    10  5519

My recommendation is to use FLOOR on dbms_random.value(1,N+1)

    select          n,count(*)
    from           (select          floor(dbms_random.value(1, 11)) as n
                    from            dual
                    connect by      level <= 100000
                    )
    group by        n
    order by        n   
    ;              
                   

N   COUNT(*)
1   10091
2   10020
3   10020
4   10021
5   9908
6   10036
7   10054
8   9997
9   9846
10  10007              
like image 56
David דודו Markovitz Avatar answered Nov 15 '22 10:11

David דודו Markovitz


If you want to select one randomly:

declare v_group_count number;
  v_group_id number;
begin 
  SELECT GroupId into v_group_id
  FROM (SELECT DISTINCT GroupId
        FROM MyTable
        ORDER BY dbms_random.value
       ) t
  WHERE rownum = 1
end;
like image 25
Gordon Linoff Avatar answered Nov 15 '22 09:11

Gordon Linoff