Shuffle a column between rows

Tags: sql, oracle

How can I efficiently shuffle the values of one column across the rows of a large table (1 to 5 million records)? The column is known to hold unique values, but you can assume that all constraints have been removed for this purpose. My main difficulty is that I am updating the same column I am selecting from. I want to do this in PL/SQL so that I can take additional actions programmatically, such as logging or updating other tables.

**Original table:**
+----+-----------+
| id | fname     |
+----+-----------+
|  1 | mike      |
|  2 | ricky     |
|  3 | jane      |
|  4 | august    |
|  6 | dave      |
|  9 | Jérôme    |
+----+-----------+

**Possible output:**
+----+-----------+
| id | fname     |
+----+-----------+
|  1 | dave      |
|  2 | jane      |
|  3 | mike      |
|  4 | ricky     |
|  6 | Jérôme    |
|  9 | august    |
+----+-----------+

My latest attempts used a cursor with over (order by dbms_random.value), followed by a merge or update, perhaps keyed on rownum. Could I get around the restriction on modifying the table I am selecting from by creating a temporary table of some sort? I'm fairly confident Oracle has a neat way to do this, but my SQL skills are limited to the basic CRUD commands.

The full solution is here, based on Gordon's answer:

merge into t
using (
select t.id, t2.name
from (select t.*, rownum as seqnum
      from t
     ) t join
     (select t.*, row_number() over (order by dbms_random.value) as seqnum
      from t
     ) t2
     on t.seqnum = t2.seqnum
) src
on (t.id = src.id)
when matched then update set t.name = src.name;
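Since the goal was to run this from PL/SQL and log the result, here is a minimal sketch of the same merge wrapped in an anonymous block. It assumes the table `t` with columns `id` and `name` as used above; the variable `l_rows`, the inline view aliases `a` and `b`, and the `dbms_output` message are illustrative choices, not part of Gordon's answer. Treat it as a starting point rather than a finished routine.

```sql
-- Client-side setting (SQL*Plus / SQL Developer) so dbms_output text is shown
set serveroutput on

declare
  l_rows pls_integer;  -- rows touched by the merge (illustrative variable)
begin
  merge into t
  using (
    select a.id, b.name
    from (select t.id, rownum as seqnum          -- fixed numbering
          from t
         ) a
    join (select t.name,
                 row_number() over (order by dbms_random.value) as seqnum  -- random numbering
          from t
         ) b
      on a.seqnum = b.seqnum
  ) src
  on (t.id = src.id)
  when matched then update set t.name = src.name;

  -- sql%rowcount must be read right after the statement it describes
  l_rows := sql%rowcount;
  dbms_output.put_line('shuffled ' || l_rows || ' rows');

  -- Any extra logging or updates to other tables would go here,
  -- inside the same transaction, before the commit.
  commit;
exception
  when others then
    rollback;  -- undo the partial work, then re-raise the original error
    raise;
end;
/
```

Keeping the commit inside the block means the shuffle and any logging you add succeed or fail together.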
user1 asked Dec 07 '16 03:12


2 Answers

You can do a self join, using random row numbers:

select t.id, t2.name
from (select t.*, row_number() over (order by dbms_random.value) as seqnum
      from t
     ) t join
     (select t.*, row_number() over (order by dbms_random.value) as seqnum
      from t
     ) t2
     on t.seqnum = t2.seqnum;

Actually, only one side needs to be randomized; pairing a fixed numbering with a random one already produces a random permutation:

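select t.id, t2.name
from (select t.*, rownum as seqnum   -- arbitrary but fixed numbering
      from t
     ) t join
     (select t.*, row_number() over (order by dbms_random.value) as seqnum   -- random numbering
      from t
     ) t2
     on t.seqnum = t2.seqnum;   -- row n receives the name ranked n in the random order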
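The question also floated the idea of a temporary table. As a rough sketch of that route, the same join can feed a `create table ... as select`, so the shuffled pairs are staged first and applied (or inspected) afterwards. The table name `t_shuffled` and the aliases `a` and `b` are hypothetical, and this assumes `t` has the columns `id` and `name` used above.

```sql
-- Stage the shuffled (id, name) pairs in a separate table.
-- t_shuffled is a hypothetical name chosen for this sketch.
create table t_shuffled as
select a.id, b.name
from (select id, rownum as seqnum                 -- fixed numbering
      from t
     ) a
join (select name,
             row_number() over (order by dbms_random.value) as seqnum  -- random numbering
      from t
     ) b
  on a.seqnum = b.seqnum;

-- Apply the staged values back to the original table.
merge into t
using t_shuffled s
on (t.id = s.id)
when matched then update set t.name = s.name;

-- Remove the staging table once the result has been checked.
drop table t_shuffled;
```

Staging costs an extra write of the table, but it leaves a copy of the new assignment that can be reviewed or logged before the original rows are touched.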
Gordon Linoff answered Sep 30 '22 02:09


Taken directly from this answer (it was mine, so I believe I am allowed to reuse it): https://community.oracle.com/thread/3995265

Preparation

create table original_table ( id number, name varchar2(30) );

insert into original_table
  select 1, 'mike'   from dual union all
  select 2, 'ricky'  from dual union all
  select 3, 'jane'   from dual union all
  select 4, 'august' from dual union all
  select 6, 'dave'   from dual union all
  select 9, 'Jérôme' from dual
;

select * from original_table;

ID  NAME
--  ------
1   mike
2   ricky
3   jane
4   august
6   dave
9   Jérôme

Updating the rows with permuted names:

merge into original_table o
  using (
    with
         helper ( id, rn, rand_rn ) as (
           select id,
                  row_number() over (order by id),
                  row_number() over (order by dbms_random.value())
           from   original_table
         )
    select ot.name, h2.id
    from   original_table ot inner join helper h1 on      ot.id = h1.id
                             inner join helper h2 on h1.rand_rn = h2.rn
  ) p
on (o.id = p.id)
when matched then update set o.name = p.name
;

select * from original_table;

ID  NAME
--  ------
1   ricky
2   dave
3   Jérôme
4   jane
6   august
9   mike
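A quick sanity check after the merge, as a sketch against the `original_table` defined above: if the names were unique before the shuffle, the two counts below should still be equal afterwards. The column aliases are illustrative. This only catches duplicated or lost values; it does not prove that the set of names is unchanged.

```sql
-- Both counts should match when every name still appears exactly once.
select count(*)             as total_rows,
       count(distinct name) as distinct_names
from   original_table;
```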
mathguy answered Sep 30 '22 03:09