Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

INSERT INTO SELECT strange order using UNION

I had a typical non-normalized table (tempTable) with multiple numbered columns (rep1,rep2,...). So i wrote a script to insert the non-normalized data into a normalized table (myTable):

insert into myTable
select idRep,rep FROM
(
    select idRep, ISNULL(rep1,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep2,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep3,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep4,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep5,'') as rep FROM tempTable
) as t

Note: The table myTable also contains an auto-incremented IDENTITY column as its PRIMARY KEY.

The order rep1, rep2, rep3, rep4, rep5 is important in my scenario. Strangely, when I executed the script, the data wasn't inserted in the correct order such as the auto-generated id '1000' had the value from 'rep3' and the id '1001' had the value from 'rep1'.

Why is that? How was the script executed?

like image 332
Francis P Avatar asked Dec 07 '22 12:12

Francis P


1 Answers

The reason it is not going in the order you expect when using UNION is that union attempts to impose uniquness, so it is processing all of those rows together and bringing them out in the order most convenient for the engine.

If you switch to UNION ALL (which does not try to impose uniqueness) as Parado suggested it will not do the processing and they will go into the table in the order you put them in, almost all the time. This however is not gaurunteed and certain very unusual circumstances going on in other processes (especially ones that somehow touch on your tempTable) can affect it.

If you use an order by as Kash suggests then that will gauruntee the order of the ids (which can matter), but not technically the order that the rows get inserted (which very rarely matters in practice).

There is a good summary of some of this on MSDN.

So, that takes care of the why. As for the how to get what you actually want, I would use Kash's suggestion of adding a column to use with an order by clause, but I would use UNION ALL instead of UNION. Using UNION is like adding and implicit "distinct" requirement, which takes up processor cycles and makes the query plan more complicated.

like image 145
TimothyAWiseman Avatar answered Jan 01 '23 22:01

TimothyAWiseman