Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SELECT TOP N with UNION and ORDER BY

Given the example below, why does using using UNION ALL with two SELECT TOP 5 statements in Query 2 below seemingly not respect the ORDER BY clause?

Query 1 returns the expected results but does not include the required union. Query 2 demonstrates the unexpected behavior. Query 3 is the workaround I'm currently using to get the desired results.

CREATE TABLE #T1 ([ID] int IDENTITY(1,1), [Description] varchar(100), [Inactive] bit);
CREATE TABLE #T2 ([ID] int IDENTITY(1,1), [Description] varchar(100), [Inactive] bit);

INSERT INTO #T1([Description], [Inactive]) VALUES ('One', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Two', 0);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Three', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Four', 0);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Five', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Six', 0);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Seven', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Eight', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Nine', 1);
INSERT INTO #T1([Description], [Inactive]) VALUES ('Ten', 0);

-- Query 1, works as expected giving all 4 records with Inactive = 0 plus one more
SELECT TOP 5 [ID], [Description], [Inactive]
FROM #T1
ORDER BY [Inactive], [Description]; 

-- Query 2, does not work as expected, as only 2 of the Inactive = 0 records are present
SELECT TOP 5 [ID], [Description], [Inactive]
FROM #T1
UNION ALL
SELECT TOP 5 [ID], [Description], [Inactive]
FROM #T2
ORDER BY [Inactive], [Description]; 

-- Query 3, Workaround to produce desired results
WITH T1 AS (
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T1
    ORDER BY [Inactive], [Description]
),
T2 AS (
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T2
    ORDER BY [Inactive], [Description]
)
SELECT [ID], [Description], [Inactive] FROM T1
UNION ALL
SELECT [ID], [Description], [Inactive] FROM T2
ORDER BY [Inactive], [Description]; 

DROP TABLE #T1;
DROP TABLE #T2;

Obviously, the workaround is working for me but I'd like to understand why Query 2 does not do what I expected it to. In case you're wondering why I am bothering with the empty table #T2, the results are actually limited by a WHERE clause in my production example - however leaving it empty here serves the purpose of providing a comparable example without bothering to populate it.

If you do populate #T2 with the following, I find the results to be just as strange - Query 2 only gives four results with Inactive = 0.

INSERT INTO #T2([Description], [Inactive]) VALUES ('Eleven', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Twelve', 0);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Thirteen', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Fourteen', 0);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Fifteen', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Sixteen', 0);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Seventeen', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Eighteen', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Nineteen', 1);
INSERT INTO #T2([Description], [Inactive]) VALUES ('Twenty', 0);

I have run this script with the same results on SQL Server 2014 and SQL Server 2008 R2.

like image 347
izzy Avatar asked Jun 07 '17 21:06

izzy


1 Answers

A Union query works thus: execute the queries, then apply the order by clause. So with

SELECT TOP 5 [ID], [Description], [Inactive]
FROM #T1
UNION ALL
SELECT TOP 5 [ID], [Description], [Inactive]
FROM #T2
ORDER BY [Inactive], [Description]; 

you select five arbitrarily chosen records from #T1 plus five arbitrarily chosen records from #T2 and then you order these. So you need subqueries or with clauses. E.g.:

SELECT * FROM
(
  (
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T1
    ORDER BY [Inactive], [Description]
  )
  UNION ALL
  (
    SELECT TOP 5 [ID], [Description], [Inactive]
    FROM #T2
    ORDER BY [Inactive], [Description]
  )
) t;

So your workaround is not a workaround at all, but the proper query.

like image 146
Thorsten Kettner Avatar answered Oct 19 '22 14:10

Thorsten Kettner