Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Intersect in SQL Server

Is there a way to use intersect without selecting distinct values only? Something like INTERSECT ALL.

For example, consider table A and B

A --> 1, 1, 1, 2, 3, 4

B --> 1, 1, 2

Would result in

Result --> 1, 1, 2

EDIT

I think this link explains well what I want. This other link is also intersting to understand the question. Or this other link explains event better.

EDIT 2

Suppose the tables:

Table A

╔════════╦════╦═══╦════╦════╗
║   A    ║ B  ║ C ║ D  ║ E  ║
╠════════╬════╬═══╬════╬════╣
║ Car    ║ 10 ║ 1 ║ OK ║ -1 ║
║ Car    ║ 10 ║ 1 ║ OK ║ -1 ║
║ Car    ║ 10 ║ 1 ║ OK ║ -1 ║
║ House  ║ 10 ║ 1 ║ NO ║ -5 ║
║ Monkey ║ 15 ║ 1 ║ OK ║ -1 ║
║ Dog    ║  3 ║ 1 ║ OK ║ -1 ║
╚════════╩════╩═══╩════╩════╝

Table B

╔═════╦════╦═══╦════╦════╗
║  A  ║ B  ║ C ║ D  ║ E  ║
╠═════╬════╬═══╬════╬════╣
║ Car ║ 10 ║ 1 ║ OK ║ -1 ║
║ Car ║ 10 ║ 1 ║ OK ║ -1 ║
║ Car ║ 15 ║ 1 ║ OK ║ -1 ║
║ Dog ║  3 ║ 1 ║ OK ║ -1 ║
╚═════╩════╩═══╩════╩════╝

The answer for intersect (select * from A INTERSECT select * from B) would be:

╔═════╦════╦═══╦════╦════╗
║  A  ║ B  ║ C ║ D  ║ E  ║
╠═════╬════╬═══╬════╬════╣
║ Car ║ 10 ║ 1 ║ OK ║ -1 ║
║ Dog ║  3 ║ 1 ║ OK ║ -1 ║
╚═════╩════╩═══╩════╩════╝

Because it takes only distinct values. What I want is taking common rows, just like:

╔═════╦════╦═══╦════╦════╗
║  A  ║ B  ║ C ║ D  ║ E  ║
╠═════╬════╬═══╬════╬════╣
║ Car ║ 10 ║ 1 ║ OK ║ -1 ║
║ Car ║ 10 ║ 1 ║ OK ║ -1 ║
║ Dog ║  3 ║ 1 ║ OK ║ -1 ║
╚═════╩════╩═══╩════╩════╝

Observe I don't need to know what I have to link (the connection is positional, just like INTERSECT). The ID would be something constructed using all columns (the link between table are all columns, based on their position).

like image 790
Nizam Avatar asked Sep 18 '14 20:09

Nizam


People also ask

Does INTERSECT work in SQL Server?

The SQL Server (Transact-SQL) INTERSECT operator is used to return the records that are in common between two SELECT statements or data sets. If a record exists in one query and not in the other, it will be omitted from the INTERSECT results. It is the intersection of the two SELECT statements.

What is the difference between UNION and INTERSECT in SQL?

The difference between UNION and INTERSECT is that UNION gets results from both queries and combines them, while INTERSECT gets results that only exist in both queries. So, if Query 1 returns records A and B, and Query 2 returns records B and C, UNION would return A, B and C. INTERSECT would only return B.

What is difference between inner join and INTERSECT in SQL?

They are very different, even in your case. The INNER JOIN will return duplicates, if id is duplicated in either table. INTERSECT removes duplicates. The INNER JOIN will never return NULL , but INTERSECT will return NULL .

What is difference between INTERSECT and minus in SQL?

INTERSECT compares the data between tables and returns only the rows of data that exist in both tables. MINUS compares the data between tables and returns the rows of data that exist only in the first table you specify.


1 Answers

In SQL Server, INTERSECT works on distinct rows only. If you want it to distinguish between duplicate rows, you will need to make the rows distinct. The only way to do so I can think of is to add another column and populate it with unique values per duplicate, but in such a way that the resulting rows would be matchable across different tables.

The problem, however, is that so far there is no universal syntax for that. For instance, you could use ROW_NUMBER() to enumerate every duplicate, but you would have to write out its PARTITION BY clause for every case individually: there is no PARTITION BY *, not in SQL Server at least.

Anyway, for the purpose of illustration, here is how the ROW_NUMBER method would look like:

SELECT
  A, B, C, D, E,
  ROW_NUMBER() OVER (PARTITION BY A, B, C, D, E ORDER BY (SELECT 1))
FROM
  dbo.A

INTERSECT

SELECT
  A, B, C, D, E,
  ROW_NUMBER() OVER (PARTITION BY A, B, C, D, E ORDER BY (SELECT 1))
FROM
  dbo.B
;

As written above, the query would also return an extra column, the row number column, in the output. If you wanted to suppress it, you would need to make the query more complex:

SELECT
  A, B, C, D, E
FROM
  (
    SELECT
      A, B, C, D, E,
      rn = ROW_NUMBER() OVER (PARTITION BY A, B, C, D, E ORDER BY (SELECT 1))
    FROM
      dbo.A

    INTERSECT

    SELECT
      A, B, C, D, E,
      rn = ROW_NUMBER() OVER (PARTITION BY A, B, C, D, E ORDER BY (SELECT 1))
    FROM
      dbo.B
  ) AS s
;

And just to clarify, when I said above there was no universal syntax, I meant you could not do it without resorting to dynamic SQL. With dynamic SQL, a great many things are possible but such a solution would be much more complex and, in my opinion, much less maintainable.

Again, to illustrate the point, this is an example of how you could solve it with dynamic SQL:

DECLARE
  @table1 sysname,
  @table2 sysname,
  @columns nvarchar(max),
  @sql nvarchar(max)
;

SET @table1 = 'dbo.A';
SET @table2 = 'dbo.B';

-- collecting the columns from one table only,
-- assuming the structures of both tables are identical
-- if the structures differ, declare and populate
-- @columns1 and @columns2 separately
SET @columns = STUFF(
  (
    SELECT
      N', ' + QUOTENAME(name)
    FROM
      sys.columns
    WHERE
      object_id = OBJECT_ID(@table1)
    FOR XML
      PATH (''), TYPE
  ).value('text()[1]', 'nvarchar(max)'),
  1,
  2,
  ''
);

SET @sql =
N'SELECT ' + @columns + N'
FROM
  (
    SELECT
      ' + @columns + N',
      ROW_NUMBER() OVER (PARTITION BY ' + @columns + N' ORDER BY (SELECT 1))
    FROM
      ' + @table1 + N'

    INTERSECT

    SELECT
      ' + @columns + N',
      ROW_NUMBER() OVER (PARTITION BY ' + @columns + N' ORDER BY (SELECT 1))
    FROM
      ' + @table2 + N'
  ) AS s
';

EXECUTE sp_executesql @sql;

You can probably see now what I meant by "much more complex" at least.

like image 137
Andriy M Avatar answered Sep 28 '22 18:09

Andriy M