Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does "SELECT DISTINCT a, b FROM..." return fewer records than "SELECT DISTINCT A + '|' + B FROM..."?

I have a query that's selecting a bunch of fields related to names and addresses of customers but it boils down to:

SELECT DISTINCT a, b, c, ... FROM big_dumb_flat_table

it returns a bunch of records (10986590). When I replace the commas in the select-list to format it as a pipe-separated concatenated string:

SELECT DISTINCT a + '|' + b + '|' + c + '|' + ... FROM big_dumb_flat_table

it's returning 248 more records. I've reassured myself that there are no pipes in any of the fields that could be screwing the fidelity of the returned set. What's going on here?

like image 363
clweeks Avatar asked Aug 16 '10 13:08

clweeks


2 Answers

Trailing spaces could cause this. For string comparisons these are ignored.

CREATE TABLE #T
(
a varchar(10),
b varchar(10),
c varchar(10)
)

INSERT INTO #T
SELECT 'a ' as a, 'b' as b, 'c ' as c union all
SELECT 'a' as a, 'b' as b, 'c ' as c

SELECT DISTINCT a, b, c  
FROM #T /*1 result*/

SELECT DISTINCT a + '|' + b + '|' + c + '|'   
FROM #T /*2 results*/


SELECT DISTINCT LTRIM(RTRIM(a)) + '|' + LTRIM(RTRIM(b)) + '|' +
                LTRIM(RTRIM(c)) + '|'   
FROM #T /*1 result*/
like image 141
Martin Smith Avatar answered Nov 09 '22 22:11

Martin Smith


The really aren't any scenarios that I can think of that would get you MORE records, only fewer. I would simplify the query by only selecting a + '|', then add more columns as you go.

like image 20
bzarah Avatar answered Nov 09 '22 23:11

bzarah