I have a query that's selecting a bunch of fields related to names and addresses of customers but it boils down to:
SELECT DISTINCT a, b, c, ... FROM big_dumb_flat_table
it returns a bunch of records (10986590). When I replace the commas in the select-list to format it as a pipe-separated concatenated string:
SELECT DISTINCT a + '|' + b + '|' + c + '|' + ... FROM big_dumb_flat_table
it's returning 248 more records. I've reassured myself that there are no pipes in any of the fields that could be screwing the fidelity of the returned set. What's going on here?
Trailing spaces could cause this. For string comparisons these are ignored.
CREATE TABLE #T
(
a varchar(10),
b varchar(10),
c varchar(10)
)
INSERT INTO #T
SELECT 'a ' as a, 'b' as b, 'c ' as c union all
SELECT 'a' as a, 'b' as b, 'c ' as c
SELECT DISTINCT a, b, c
FROM #T /*1 result*/
SELECT DISTINCT a + '|' + b + '|' + c + '|'
FROM #T /*2 results*/
SELECT DISTINCT LTRIM(RTRIM(a)) + '|' + LTRIM(RTRIM(b)) + '|' +
LTRIM(RTRIM(c)) + '|'
FROM #T /*1 result*/
The really aren't any scenarios that I can think of that would get you MORE records, only fewer. I would simplify the query by only selecting a + '|', then add more columns as you go.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With