Should I use distinct in my queries?


Where I work, I was recently told that using distinct in your queries is the sign of a bad programmer. So I am wondering whether the only way to avoid it is to use a group by.

It was my understanding that distinct works very similarly to a group by, except in how it is read. Distinct checks each individual selected row, whereas a group by does the same thing, only for the result as a whole.

Keep in mind I only do reporting. I do not create or alter the data. So my question is: for best practices, should I be using distinct or group by? If neither, is there an alternative? Maybe group by should be used in more complex queries than my non-real example here, but you get the idea. I could not find an answer that really explained why I should or should not use distinct in my queries.

select distinct
    spriden_user_id as "ID",
    spriden_last_name as "last",
    spriden_first_name as "first",
    spriden_mi_name as "MI",
    spraddr_street_line1 as "Street",
    spraddr_street_line2 as "Street2",
    spraddr_city as "city",
    spraddr_stat_code as "State",
    spraddr_zip as "zip"
from spriden, spraddr
where spriden_user_id = spraddr_id
and spraddr_mail_type = 'MA'

select
    spriden_user_id as "ID",
    spriden_last_name as "last",
    spriden_first_name as "first",
    spriden_mi_name as "MI",
    spraddr_street_line1 as "Street",
    spraddr_street_line2 as "Street2",
    spraddr_city as "city",
    spraddr_stat_code as "State",
    spraddr_zip as "zip"
from spriden, spraddr
where spriden_user_id = spraddr_id
and spraddr_mail_type = 'MA'
group by
    spriden_user_id,
    spriden_last_name,
    spriden_first_name,
    spriden_mi_name,
    spraddr_street_line1,
    spraddr_street_line2,
    spraddr_city,
    spraddr_stat_code,
    spraddr_zip


asked Nov 11 '15 13:11

Taku_

2 Answers

Databases are smart enough to recognize what you mean, so I expect both of your queries to perform equally well. What matters is that someone else maintaining your query knows what you meant. If you really meant to retrieve distinct records, use DISTINCT. If your intention was to do aggregation, use GROUP BY.

Take a look at this question. There are some nice answers that might help.
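
To check the equivalence on a small scale, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for your database, cut-down versions of the two tables, and made-up rows, so it says nothing about performance on your real system. It only shows that the two forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, made-up versions of the tables in the question.
    CREATE TABLE spriden (spriden_user_id TEXT, spriden_last_name TEXT);
    CREATE TABLE spraddr (spraddr_id TEXT, spraddr_city TEXT, spraddr_mail_type TEXT);
    INSERT INTO spriden VALUES ('A1', 'Smith'), ('A2', 'Jones');
    -- A1 has the same mailing address stored twice, so the join repeats a row.
    INSERT INTO spraddr VALUES
        ('A1', 'Boston', 'MA'), ('A1', 'Boston', 'MA'), ('A2', 'Denver', 'MA');
""")

distinct_sql = """
    SELECT DISTINCT spriden_user_id, spriden_last_name, spraddr_city
    FROM spriden JOIN spraddr ON spriden_user_id = spraddr_id
    WHERE spraddr_mail_type = 'MA'
"""
group_by_sql = """
    SELECT spriden_user_id, spriden_last_name, spraddr_city
    FROM spriden JOIN spraddr ON spriden_user_id = spraddr_id
    WHERE spraddr_mail_type = 'MA'
    GROUP BY spriden_user_id, spriden_last_name, spraddr_city
"""

# Sort both results so the comparison does not depend on row order.
distinct_rows = sorted(conn.execute(distinct_sql).fetchall())
group_by_rows = sorted(conn.execute(group_by_sql).fetchall())
print(distinct_rows == group_by_rows)  # True
```

Both queries collapse the repeated A1 row, which is why the choice between them is mostly about stating your intent.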


answered Oct 14 '22 19:10

zedfoxus

The answer provided by @zedfoxus is useful to understand the context.

However, I don't believe your query should require distinct records if the data is designed correctly.

It appears you are selecting the primary key of table spriden, so all that data should be unique. You're also joining onto the spraddr table; does that table really contain valid duplicate data? Or is there perhaps an additional join criterion that's required to filter out those duplicates?

This is why I get nervous about use of "distinct" - the spraddr table may include additional columns which you should use to filter out data, and "distinct" may be hiding that.

Also, you may be generating a massive result set which then has to be reduced by the "distinct" clause, which can cause performance issues. For instance, there might be 1 million rows in spraddr for each row in spriden, when you should be using an "is_current" flag to find the 2 or 3 "real" ones.
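
As a rough illustration of how "distinct" can hide a missing filter, here is a small sketch. It assumes SQLite via Python's built-in sqlite3 module, made-up rows, and a hypothetical is_current column on spraddr; your real table may have a different column (or none) that plays this role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spriden (spriden_user_id TEXT, spriden_last_name TEXT);
    -- is_current is a hypothetical column standing in for the flag mentioned above.
    CREATE TABLE spraddr (spraddr_id TEXT, spraddr_city TEXT,
                          spraddr_mail_type TEXT, is_current INTEGER);
    INSERT INTO spriden VALUES ('A1', 'Smith');
    INSERT INTO spraddr VALUES
        ('A1', 'Boston', 'MA', 0),   -- old address
        ('A1', 'Boston', 'MA', 0),   -- old address entered twice
        ('A1', 'Denver', 'MA', 1);   -- current address
""")

columns = "spriden_user_id, spriden_last_name, spraddr_city"
join = ("FROM spriden JOIN spraddr ON spriden_user_id = spraddr_id "
        "WHERE spraddr_mail_type = 'MA'")

plain_rows = conn.execute(f"SELECT {columns} {join}").fetchall()
distinct_rows = conn.execute(f"SELECT DISTINCT {columns} {join}").fetchall()
current_rows = conn.execute(f"SELECT {columns} {join} AND is_current = 1").fetchall()

print(len(plain_rows), len(distinct_rows), len(current_rows))  # 3 2 1
```

With this data, "distinct" removes the exact duplicate but still reports the stale Boston address. Only the extra filter returns the single row a report would actually want, and it does so without needing "distinct" at all.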

Finally, I get nervous when I see "group by" used as a substitute for distinct, not because it's "wrong", but because stylistically, I believe group by should be used for aggregate functions. That's just a personal preference.
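
For contrast, this is the kind of query where I would expect to see "group by": one with an aggregate function. The sketch below again assumes SQLite via Python's built-in sqlite3 module and made-up rows, and counts mailing addresses per person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, made-up version of the address table in the question.
    CREATE TABLE spraddr (spraddr_id TEXT, spraddr_city TEXT, spraddr_mail_type TEXT);
    INSERT INTO spraddr VALUES
        ('A1', 'Boston', 'MA'), ('A1', 'Denver', 'MA'), ('A2', 'Austin', 'MA');
""")

# GROUP BY defines the groups; COUNT(*) is computed once per group.
address_counts = conn.execute("""
    SELECT spraddr_id, COUNT(*) AS address_count
    FROM spraddr
    WHERE spraddr_mail_type = 'MA'
    GROUP BY spraddr_id
    ORDER BY spraddr_id
""").fetchall()

print(address_counts)  # [('A1', 2), ('A2', 1)]
```

A count like this is also a quick way to find out which IDs have more than one matching address row before reaching for "distinct".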

answered Oct 14 '22 21:10

Neville Kuyt
