
Best way to understand complex SQL statements?

Tags: sql, database

Does anyone have a method to understand complex SQL statements? When reading structured / OO code there are usually layers of abstraction that help you break it down into manageable chunks. Often in SQL, though, it seems that you have to keep track of what's going on in multiple parts of a query all at the same time.

The impetus for this question is the SQL query discussed in this question about a complex join. After staring at the answer queries for a number of minutes I finally decided to step through the query using particular records to see what was going on. That was the only way I could think of to understand the query piece by piece.

Is there a better way to break a SQL query down into manageable pieces?

Eric Ness asked Dec 18 '08 20:12




1 Answer

When I look at a complex bit of SQL code, this is what I do.

First, if it is an update or delete, I add code (if it isn't there and commented out) to make it a select. Never try an update or delete for the first time without seeing the results in a select first. If it is an update, I make sure the select shows the current value and what I will be setting it to in order to make sure that I'm getting the desired result.
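A minimal sketch of that preview step, using Python's built-in sqlite3 module so it can be run as-is. The `products` table, its columns, and the doubling of prices are all made up for illustration; the point is only that the select reuses the update's where clause and shows the current value next to the value it would be set to.

```python
import sqlite3

# Hypothetical table, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT, price INTEGER);
    INSERT INTO products VALUES
        (1, 'widget', 'tools', 10),
        (2, 'gadget', 'tools', 20),
        (3, 'novel',  'books', 8);
""")

# The statement we eventually want to run:
#   UPDATE products SET price = price * 2 WHERE category = 'tools';
# First run it as a SELECT with the same WHERE clause, showing the
# current value beside the value it would be set to.
preview = conn.execute("""
    SELECT id, name, price AS current_price, price * 2 AS new_price
    FROM products
    WHERE category = 'tools'
    ORDER BY id
""").fetchall()
for row in preview:
    print(row)

# Only once the preview looks right, run the real update and check
# that it touched the same number of rows the preview returned.
cur = conn.execute("UPDATE products SET price = price * 2 WHERE category = 'tools'")
updated_rows = cur.rowcount
print(updated_rows, "rows updated")
```

Comparing the update's row count with the number of rows in the preview is a cheap extra check that both statements filtered the same way.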

Understanding the joins is critical to understanding complex SQL. For every join I ask myself: why is this here? There are four basic reasons: you need a column for the select, you need a field for the where clause, you need the join as a bridge to a third table, or you need to join to the table to filter records (such as retrieving details on customers who have orders without needing the order details; this can often be done better with an EXISTS condition in the where clause).

If it is a left or right join (I tend to rewrite so everything is a left join, which makes life simpler), I consider whether an inner join would work. Why do I need a left join? If I don't know the answer, I will run it both ways and see what the difference is in the data.

If there are derived tables, I look at those first (running just that part of the select to see what the result is) to understand why they are there. If there are sub-queries, I try to understand them, and if they are slow I try to convert them to derived tables instead, as those are often much faster.
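Two of these checks can be sketched with Python's sqlite3 module. The `customers` and `orders` tables are hypothetical. The first pair of queries contrasts a join used only as a filter with an EXISTS condition; the second pair runs the same query as an inner join and as a left join and looks at the rows that differ.

```python
import sqlite3

# Hypothetical schema and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 75), (12, 2, 20);
""")

# Join used only as a filter: a customer is repeated once per order.
joined = conn.execute("""
    SELECT c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# EXISTS states the intent ("has at least one order") and returns
# each customer once.
with_exists = conn.execute("""
    SELECT c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

# "Run it both ways": the rows that only the left join returns show
# what the left join is there for.
inner_rows = set(conn.execute("""
    SELECT c.name, o.id FROM customers c
    JOIN orders o ON o.customer_id = c.id
"""))
left_rows = set(conn.execute("""
    SELECT c.name, o.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
"""))
only_in_left = left_rows - inner_rows

print(joined)
print(with_exists)
print(only_in_left)
```

Here the only extra row from the left join is the customer with no orders, so the question becomes whether the report is supposed to include such customers.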

Next, I look at the where clauses. This is one place where a solid foundation in your particular database will come in handy. For instance, I know in my databases on what occasions I might need to see only the mailing address and on what occasions I might need to see other addresses. This helps me to know if something is missing from the where clause. Otherwise I consider each item in the where clause and figure out why it would need to be there, then I consider whether there is anything missing that should be there. After looking it over, I consider if I can make adjustments to make the query sargable (that is, written so that the database can use an index to evaluate the where conditions).
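As a rough illustration of sargability, the sketch below uses SQLite through Python's sqlite3 module and a hypothetical `orders` table with an index on `order_date`. Wrapping the indexed column in a function forces a scan, while an equivalent range condition on the bare column lets the index be used. The exact wording of `EXPLAIN QUERY PLAN` output varies between SQLite versions, and other databases expose their plans differently, so treat the plan text as indicative only.

```python
import sqlite3

# Hypothetical table and index, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, total INTEGER);
    CREATE INDEX idx_orders_date ON orders (order_date);
    INSERT INTO orders VALUES
        (1, '2007-12-31', 10),
        (2, '2008-03-15', 20),
        (3, '2008-12-18', 30),
        (4, '2009-01-01', 40);
""")

def plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Not sargable: the function call hides order_date from the index.
non_sargable = """
    SELECT id FROM orders
    WHERE substr(order_date, 1, 4) = '2008'
"""

# Sargable: the same filter written as a range on the bare column.
sargable = """
    SELECT id FROM orders
    WHERE order_date >= '2008-01-01' AND order_date < '2009-01-01'
"""

non_sargable_plan = plan(non_sargable)
sargable_plan = plan(sargable)
print(non_sargable_plan)  # expect a SCAN step
print(sargable_plan)      # expect a SEARCH step using idx_orders_date

# Both forms must return the same rows; only the access path differs.
non_sargable_rows = sorted(conn.execute(non_sargable).fetchall())
sargable_rows = sorted(conn.execute(sargable).fetchall())
```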

I then consider any complex bits of the select list. What does that case statement do? Why is there a subquery? What do those functions do? (I always look up the function code for any function I'm not already familiar with.) Why is there a distinct? Can it be gotten rid of by using a derived table or an aggregate function with a group by?
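To make the distinct question concrete, here is a small sketch with Python's sqlite3 module and hypothetical `customers` and `orders` tables. In the first query the distinct is only there to hide the fact that the join returns one row per order. The second query aggregates the orders in a derived table, so the join is one to one and no distinct is needed.

```python
import sqlite3

# Hypothetical schema and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 75), (12, 2, 20);
""")

# DISTINCT collapses the duplicate rows that the join to orders creates.
with_distinct = conn.execute("""
    SELECT DISTINCT c.id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# The derived table has exactly one row per customer, so the outer
# query needs no DISTINCT and can also report the order count.
with_derived = conn.execute("""
    SELECT c.id, c.name, o.order_count
    FROM customers c
    JOIN (SELECT customer_id, COUNT(*) AS order_count
          FROM orders
          GROUP BY customer_id) o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(with_distinct)
print(with_derived)
```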

Finally and MOST important, I run the select and determine if the results look correct based on my knowledge of the business. If you don't understand your business, you won't know if the query is correct. A syntactically correct query does not necessarily return the right results. Often there is a part of your existing user interface that you can use as a guide to whether your results are correct. If I have a screen that shows the orders for a customer and I'm doing a report that includes the customer orders, I might spot check a few individual customers to make sure it is showing the right result.

If the current query is filtering incorrectly, I will remove bits of it to find out what is getting rid of the records I want or adding ones I don't want. Often you will find that the join is one to many and you need one to one (use a derived table in this case!), or that a condition you thought you needed in the where clause doesn't hold for all the data you need, or that some piece of the where clause is missing. It helps to have all the fields from the where clause in the select while you do this (if they weren't there already). It may even help to show all the fields from all the joined tables and really look at the data. When I do this, I often add a small bit to the where clause to grab just some of the records that shouldn't be there rather than all of them.
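The one-to-many problem can be sketched as follows, again with Python's sqlite3 module and made-up `customers`, `orders`, and `phones` tables. Joining a customer to two one-to-many tables multiplies the rows, which inflates the sum. Narrowing the where clause to one suspicious customer and selecting the raw joined columns shows the repetition, and aggregating each child table in its own derived table fixes it.

```python
import sqlite3

# Hypothetical schema and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    CREATE TABLE phones (id INTEGER PRIMARY KEY, customer_id INTEGER, number TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 75), (12, 2, 20);
    INSERT INTO phones VALUES
        (100, 1, '555-0100'), (101, 1, '555-0101'), (102, 2, '555-0102');
""")

# Suspect report: Ann's two orders are repeated once per phone number,
# so her total and her phone count both come out doubled.
naive = conn.execute("""
    SELECT c.name, SUM(o.total) AS order_total, COUNT(p.id) AS phone_count
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    JOIN phones p ON p.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# Debugging step: drop the aggregation, show the joined columns, and
# restrict the where clause to one customer that looks wrong.
inspect = conn.execute("""
    SELECT c.name, o.id, o.total, p.number
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    JOIN phones p ON p.customer_id = c.id
    WHERE c.id = 1
    ORDER BY o.id, p.id
""").fetchall()

# Fix: aggregate each one-to-many table in a derived table so that
# every join is one to one.
fixed = conn.execute("""
    SELECT c.name, o.order_total, p.phone_count
    FROM customers c
    JOIN (SELECT customer_id, SUM(total) AS order_total
          FROM orders GROUP BY customer_id) o ON o.customer_id = c.id
    JOIN (SELECT customer_id, COUNT(*) AS phone_count
          FROM phones GROUP BY customer_id) p ON p.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(naive)
print(inspect)
print(fixed)
```

Note that Bob's row is correct even in the naive query because he has only one phone number, which is why spot checking a single customer is not always enough.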

One sneaky thing that will break a lot of queries is the where clause referencing a field in a table on the right side of a left join. For rows with no match, that field is NULL, so an ordinary comparison against it fails and the row is dropped; in effect the left join becomes an inner join. If you really need a left join, you should add those kinds of conditions to the join itself.
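A small sketch of this pitfall, using Python's sqlite3 module with hypothetical `customers` and `orders` tables and an invented `status` column. The two queries differ only in where the condition on `o.status` is placed.

```python
import sqlite3

# Hypothetical schema and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 'open'), (11, 1, 'closed'), (12, 2, 'closed');
""")

# Condition in the WHERE clause: customers without an open order have
# NULL (or a non-matching value) in o.status and are filtered out, so
# this behaves like an inner join.
filtered_in_where = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'open'
    ORDER BY c.id, o.id
""").fetchall()

# Condition in the join itself: every customer is kept, and order
# columns are NULL when there is no open order.
filtered_in_join = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id AND o.status = 'open'
    ORDER BY c.id, o.id
""").fetchall()

print(filtered_in_where)
print(filtered_in_join)
```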

HLGEM answered Sep 22 '22 10:09