Is there a way to modularize SQL code so that it is more readable and testable?
My SQL code often becomes a long, complicated series of nested joins, inner joins, etc. that are hard to write and hard to debug. By contrast, in a procedural language like JavaScript or Java, one would pinch off discrete elements as separate functions that you would call by name.
Yes, one could write each as an entirely separate query stored in the database, or as a stored procedure, but often I don't want to change or clutter the database. I just want to query it, especially if the DBA doesn't wish to grant write permissions to all users.
For instance, conceptually a complex query might be easily described in pseudocode like this:
(getCustomerProfile)
left join
(getSummarizedCustomerTransactionHistory)
using (customerId)
left join
(getGeographicalSummaries)
using (region, vendor)
...
I realize that a lot is written on the topic from a theoretical vantage (a few links below), but I'm just looking for a way to make the code easier to write correctly, and easier to read once written. Perhaps just syntactic sugar to abstract the complexity from sight, if not from execution, that compiles down to the literal SQL I'm trying not to look at. By analogy...
And if the specific SQL flavor matters, most of my work is in PostgreSQL.
http://lambda-the-ultimate.org/node/2440
Code reuse and modularity in SQL
Are Databases and Functional Programming at odds?
In most databases, you can do what you want using CTEs (Common Table Expressions):
with CustomerProfile as (
getCustomerProfile
),
SummarizedCustomerTransactionHistory as (
getSummarizedCustomerTransactionHistory
),
GeographicalSummaries as (
getGeographicalSummaries
)
select <whatever>
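As a concrete sketch of the pattern above, here is a version that runs as-is in PostgreSQL without touching any tables. The `customers` and `transactions` CTEs are made-up sample data standing in for real tables, and all column names are assumptions for illustration only.

```sql
-- Sample data defined inline so the query needs no tables or write access.
-- In real use, drop these two CTEs and read from your own tables instead.
WITH customers (customer_id, name, region) AS (
    VALUES (1, 'Ada', 'EU'),
           (2, 'Grace', 'US')
),
transactions (customer_id, amount) AS (
    VALUES (1, 250.00),
           (1, 40.00)
),
-- Each named CTE plays the role of one "function" from the pseudocode.
customer_profile AS (
    SELECT customer_id, name, region
    FROM customers
),
transaction_summary AS (
    SELECT customer_id,
           count(*)    AS n_transactions,
           sum(amount) AS total_amount
    FROM transactions
    GROUP BY customer_id
)
-- The final query reads like the pseudocode: profile LEFT JOIN summary.
SELECT customer_id,
       p.name,
       p.region,
       coalesce(s.n_transactions, 0) AS n_transactions,
       coalesce(s.total_amount, 0)   AS total_amount
FROM customer_profile p
LEFT JOIN transaction_summary s USING (customer_id)
ORDER BY customer_id;
```

Each CTE can be tested on its own by temporarily replacing the final `SELECT` with `SELECT * FROM transaction_summary` (or whichever CTE you want to inspect).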
This works for a single query. It has the advantage that you can define a CTE once, but use it multiple times. Also, I often define a CTE called const that has constant values.
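A minimal sketch of the const idea, assuming PostgreSQL; the `transactions` data and the names `start_date` and `min_amount` are invented for illustration:

```sql
-- One-row CTE holding the query's "constants" in a single place.
WITH const AS (
    SELECT DATE '2024-01-01' AS start_date,
           100.00            AS min_amount
),
-- Hypothetical sample data; replace with a real table in practice.
transactions (customer_id, amount, created_at) AS (
    VALUES (1, 250.00, DATE '2024-02-10'),
           (1,  40.00, DATE '2024-03-05'),
           (2, 120.00, DATE '2023-12-20')
)
-- CROSS JOIN against the one-row const CTE makes its columns
-- available everywhere without multiplying rows.
SELECT t.customer_id, t.amount, t.created_at
FROM transactions t
CROSS JOIN const c
WHERE t.created_at >= c.start_date
  AND t.amount     >= c.min_amount;
```

With this sample data only the first row passes both filters. Changing a threshold then means editing one line in const rather than hunting through the query.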
The next step is to take these constructs and create views from them. This is especially useful when sharing code among multiple modules, to ensure consistent definitions. In some databases, you can put indexes on the views to "instantiate" (materialize) them, further optimizing processing.
Finally, I recommend wrapping inserts/updates/deletes in stored procedures. This allows you to have a consistent framework.
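As a rough illustration of promoting a CTE to a view in PostgreSQL, the sketch below uses a temporary table and a temporary view so that it leaves nothing behind when the session ends. The table and column names are assumptions, not part of the original question.

```sql
-- Hypothetical table, created as TEMP so the example cleans up after itself.
CREATE TEMP TABLE transactions (
    customer_id integer,
    amount      numeric,
    created_at  date
);

INSERT INTO transactions VALUES
    (1, 250.00, DATE '2024-02-10'),
    (1,  40.00, DATE '2024-03-05'),
    (2, 120.00, DATE '2023-12-20');

-- The body of the view is exactly what the CTE body would have been.
CREATE TEMP VIEW transaction_summary AS
SELECT customer_id,
       count(*)    AS n_transactions,
       sum(amount) AS total_amount
FROM transactions
GROUP BY customer_id;

-- Any later query can now use the view by name, like calling a function.
SELECT customer_id, total_amount
FROM transaction_summary
WHERE total_amount > 200;
```

In PostgreSQL specifically, the indexed variant would be a materialized view (`CREATE MATERIALIZED VIEW`), which stores its result, can carry indexes, and has to be refreshed with `REFRESH MATERIALIZED VIEW`. It cannot be built on temporary tables, so it needs regular tables and the matching privileges.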
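One possible shape for such a wrapper, assuming PostgreSQL 11 or later (which added `CREATE PROCEDURE` and `CALL`). The table, the procedure name `add_transaction`, and the validation rule are all hypothetical:

```sql
-- Hypothetical table that the procedure writes to.
CREATE TABLE transactions (
    transaction_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id    integer     NOT NULL,
    amount         numeric     NOT NULL,
    created_at     timestamptz NOT NULL DEFAULT now()
);

-- All inserts go through one entry point, so validation lives in one place.
CREATE PROCEDURE add_transaction(p_customer_id integer, p_amount numeric)
LANGUAGE plpgsql
AS $$
BEGIN
    -- Reject bad input before it reaches the table.
    IF p_amount <= 0 THEN
        RAISE EXCEPTION 'amount must be positive, got %', p_amount;
    END IF;

    INSERT INTO transactions (customer_id, amount)
    VALUES (p_customer_id, p_amount);
END;
$$;

-- Callers use the procedure instead of writing INSERT statements directly.
CALL add_transaction(1, 250.00);
```

Unlike the read-only examples, this one does need permission to create objects in the database, so it only applies where the DBA allows that.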
Two more comments, though. First, SQL is often used for transactional or reporting systems. Often, once you get the data in the right format for the purpose, the data speaks for itself. Your example might just be asking for a data mart that has three tables devoted to those three subject areas, which get populated once per week or once per day.
Second, SQL is not an ideal language for abstraction. With good practice, naming conventions, and indentation style, you can make it useful. I sorely miss certain things from "real" languages, such as macros, error handling (why data errors are so hard to identify and handle is beyond me), consistent methods for common functionality (can someone say group string concatenation), and some other features. That said, because it is data-centric and readily parallelizable, it is more useful for me than most other languages.