The power and expressivity of SQL is heavily underestimated. Since the introduction of window functions, a lot of non-strictly set-oriented calculations can be performed very easily and elegantly in the database.
Field calculations can be performed in a database by adding a formula field. Field calculations allow you to perform addition, subtraction, multiplication, and division operations on any numeric fields in your database.
As an example, Excel cells let you store strings of information like “January 3, 2020,” whereas relational databases using SQL would divide the date into three separate cells. Excel is more visual than SQL—instead of writing commands, users get to directly work with and manipulate their data in tables.
It depends on a lot of factors - but most crucially:
As always, if you do bring the data back to the app-server, minimising the columns and rows will be to your advantage. Making sure the query is tuned and appropriately indexed will help either scenario.
Re your note:
and then loop through the records
Looping through records is almost always the wrong thing to do in sql - writing a set-based operation is preferred.
As a general rule, I prefer to keep the database's job to a minimum "store this data, fetch this data" - however, there are always examples of scenarios where an elegant query at the server can save a lot of bandwidth.
Also consider: if this is computationally expensive, can it be cached somewhere?
If you want an accurate "which is better"; code it both ways and compare it (noting that a first draft of either is likely not 100% tuned). But factor in typical usage to that: if, in reality, it is being called 5 times (separately) at once, then simulate that: don't compare just a single "1 of these vs 1 of those".
Let me use a metaphor: if you want to buy a golden necklace in Paris, the goldsmith could sit in Cape Town or Paris, that is a matter of skill and taste. But you would never ship tons of gold ore from South Africa to France for that. The ore is processed at the mining site (or at least in the general area), only the gold gets shipped. The same should be true for apps and databases.
As far as PostgreSQL is concerned, you can do almost anything on the server, quite efficiently. The RDBMS excels at complex queries. For procedural needs you can choose from a variety of server-side script languages: tcl, python, perl and many more. Mostly I use PL/pgSQL, though.
Worst case scenario would be to repeatedly go to the server for every single row of a larger set. (That would be like shipping one ton of ore a time.)
Second in line, if you send a cascade of queries, each depending on the one before, while all of it could be done in one query or procedure on the server. (That's like shipping the gold, and each of the jewels with a separate ship, sequentially.)
Going back and forth between app and server is expensive. For server and client. Try to cut down on that, and you will win - ergo: use server side procedures and / or sophisticated SQL where necessary.
We just finished a project where we packed almost all complex queries into Postgres functions. The app hands over parameters and gets the datasets it needs. Fast, clean, simple (for the app developer), I/O reduced to a minimum ... a shiny necklace with a low carbon footprint.
In this case you are probably slightly better off doing the calculation in SQL as the database engine is likely to have a more efficient decimal arithmetic routines than Java.
Generally though for row level calculations there is not much difference.
Where it does make a difference is:
There's no black / white with respect to what parts of data access logic should be performed in SQL and what parts should be performed in your application. I like Mark Gravell's wording, distinguishing between
The power and expressivity of SQL is heavily underestimated. Since the introduction of window functions, a lot of non-strictly set-oriented calculations can be performed very easily and elegantly in the database.
Three rules of thumb should always be followed, regardless of the overall application architecture:
In my experience, with a decent DBA and some decent knowledge about your decent database, you won't run into your DBs CPU limits very soon.
Some further reading where these things are explained:
In general do things in SQL if there are chances that also other modules or component in same or other projects will need to get those results. an atomic operation done server side is also better because you just need to invoke the stored proc from any db management tool to get final values without further processing.
In some cases this does not apply but when it does it makes sense. also in general the db box has the best hardware and performances.
Whether to perform calculations at the front end or at the backend is very much decided if we can determine our goal in the business implementation. At time java code might perform better than a sql code both well written or it might be vice-versa. But still if confused you can try to determine first -
There are many other aspects which you can think before you decide where to place the code. One perception is totally wrong - Everything can be done best in Java (app code) and/or everything is best to be done by the db (sql code).
If you are writing on top of ORM or writing casual low-performance applications, use whatever pattern simplifies the application. If you are writing a high performance application and thinking carefully about scale, you will win by moving processing to data. I strongly advocate moving the processing to the data.
Let's think about this in two steps: (1) OLTP (small number of record) transactions. (2) OLAP (long scans of many records).
In the OLTP case, if you want to be fast (10k - 100k transactions per second), you must remove latch, lock and dead lock contention from the database. This means that you need to eliminate long stalls in transactions: round trips from client to DB to move processing to the client are one such long stall. You can't have long lived transactions (to make read/update atomic) and have very high throughput.
Re: horizontal scaling. Modern databases scale horizontally. Those systems implement HA and fault tolerance already. Leverage that and try to simplify your application space.
Let's look at OLAP -- in this case it should be obvious that dragging possibly terrabytes of data back to the application is a horrible idea. These systems are built specifically to operate extremely efficiently against compressed, pre-organized columnar data. Modern OLAP systems also scale horizontally and have sophisticated query planners that disperse work horizontally (internally moving processing to data).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With