Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sort a list with SQL or as a collection?

I have some entries with dates in my database. What is best?:

  • Fetch them with a sql statement and also apply order by.
  • Get the list with sql, and order them within the application with collection.sort or so?

Thanks

like image 793
membersound Avatar asked Oct 28 '12 10:10

membersound


People also ask

How do I sort a list in SQL?

The SQL ORDER BY Keyword The ORDER BY keyword is used to sort the result-set in ascending or descending order. The ORDER BY keyword sorts the records in ascending order by default. To sort the records in descending order, use the DESC keyword.

What is the difference between sort by and GROUP BY?

Group by statement is used to group the rows that have the same value. Whereas Order by statement sort the result-set either in ascending or in descending order. 2.

How do I sort by group in SQL?

The general syntax for selecting values and sorting them is: SELECT column_name, column_name FROM table_name ORDER BY column_name ASC, column_name DESC; Note that ORDER BY() will automatically sort the returned values in ascending order so the use of the ASC keyword is optional.

Does collections sort work on set?

Most (but not all) Set implementations do not have a concept of order, so Collections. sort does not support them as a whole. If you want a set with a concept of order, you can use something like a TreeSet : A NavigableSet implementation based on a TreeMap .


2 Answers

This a very broad question that is very difficult to answer, and it depends a lot on what you mean by best?

  • From a performance perspective, you will simply have to measure to determine what part of your system is the bottleneck. Databases are usually very efficient, but it could still be relevant to off-load that work to the client.

  • From a separation of concern perspective, it depends on how the sorting matters in the application and how the application is layered.

Ask your self: "where does the knowledge that the data is sorted belong?" and "What would happen if I where to change from a relational database storage to something different".

like image 156
Anders Lindahl Avatar answered Oct 05 '22 22:10

Anders Lindahl


To some extent, it depends on how many values are in the complete collection. If it is, say, 20-30 values then you can sort anywhere — even a relatively poor sorting algorithm can do that quickly (avoid Stooge Sort though; that's terrible) — as that is the sort of size of data chunk which you might expect to actually fetch in one service response.

But once you get into larger datasets you need to plan much more carefully. In particular, you want to avoid moving data around if you don't have to. If the data is currently only present in the database, you really don't want to fetch it all into the client just to sort it (a relatively expensive operation) and then throw virtually all of it away. It's far better to actually keep the data sorted in the database to start with, so that picking it up in order is trivial; in relational database terms, keeping the data sorted is functionally identical to maintaining an index on the data. Indeed, you can have multiple indices on the data, which can make even rather complex queries quick. (NoSQL DBs are more varied; some even don't support the concept of keeping data sorted.) The downside of maintaining indices is that they take up more space and they take time to maintain, particularly when the data is being created in the first place.

So… to return to your question, you probably want to try to not sort the data in the application: for most data, an appropriate index can be much more efficient as it lets your code not even look at unwanted data. But if you have to fetch it all into your application for some other reason and you can't bring it in pre-sorted, there's no reason to avoid sorting it yourself: Java's sorting algorithms are efficient and stable. But you should measure whether fetching it from the DB in the new order is faster. (The question is whether the DB overheads exceed the super-linear costs of re-sorting; lots of problems are in the domain where “maybe; hard to tell” is the answer.)

The other thing to balance is whether it is simpler for your code to not do sorting itself and instead always delegate that to the DB. Keeping your code simpler (and more bug-free) is a good goal to have…

like image 38
Donal Fellows Avatar answered Oct 05 '22 23:10

Donal Fellows