
Optimizing queries for the next and previous element

I am looking for the best way to retrieve the next and previous records of a record without running a full query. I have a fully implemented solution in place, and would like to know whether there are better approaches.

Let's say we are building a web site for a fictitious greengrocer. In addition to his HTML pages, every week, he wants to publish a list of special offers on his site. He wants those offers to reside in an actual database table, and users have to be able to sort the offers in three ways.

Every item also has to have a detail page with more, textual information on the offer and "previous" and "next" buttons. The "previous" and "next" buttons need to point to the neighboring entries depending on the sorting the user had chosen for the list.

[Image: the list of offers shown in three different sort orders] (source: pekkagaiser.com)

Obviously, the "next" button for "Tomatoes, Class I" has to point to "Apples, Class I" in the first example, "Pears, Class I" in the second, and nothing in the third.

The task in the detail view is to determine the next and previous items without running a query every time, with the sort order of the list as the only available information (let's say we get that through a GET parameter ?sort=offeroftheweek_price, and ignore the security implications).

Obviously, simply passing the IDs of the next and previous elements as a parameter is the first solution that comes to mind. After all, we already know the IDs at this point. But this is not an option here: it would work in this simplified example, but not in many of my real-world use cases.

My current approach in my CMS is using something I have named "sorting cache". When a list is loaded, I store the item positions in records in a table named sortingcache.

name (VARCHAR)             items (TEXT)
offeroftheweek_unsorted    Lettuce; Tomatoes; Apples I; Apples II; Pears
offeroftheweek_price       Tomatoes; Pears; Apples I; Apples II; Lettuce
offeroftheweek_class_asc   Apples II; Lettuce; Apples; Pears; Tomatoes

In reality, the items column is populated with numeric IDs.

In the detail page, I now access the appropriate sortingcache record, fetch the items column, explode it, search for the current item ID, and return the previous and next neighbour.

array("current"  => "Tomatoes",
      "next"     => "Pears",
      "previous" => null
     );
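For illustration, the lookup described above could be sketched as follows. This is a minimal sketch in Python rather than the CMS's actual code; the function name `neighbours`, the semicolon-separated ID format, and the sample IDs are assumptions made for the example.

```python
from typing import Dict, Optional


def neighbours(cached_items: str, current_id: int) -> Dict[str, Optional[int]]:
    """Return the previous and next IDs around current_id in a cached ordering.

    cached_items is assumed to be the raw `items` value of one sortingcache
    record: numeric IDs separated by semicolons, e.g. "2;5;3;4;1".
    """
    # "Explode" the stored string into a list of integer IDs.
    ids = [int(part) for part in cached_items.split(";") if part.strip()]
    pos = ids.index(current_id)  # raises ValueError if the ID is not cached
    return {
        "current": current_id,
        "previous": ids[pos - 1] if pos > 0 else None,
        "next": ids[pos + 1] if pos + 1 < len(ids) else None,
    }


# Hypothetical IDs: 1=Lettuce, 2=Tomatoes, 3=Apples I, 4=Apples II, 5=Pears
offeroftheweek_price = "2;5;3;4;1"
result = neighbours(offeroftheweek_price, 2)
```

The linear search over the exploded list is what makes this approach scale poorly for long lists.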

This is obviously expensive, works for a limited number of records only and creates redundant data, but let's assume that in the real world, the query to create the lists is very expensive (it is), running it in every detail view is out of the question, and some caching is needed.

My questions:

  • Do you think this is a good practice to find out the neighbouring records for varying query orders?

  • Do you know better practices in terms of performance and simplicity? Do you know something that makes this completely obsolete?

  • In programming theory, is there a name for this problem?

  • Is the name "Sorting cache" appropriate and understandable for this technique?

  • Are there any recognized, common patterns to solve this problem? What are they called?

Note: My question is not about building the list, or how to display the detail view. Those are just examples. My question is about the basic functionality of determining the neighbors of a record when a re-query is impossible, and the fastest and cheapest way to get there.

If something is unclear, please leave a comment and I will clarify.

Starting a bounty - maybe there is some more info on this out there.

asked Feb 22 '10 11:02 by Pekka


2 Answers

Here is an idea. You could offload the expensive operations to the moment the grocer inserts or updates offers, rather than the moment the end user selects the data to view. This may seem like a non-dynamic way to handle the sort data, but it may increase speed. And, as we know, there is always a trade-off between performance and other coding factors.

Create a table to hold next and previous for each offer and each sort option. (Alternatively, you could store this in the offer table if you will always have three sort options; query speed is a good reason to denormalize your database.)

So you would have these columns:

  • Sort Type (Unsorted, Price, Class and Price Desc)
  • Offer ID
  • Prev ID
  • Next ID

When the detail information for the offer detail page is queried from the database, the NextID and PrevID would be part of the results. So you would only need one query for each detail page.

Each time an offer is inserted, updated or deleted, you would need to run a process which validates the integrity/accuracy of the sorttype table.
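A rough sketch of this idea, using Python's built-in sqlite3 module so it can be run as-is. The table and column names (`offers`, `offer_neighbours`), the two sort options, and the sample rows are assumptions for illustration, not part of the original setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offers (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE offer_neighbours (
        sort_type TEXT,
        offer_id  INTEGER,
        prev_id   INTEGER,
        next_id   INTEGER,
        PRIMARY KEY (sort_type, offer_id)
    );
""")
conn.executemany(
    "INSERT INTO offers VALUES (?, ?, ?)",
    [(1, "Lettuce", 0.89), (2, "Tomatoes", 1.50), (3, "Apples I", 1.10),
     (4, "Apples II", 0.95), (5, "Pears", 1.25)],
)

# Allow-list of sort options; the ORDER BY text never comes from user input.
SORT_ORDERS = {"unsorted": "id ASC", "price": "price DESC"}


def rebuild_neighbours(conn, sort_type):
    """Recompute the prev/next links for one sort type (run after each write)."""
    ids = [row[0] for row in conn.execute(
        f"SELECT id FROM offers ORDER BY {SORT_ORDERS[sort_type]}")]
    rows = [
        (sort_type, cur,
         ids[i - 1] if i > 0 else None,
         ids[i + 1] if i + 1 < len(ids) else None)
        for i, cur in enumerate(ids)
    ]
    with conn:  # commit the delete and the inserts together
        conn.execute("DELETE FROM offer_neighbours WHERE sort_type = ?",
                     (sort_type,))
        conn.executemany("INSERT INTO offer_neighbours VALUES (?, ?, ?, ?)",
                         rows)


def detail_with_neighbours(conn, offer_id, sort_type):
    """Fetch the offer together with its neighbours in a single query."""
    return conn.execute(
        """SELECT o.name, n.prev_id, n.next_id
           FROM offers o
           JOIN offer_neighbours n ON n.offer_id = o.id
           WHERE o.id = ? AND n.sort_type = ?""",
        (offer_id, sort_type),
    ).fetchone()


for sort_type in SORT_ORDERS:
    rebuild_neighbours(conn, sort_type)
```

Here the whole link table for a sort type is rebuilt on every change, which is the simplest way to keep it consistent; patching only the affected neighbours would be cheaper but is easier to get wrong.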

answered Oct 07 '22 02:10 by Jessica


I have an idea somewhat similar to Jessica's. However, instead of storing links to the next and previous sort items, you store the sort order for each sort type. To find the next or previous record, just get the row with SortX = currentSort + 1 or SortX = currentSort - 1.

Example:

Type     Class Price Sort1 Sort2 Sort3
Lettuce  2     0.89  0     4     0
Tomatoes 1     1.50  1     0     4
Apples   1     1.10  2     2     2
Apples   2     0.95  3     3     1
Pears    1     1.25  4     1     3

This solution would yield very short query times, and would take up less disk space than Jessica's idea. However, as I'm sure you realize, the cost of updating one row of data is notably higher, since you have to recalculate and store all sort orders. But still, depending on your situation, if data updates are rare and especially if they always happen in bulk, then this solution might be the best.
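As a sketch of the lookup side, assuming the example table is stored with an added numeric `id` column and lower-case column names (both assumptions for this example), the neighbours can be fetched by rank like this, again with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE offers (
    id INTEGER PRIMARY KEY, type TEXT, class INTEGER, price REAL,
    sort1 INTEGER, sort2 INTEGER, sort3 INTEGER)""")
conn.executemany(
    "INSERT INTO offers VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, "Lettuce", 2, 0.89, 0, 4, 0),
     (2, "Tomatoes", 1, 1.50, 1, 0, 4),
     (3, "Apples", 1, 1.10, 2, 2, 2),
     (4, "Apples", 2, 0.95, 3, 3, 1),
     (5, "Pears", 1, 1.25, 4, 1, 3)],
)

# Allow-list of rank columns, so a column name is never taken from user input.
SORT_COLUMNS = {"sort1", "sort2", "sort3"}


def neighbours_by_rank(conn, offer_id, sort_column):
    """Return (previous_id, next_id) for an offer under one sort column."""
    if sort_column not in SORT_COLUMNS:
        raise ValueError(f"unknown sort column: {sort_column}")
    row = conn.execute(
        f"SELECT {sort_column} FROM offers WHERE id = ?", (offer_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no offer with id {offer_id}")
    rank = row[0]
    # Map rank -> id for the (at most two) adjacent rows.
    found = dict(conn.execute(
        f"SELECT {sort_column}, id FROM offers WHERE {sort_column} IN (?, ?)",
        (rank - 1, rank + 1),
    ))
    return found.get(rank - 1), found.get(rank + 1)
```

With an index on each rank column, both queries are simple key lookups regardless of how expensive the original list query was.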

i.e.

once_per_day
  add/delete/update all records
  recalculate sort orders
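The bulk recalculation step might look roughly like the following. It is only a sketch under assumed names: the `offers` table, its `sort1`/`sort2` rank columns, and the two orderings are hypothetical, and a real job would cover every sort option the site offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE offers (
    id INTEGER PRIMARY KEY, type TEXT, class INTEGER, price REAL,
    sort1 INTEGER, sort2 INTEGER)""")
conn.executemany(
    "INSERT INTO offers (id, type, class, price) VALUES (?, ?, ?, ?)",
    [(1, "Lettuce", 2, 0.89), (2, "Tomatoes", 1, 1.50), (3, "Apples", 1, 1.10),
     (4, "Apples", 2, 0.95), (5, "Pears", 1, 1.25)],
)

# Rank column -> ORDER BY clause (fixed strings, never user input).
ORDERINGS = {"sort1": "id ASC", "sort2": "price DESC"}


def recalculate_sort_orders(conn):
    """Rewrite every rank column in one transaction after a bulk update."""
    with conn:
        for column, order_by in ORDERINGS.items():
            ids = [row[0] for row in conn.execute(
                f"SELECT id FROM offers ORDER BY {order_by}")]
            # Rank 0 goes to the first row of the ordering, 1 to the second, ...
            conn.executemany(
                f"UPDATE offers SET {column} = ? WHERE id = ?",
                list(enumerate(ids)),
            )


def ranks_of(conn, offer_id):
    """Return (sort1, sort2) for one offer."""
    return conn.execute(
        "SELECT sort1, sort2 FROM offers WHERE id = ?", (offer_id,)
    ).fetchone()


recalculate_sort_orders(conn)
```

Running the recalculation once after all of the day's changes, rather than after each row, is what keeps the update cost acceptable.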

Hope this is useful.

answered Oct 07 '22 01:10 by Adukra