Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to handle huge result sets from database

I'm designing a multi-tiered database driven web application – SQL relational database, Java for the middle service tier, web for the UI. The language doesn't really matter.

The middle service tier performs the actual querying of the database. The UI simply asks for certain data and has no concept that it's backed by a database.

The question is how to handle large data sets? The UI asks for data but the results might be huge, possibly too big to fit in memory. For example, a street sign application might have a service layer of:

StreetSign getStreetSign(int identifier)
Collection<StreetSign> getStreetSigns(Street street)
Collection<StreetSign> getStreetSigns(LatLonBox box)

The UI layer asks to get all street signs meeting some criteria. Depending on the criteria, the result set might be huge. The UI layer might divide the results into separate pages (for a browser) or just present them all (serving up to Goolge Earth). The potentially huge result set could be a performance and resource problem (out of memory).

One solution is not to return fully loaded objects (StreetSign objects). Rather return some sort of result set or iterator that lazily loads each individual object.

Another solution is to change the service API to return a subset of the requested data:

Collection<StreetSign> getStreetSigns(LatLonBox box, int pageNumber, int resultsPerPage)

Of course the UI can still request a huge result set:

getStreetSigns(box, 1, 1000000000)

I'm curious what is the standard industry design pattern for this scenario?

like image 796
Steve Kuo Avatar asked Oct 23 '08 22:10

Steve Kuo


People also ask

What can be used to return multiple result sets?

In order to get multiple result sets working we need to drop to the ObjectContext API by using the IObjectContextAdapter interface. Once we have an ObjectContext then we can use the Translate method to translate the results of our stored procedure into entities that can be tracked and used in EF as normal.

Which method is used to retrieve the result set?

executeQuery method to retrieve a result set from a stored procedure call, if that stored procedure returns only one result set. If the stored procedure returns multiple result sets, you need to use the Statement. execute method.

How many result sets available with the JDBC?

There are two types of result sets namely, forward only and, bidirectional.

How do you find the result set size?

Simply iterate on ResultSet object and increment rowCount to obtain size of ResultSet object in java. rowCount = rs. getRow();


1 Answers

The very first question should be:

¿The user needs to, or is capable of, manage this amount of data?

Although the result set should be paged, if its potentially size is so huge, the answer will be "probably not", so the UI shouldn't try to show it.

I worked on J2EE projects on Health Care Systems, that deal with enormous amount of stored data, literally millions of patients, visits, forms, etc, and the general rule is not to show more than 100 or 200 rows for any user search, advising the user that those set of criteria produces more information that he can understand.

The way to implement this varies from one project to another, it is possible to force the UI to ask the service tier the size of a query before launching it, or it is possible to throw an Exception from the service tier if the result set grows too much (however this way couples the service tier with the limited implementation of an UI).

Be careful! This not means that every method on the service tier must throw an Exception if its result sizes more than 100, this general rule only applies to result sets that are shown to the user directly, that is a better reason to place the control in the UI instead on the service tier.

like image 138
RogueOne Avatar answered Sep 20 '22 01:09

RogueOne