Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

3-tiers pattern and large amounts of data

Here is my situation: I am trying to follow as hard as I can the 3-tier pattern (i.e. Presentation, Business and Data layer). When I need data from the DB, the Business layer calls the Data layer which returns the information. The Data layer never return a SqlDataReader or DataTable object, but often an enumeration of custom object known by the Data Access Layer. It works pretty well when the Data layer has to return a list with few objects.

I am now facing this problem, my application (the business layer) must process 500000 records. I could simply add another method to my Data layer and return an IEnumerable but this sound very bad to me. I don't want to load half-million records in memory.

My question is, considering the 3-tier model, how should I handle this case? If I had no 3-tiers pattern, I would simply use SqlDataReader in my business classes. Any suggestions?

UPDATE: The data will not be displayed, so this is not a paging issue (the presentation layer is not involved at all here). I simply have to analyze each record and then keep some of them.

Thanks

like image 334
Martin Avatar asked May 25 '09 15:05

Martin


5 Answers

I assume you're not displaying 500,000 records to the front end at once? You're probably doing some pagination, right? So, only return one page worth of data from the database at one time.

like image 73
Matt Grande Avatar answered Oct 21 '22 15:10

Matt Grande


Yes, your instinct is correct.

I'm betting that your UI client does not want to look at half a million records at once. Google doesn't return every hit in a single page; you won't, either.

You have a choice as to where and when your application processes those half a million records. You can chunk them into smaller units of work; you can process them asynchronously; you can write a stored procedure and process them in the database without bringing them all over to the middle tier.

The MVC pattern is wonderful, but it's not holy writ. Make the choices that work for your app.

like image 23
duffymo Avatar answered Oct 21 '22 15:10

duffymo


A piece of paper can never trump reality. If your specific problem asks to break the 3-tier paradigm, do it.

like image 31
Aaron Digulla Avatar answered Oct 21 '22 15:10

Aaron Digulla


In some cases, you have to break the 3-tier boundaries. But before you do, you could ask yourself:

  1. When you "analyze each record and then keep some of them," is that really part of the business logic? Or is it a data-access function? It might be the case that this belongs in the data access layer.

  2. If it is part of the business logic, do you need all 500000 records in order to make a decision about whether to "keep" any individual record? It might be that the business layer should be processing one record at a time. Making 500000 consecutive database calls is not pretty, but if that is what the app should be doing from a conceptual standpoint, there are ways to mitigate that.

I don't recommend doing anything dumb just to keep the 3 tiers separate. But sometimes, when you think you have to cross the line, it's because there is something in the design that needs a second look.

--
bmb

like image 1
bmb Avatar answered Oct 21 '22 16:10

bmb


You can build a abstraction on top of the SqlReader class. That way you don't have to pass the SqlReader directly, but you can still process the objects one at a time.

Think Iterators.

like image 1
Peter Stuifzand Avatar answered Oct 21 '22 15:10

Peter Stuifzand