
Returning Large Results Via a Webservice

I'm working on a web service at the moment, and the returned results could potentially be quite large (> 5 MB).

It's perfectly valid for this data set to be this large, and the web service can be called either synchronously or asynchronously, but I'm wondering what people's thoughts are on the following:

  1. If the connection is lost, the entire resultset will have to be regenerated and sent again. Is there any way I can do any sort of "resume" if the connection is lost or reset?

  2. Is sending a result set this large even appropriate? Would it be better to implement some sort of "paging" where the resultset is generated and stored on the server and the client can then download chunks of the resultset in smaller amounts and re-assemble the set at their end?

asked Aug 15 '08 by lomaxx

1 Answer

I have seen all three approaches: paged, store-and-retrieve, and massive push.

I think the solution to your problem depends to some extent on why your result set is so large and how it is generated. Do your results grow over time? Are they calculated all at once and then pushed? Do you want to stream them back as soon as you have them?

Paging Approach

In my experience, using a paging approach is appropriate when the client needs quick access to reasonably sized chunks of the result set similar to pages in search results. Considerations here are overall chattiness of your protocol, caching of the entire result set between client page requests, and/or the processing time it takes to generate a page of results.
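If it helps, here is a minimal sketch of what a paged interface might look like. All of the names (get_page, fetch_all, page_size) are made up for illustration, and the in-memory list is just a stand-in for however your results are really produced:

```python
# Hypothetical stand-in for the real result set; in practice this would
# come from your query/back end rather than an in-memory list.
RESULTS = [f"record-{i}" for i in range(100_000)]

def get_page(page: int, page_size: int = 500) -> dict:
    """Server side: return one page plus enough metadata for the client
    to know whether more pages remain."""
    start = page * page_size
    items = RESULTS[start:start + page_size]
    return {
        "page": page,
        "page_size": page_size,
        "total": len(RESULTS),
        "items": items,
        "has_more": start + page_size < len(RESULTS),
    }

def fetch_all(page_size: int = 500) -> list:
    """Client side: walk the pages and reassemble the full set."""
    items, page = [], 0
    while True:
        resp = get_page(page, page_size)
        items.extend(resp["items"])
        if not resp["has_more"]:
            return items
        page += 1
```

The has_more/total metadata is the important part; it lets the client stop cleanly and also pick up from a known page if a connection drops part-way through.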

Store and Retrieve

Store and retrieve is useful when the results are not random access and the result set grows in size as the query is processed. Issues to consider here are complexity for clients and whether you can provide the user with partial results or need to calculate all results before returning anything to the client (think sorting of results from distributed search engines).
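Again purely as a sketch, a store-and-retrieve service might expose something like the following. The names and the in-memory _JOBS dict are hypothetical; a real service would persist the stored results server-side (database, cache, temp files):

```python
import uuid

# Hypothetical in-memory store for generated result sets.
_JOBS: dict = {}

def start_query(params) -> str:
    """Kick off result generation and return a token the client can use
    to poll for progress and later retrieve the stored results."""
    job_id = str(uuid.uuid4())
    _JOBS[job_id] = {"status": "running", "rows": []}
    return job_id

def append_rows(job_id: str, rows: list) -> None:
    # Called by the query engine as results trickle in over time.
    _JOBS[job_id]["rows"].extend(rows)

def finish(job_id: str) -> None:
    _JOBS[job_id]["status"] = "done"

def poll(job_id: str) -> dict:
    """Client-facing: report status and how many rows are available."""
    job = _JOBS[job_id]
    return {"status": job["status"], "available": len(job["rows"])}

def retrieve(job_id: str, offset: int = 0, limit: int = 1000) -> list:
    """Client-facing: pull a slice of whatever has been stored so far."""
    return _JOBS[job_id]["rows"][offset:offset + limit]
```

Because the result set lives on the server, offset/limit retrieval also gives you a crude form of the "resume" the question asks about: a client that loses its connection simply asks again from the last offset it received.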

Massive Push

The massive push approach is almost certainly flawed. Even if the client needs all of the information and it needs to be pushed in a monolithic result set, I would recommend taking the approach of WS-ReliableMessaging (either directly or through your own simplified version) and chunking your results (a rough sketch follows the list below). By doing this you:

  1. ensure that the pieces reach the client
  2. can discard the chunk as soon as you get a receipt from the client
  3. can reduce the possible issues with memory consumption from having to retain 5MB of XML, DOM, or whatever in memory (assuming that you aren't processing the results in a streaming manner) on the server and client sides.
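To make the chunking idea concrete, here is a rough sketch of chunked delivery with per-chunk receipts, loosely in the spirit of WS-ReliableMessaging but greatly simplified. Everything here is hypothetical; send_chunk stands in for whatever transport you actually use and is assumed to return True once the client acknowledges the chunk:

```python
CHUNK_SIZE = 64 * 1024  # 64 KB per chunk; tune to whatever suits your payloads

def iter_chunks(payload: bytes, size: int = CHUNK_SIZE):
    """Yield (sequence_number, chunk) pairs covering the whole payload."""
    for seq, start in enumerate(range(0, len(payload), size)):
        yield seq, payload[start:start + size]

def push_with_receipts(payload: bytes, send_chunk, max_retries: int = 3) -> None:
    """send_chunk(seq, data) -> bool is assumed to deliver one chunk and
    return True once the client has acknowledged it."""
    total = (len(payload) + CHUNK_SIZE - 1) // CHUNK_SIZE
    for seq, data in iter_chunks(payload):
        for _attempt in range(max_retries):
            if send_chunk(seq, data):
                break  # receipt received; this chunk can now be discarded
        else:
            raise RuntimeError(f"chunk {seq} of {total} was never acknowledged")
```

Discarding acknowledged chunks is what keeps memory use bounded on the server, and the sequence numbers let the client ask for just the pieces it missed after a dropped connection rather than having the whole result set regenerated.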

As others have said, though, don't do anything until you know that your result set size, how it is generated, and overall performance are actual issues.

answered Sep 18 '22 by DavidValeri