Storing search result for paging and sorting

Tags: c#, search, asp.net, search-engine

I've been implementing MS Search Server 2010 and so far it's really good. I'm doing the search queries via its web service, but because the results are inconsistent, I'm thinking about caching the results instead.

The site is a small intranet (500 employees), so it shouldn't be a problem, but I'm curious what approach you would take if it were a bigger site.

I've googled a bit, but haven't really come across anything specific. So, a few questions:

What other approaches are there? And why are they better?
How much does it cost to store a dataview of 400-500 rows? What sizes are feasible?
What other points should be taken into consideration?

Any input is welcome :)


asked Feb 15 '10 18:02

Mattias

1 Answer

You need to employ many techniques to pull this off successfully.

First, you need some sort of persistence layer. If you are using a plain old website, the user's session is the most logical layer to use. If you are using web services (meaning session-less) and just making calls through a client, you still need some sort of application layer (a kind of shared session) for your services. Why? This layer will be home to your database result cache.

Second, you need a way of caching your results in whatever container you are using (the session, or the application layer for web services). You can do this a couple of ways. If the query is something any user can run, a simple hash of the query works as the cache key, and the stored result can be shared among users. You probably still want a GUID for each result so that you can pass it around in your client application, but a hash lookup from queries to results is useful. If the queries are unique to a user, you can just use the GUID as the key and pass it along to the client application. The client then sends that GUID back with later requests (paging, sorting) so the server can find the cached result.
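To make the hash-plus-GUID idea concrete, here is a minimal sketch. The class name `QueryResultCache`, its members, and the use of `string` rows are all assumptions for illustration; nothing here is part of Search Server's API, and a real cache would also need the size limit described next.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical cache: shared results are found by a hash of the query text,
// and every stored result also gets a GUID that the client can pass back.
public sealed class QueryResultCache
{
    private readonly ConcurrentDictionary<string, Guid> _idsByQueryHash =
        new ConcurrentDictionary<string, Guid>();
    private readonly ConcurrentDictionary<Guid, IReadOnlyList<string>> _resultsById =
        new ConcurrentDictionary<Guid, IReadOnlyList<string>>();

    // Normalise the query so "Foo " and "foo" share one entry, then hash it.
    public static string HashQuery(string query)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(query.Trim().ToLowerInvariant()));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Returns the GUID for this query; the search only runs on a cache miss.
    // Note: under concurrent misses the factory may run more than once.
    public Guid GetOrAdd(string query, Func<string, IReadOnlyList<string>> runSearch)
    {
        return _idsByQueryHash.GetOrAdd(HashQuery(query), _ =>
        {
            Guid id = Guid.NewGuid();
            _resultsById[id] = runSearch(query);
            return id;
        });
    }

    // Looks up a cached result by the GUID previously handed to the client.
    public bool TryGetResults(Guid id, out IReadOnlyList<string> results)
    {
        return _resultsById.TryGetValue(id, out results);
    }
}
```

The client only ever sees the GUID, so the same lookup works whether the entry is shared between users or private to one.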

The caching mechanism can use some sort of fixed-length buffer or queue, so that old results are automatically removed as new ones are added. Then, if a query comes in that is a cache miss, it is executed normally and its result is added to the cache.
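A fixed-length cache like this could be sketched as follows. `BoundedCache` is a made-up name, and first-in-first-out eviction is just one simple policy I'm assuming here; evicting by last access or by age would fit the same shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-capacity cache: when full, the oldest entry is dropped.
public sealed class BoundedCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly Queue<TKey> _insertionOrder = new Queue<TKey>();
    private readonly object _gate = new object();

    public BoundedCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count { get { lock (_gate) { return _items.Count; } } }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (_gate) { return _items.TryGetValue(key, out value); }
    }

    public void Add(TKey key, TValue value)
    {
        lock (_gate)
        {
            // Updating an existing key keeps its original position in the queue.
            if (_items.ContainsKey(key)) { _items[key] = value; return; }

            // At capacity: remove the entry that was inserted first.
            if (_items.Count == _capacity)
            {
                _items.Remove(_insertionOrder.Dequeue());
            }
            _items.Add(key, value);
            _insertionOrder.Enqueue(key);
        }
    }

    // Cache miss path: run the loader, store the result, return it.
    public TValue GetOrAdd(TKey key, Func<TKey, TValue> load)
    {
        TValue value;
        if (TryGet(key, out value)) return value;
        value = load(key);
        Add(key, value);
        return value;
    }
}
```

With a capacity of 2, adding "a", "b" and then "c" leaves only "b" and "c" in the cache.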

Third, you are going to want some way to page through your result object. The Iterator pattern works well here, though something simpler might do, such as "fetch X results starting at position Y". The Iterator pattern is the better choice, though, because you could later remove your caching mechanism and page directly from the database if you wanted to.
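Both styles of paging can sit behind one small interface. This is only a sketch: `IPagedSource`, `CachedResultSource` and `Paging.Pages` are names I've invented, and the cached list stands in for whatever result object you actually store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction: "fetch `count` rows starting at `offset`".
public interface IPagedSource<T>
{
    int TotalCount { get; }
    IReadOnlyList<T> GetPage(int offset, int count);
}

// Pages over a result list that is already held in the cache.
public sealed class CachedResultSource<T> : IPagedSource<T>
{
    private readonly IReadOnlyList<T> _rows;

    public CachedResultSource(IReadOnlyList<T> rows)
    {
        _rows = rows ?? throw new ArgumentNullException(nameof(rows));
    }

    public int TotalCount => _rows.Count;

    public IReadOnlyList<T> GetPage(int offset, int count)
    {
        if (offset < 0) throw new ArgumentOutOfRangeException(nameof(offset));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        // Skip/Take simply return fewer rows when the range runs past the end.
        return _rows.Skip(offset).Take(count).ToList();
    }
}

public static class Paging
{
    // Iterator-style walk: yields one page at a time, on demand.
    public static IEnumerable<IReadOnlyList<T>> Pages<T>(IPagedSource<T> source, int pageSize)
    {
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        for (int offset = 0; offset < source.TotalCount; offset += pageSize)
        {
            yield return source.GetPage(offset, pageSize);
        }
    }
}
```

Because callers depend only on `IPagedSource<T>`, a second implementation that pages directly against the database could replace the cached one without touching them.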

Fourth, you need some sort of pre-fetch mechanism (as others suggested). Launch a thread that does the full search, and in your main thread just do a quick search for the top X items. Hopefully, by the time the user tries paging, the second thread will have finished and the full result will be in the cache. If the result isn't ready yet, you can show a simple loading screen.
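Here is a rough sketch of that pre-fetch flow, using `Task.Run` in place of a hand-managed thread. `PrefetchingSearch` and the two delegates are hypothetical stand-ins for your real quick and full search calls.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical pre-fetcher: answers with the first page immediately while
// the full search runs in the background and is kept for later paging.
public sealed class PrefetchingSearch
{
    private readonly Func<string, int, IReadOnlyList<string>> _searchTop;
    private readonly Func<string, IReadOnlyList<string>> _searchAll;
    private readonly ConcurrentDictionary<string, Task<IReadOnlyList<string>>> _fullResults =
        new ConcurrentDictionary<string, Task<IReadOnlyList<string>>>();

    public PrefetchingSearch(
        Func<string, int, IReadOnlyList<string>> searchTop,
        Func<string, IReadOnlyList<string>> searchAll)
    {
        _searchTop = searchTop ?? throw new ArgumentNullException(nameof(searchTop));
        _searchAll = searchAll ?? throw new ArgumentNullException(nameof(searchAll));
    }

    public IReadOnlyList<string> Search(string query, int firstPageSize)
    {
        // Kick off the full search unless one is already cached for this query.
        // Under a race the factory can run twice; only one task is kept.
        _fullResults.GetOrAdd(query, q => Task.Run(() => _searchAll(q)));

        // Meanwhile, return just the top rows on the calling thread.
        return _searchTop(query, firstPageSize);
    }

    // True once the background search has finished successfully.
    // A false return is the cue to show the "loading" state.
    public bool TryGetFullResult(string query, out IReadOnlyList<string> results)
    {
        Task<IReadOnlyList<string>> task;
        if (_fullResults.TryGetValue(query, out task) && task.Status == TaskStatus.RanToCompletion)
        {
            results = task.Result;
            return true;
        }
        results = null;
        return false;
    }

    // Lets a caller wait for the full result; throws if Search was never called.
    public Task<IReadOnlyList<string>> WhenFullResultReady(string query)
    {
        return _fullResults[query];
    }
}
```

When the user requests page two, the server calls `TryGetFullResult`. If it returns false, respond with the loading state and let the client retry.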

This should get you some of the way. Let me know if you want clarification or more details about any particular part.

I'll leave you with some more tips:

1. You don't want to send the entire result to the client app (if you are using Ajax or something like an iPhone app). Why? Because it is a huge waste. The user is unlikely to page through all of the results, so you may have just sent over 2MB of result fields for nothing.
2. JavaScript is an awesome language, but remember that it is still a client-side scripting language. You don't want to slow the user experience down by sending massive amounts of data for your Ajax client to handle. Just send the prefetched result to your client, and send additional pages as the user pages.
3. Abstraction, abstraction, abstraction. You want to abstract away the cache, the querying, the paging and the prefetching, as much of it as you can. Why? Let's say you want to switch databases, or you want to page directly from the database instead of using a result object in the cache. If you do it right, this is much easier to change later on. Also, if you are using web services, many other applications can make use of this logic later on.
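As a small illustration of the abstraction tip, callers can depend on an interface while caching is added by a wrapper. `ISearchService` and `CachingSearchService` are hypothetical names, and the unbounded dictionary is only there to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contract the rest of the application codes against.
public interface ISearchService
{
    IReadOnlyList<string> Search(string query, int offset, int count);
}

// One implementation: runs the full search once per query, then pages from
// memory. A different ISearchService could page straight from the database.
public sealed class CachingSearchService : ISearchService
{
    private readonly Func<string, IReadOnlyList<string>> _runFullSearch;
    private readonly Dictionary<string, IReadOnlyList<string>> _cache =
        new Dictionary<string, IReadOnlyList<string>>();
    private readonly object _gate = new object();

    public CachingSearchService(Func<string, IReadOnlyList<string>> runFullSearch)
    {
        _runFullSearch = runFullSearch ?? throw new ArgumentNullException(nameof(runFullSearch));
    }

    public IReadOnlyList<string> Search(string query, int offset, int count)
    {
        IReadOnlyList<string> all;
        lock (_gate)
        {
            // Cache miss: execute the query normally and remember the result.
            if (!_cache.TryGetValue(query, out all))
            {
                all = _runFullSearch(query);
                _cache[query] = all;
            }
        }
        // Only the requested page leaves the server.
        return all.Skip(offset).Take(count).ToList();
    }
}
```

Swapping the cache out later means registering another `ISearchService` implementation; the paging code in the client-facing layer stays as it is.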

Now, I probably suggested an over-engineered solution for what you need :). But if you can pull this off using all the right techniques, you will learn a ton and have a very good base in case you want to extend the functionality or reuse this code.

Let me know if you have questions.

answered Sep 21 '22 18:09

Polaris878
