Best architecture for a 30+ hour query

I have an interesting problem to solve. One of my clients has me developing a stock analysis program with close to 50 years of stock data for almost a thousand symbols. I've developed a series of filters that are applied on any given day to see if anything falls out for a trade.

We want to run this filter for each day of data we have for each stock; basically, a report with a begin date and an end date. However, it takes 6 minutes to filter each week for each symbol. We figure it will take about 40 hours to run the report on our entire data set.

The overriding requirement is that my client is able to do anything in the application from any computer anywhere (he travels a lot), so we are browser based.

To solve this issue, I wrote an asynchronous method that runs this report; however, the application pool inactivity timer will kill the job. I don't want to start adjusting timeouts for the entire application to support this one report. (We are going to run a lot of these, as every stock scenario will need to be run against our entire dataset for analysis before it gets used for active trading.)

Does anyone have any general ideas or experiences with a web architecture that will support ultra-long asynchronous processes?

Thanks

asked Jul 07 '10 21:07

Mike Malter

3 Answers

As a general suggestion, I would recommend a standalone Windows Service, console app, or similar, with very careful lifetime controls and logging. It would run constantly, poll a database for jobs to process, and then update the database with results and progress information.

It may not be the best way, but I've used it many times before and it's reliable, scalable, and performs well.

It's best to keep web requests to a minute or two at most; they were never designed for heavy processing. With the work running in a service, you can instead 'check in' on the job status every minute or so (using a Web Service).
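The pattern above can be sketched roughly as follows. This is a minimal illustration in Python with SQLite, not the setup from the question: the `jobs` table, its columns, and every function name here are hypothetical, and it assumes a single worker process (with several workers, claiming a job would need to be made atomic).

```python
import sqlite3
import time

# Hypothetical schema: one row per report run, polled by the worker.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY,
    scenario TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued',  -- queued, running, done, failed
    units_done INTEGER NOT NULL DEFAULT 0,
    units_total INTEGER NOT NULL DEFAULT 0
)
"""

def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn

def enqueue(conn, scenario, units_total):
    """Called by the web request: record the job and return immediately."""
    cur = conn.execute(
        "INSERT INTO jobs (scenario, units_total) VALUES (?, ?)",
        (scenario, units_total),
    )
    conn.commit()
    return cur.lastrowid

def claim_next(conn):
    """Called by the worker: take the oldest queued job, if any."""
    row = conn.execute(
        "SELECT id, scenario, units_total FROM jobs "
        "WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row

def run_job(conn, job, process_unit):
    """Process one unit of work at a time, committing progress after each."""
    job_id, scenario, units_total = job
    try:
        for unit in range(units_total):
            process_unit(scenario, unit)
            conn.execute(
                "UPDATE jobs SET units_done = ? WHERE id = ?", (unit + 1, job_id)
            )
            conn.commit()
        conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    except Exception:
        conn.execute("UPDATE jobs SET status = 'failed' WHERE id = ?", (job_id,))
    conn.commit()

def job_status(conn, job_id):
    """Called by the short status request from the browser."""
    return conn.execute(
        "SELECT status, units_done, units_total FROM jobs WHERE id = ?",
        (job_id,),
    ).fetchone()

def worker_loop(conn, process_unit, poll_seconds=30):
    """Runs forever in the service process, outside the web server."""
    while True:
        job = claim_next(conn)
        if job is None:
            time.sleep(poll_seconds)
        else:
            run_job(conn, job, process_unit)
```

The web application only ever calls `enqueue` and `job_status`, both of which return quickly, so no web timeout needs to change. The long-running loop lives entirely in the separate process.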

If you have any questions about the idea, please post a comment and I will be happy to help or elaborate.

Hope that helps!

(Additional: I believe Windows Services are underused! All it takes is a quick base class or collection of reusable helper methods and you've got a logged, reliable, automatic, configurable, quick-to-implement process running under your control. Quick to prototype with too!)

answered Oct 06 '22 17:10

Kieren Johnstone

Is there any reason not to simply run a service in the background and archive individual result sets to a read-only results table as they are requested? Do you need to run the query in real time? The app could retrieve pages of results as they get generated by the service.
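As a rough sketch of that idea, the service could append hits to a results table while the web app reads them back one page at a time. The example uses Python with SQLite only for illustration; the `results` table, its columns, and the function names are assumptions, not anything from the question.

```python
import sqlite3

def open_results(path=":memory:"):
    conn = sqlite3.connect(path)
    # Hypothetical results table, written only by the background service.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " id INTEGER PRIMARY KEY,"
        " job_id INTEGER NOT NULL,"
        " symbol TEXT NOT NULL,"
        " trade_date TEXT NOT NULL)"
    )
    conn.commit()
    return conn

def append_results(conn, job_id, rows):
    """Service side: archive a batch of (symbol, trade_date) hits."""
    conn.executemany(
        "INSERT INTO results (job_id, symbol, trade_date) VALUES (?, ?, ?)",
        [(job_id, symbol, trade_date) for symbol, trade_date in rows],
    )
    conn.commit()

def fetch_page(conn, job_id, after_id=0, page_size=50):
    """Web side: return the next page plus the cursor to pass next time."""
    rows = conn.execute(
        "SELECT id, symbol, trade_date FROM results "
        "WHERE job_id = ? AND id > ? ORDER BY id LIMIT ?",
        (job_id, after_id, page_size),
    ).fetchall()
    # If nothing new has arrived yet, keep the old cursor.
    next_cursor = rows[-1][0] if rows else after_id
    return rows, next_cursor
```

Paging by the last seen `id` rather than by offset means that rows the service adds later simply show up on the next request, without shifting the pages already displayed.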

answered Oct 06 '22 18:10

Mike Burton

It sounds like you are running SQL queries directly against these data. Have you considered loading the data into e.g. SQL Server Analysis Services and setting up a cube with (for starters) time, stock and symbol dimensions? Depending on the nature of your queries, you may get quite reasonable response times. Relational databases are good for online transaction processing (within certain load and response time parameters), but analytical work sometimes requires the methods and technologies of data warehouses instead. (Or, perhaps, associative databases... there are alternatives.)

However, considering Murphy, you'll probably still have some long-running queries. Do the data vary for different end users? If not, why not precompute the answers? Nothing HTTP-based should take more than a minute to process, if that -- at least not by design!
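One way to picture precomputing, assuming the filter result for a given scenario, symbol and day is the same for every user: evaluate it once offline, store it under that key, and let the web request do nothing but a lookup. The sketch below uses Python with SQLite for illustration; the `precomputed` table, the `run_filter` callback, and the other names are hypothetical.

```python
import sqlite3

def open_cache(path=":memory:"):
    conn = sqlite3.connect(path)
    # Hypothetical table: one stored answer per (scenario, symbol, day).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS precomputed ("
        " scenario TEXT NOT NULL,"
        " symbol TEXT NOT NULL,"
        " trade_date TEXT NOT NULL,"  # ISO dates, so string order is date order
        " passed INTEGER NOT NULL,"
        " PRIMARY KEY (scenario, symbol, trade_date))"
    )
    conn.commit()
    return conn

def precompute(conn, scenario, keys, run_filter):
    """Offline: evaluate the filter only for keys that are not stored yet."""
    computed = 0
    for symbol, trade_date in keys:
        exists = conn.execute(
            "SELECT 1 FROM precomputed "
            "WHERE scenario = ? AND symbol = ? AND trade_date = ?",
            (scenario, symbol, trade_date),
        ).fetchone()
        if exists:
            continue  # already answered by an earlier run
        passed = bool(run_filter(symbol, trade_date))
        conn.execute(
            "INSERT INTO precomputed VALUES (?, ?, ?, ?)",
            (scenario, symbol, trade_date, int(passed)),
        )
        conn.commit()
        computed += 1
    return computed

def lookup(conn, scenario, start, end):
    """Request time: a plain keyed read, with no filter evaluation."""
    return conn.execute(
        "SELECT symbol, trade_date FROM precomputed "
        "WHERE scenario = ? AND passed = 1 AND trade_date BETWEEN ? AND ? "
        "ORDER BY trade_date, symbol",
        (scenario, start, end),
    ).fetchall()
```

Because stored keys are skipped, an interrupted run can be restarted without redoing finished work, and adding a new day of data only costs the filter time for that day.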

answered Oct 06 '22 18:10

Pontus Gagge
