We have an analytics chart which first queries the database log table to pull all relevant from an optimized query only selecting what is needed along with grabbing relevant start and end ID's since this table has millions of records. Once that initial query is pulled, we have used ColdFusion's query of queries to work with that data to display different charts.
You can see the actual external database call in this example grabs 2,240 records in 31ms:
qryGetLogs (Datasource=ourDSN, Time=31ms, Records=2204)
We have a chart that displays views per hour for each day of the week and then builds a jQuery chart to display them. From the initial design, those queries execution times were almost negligible, most often at 0ms. Because we loop over each hour in the day (24) for 7 days a week, that is 168 queries - one of the main reasons to not make the external DB call so many times.
It seems now that many, but not all, of those query of queries are taking over 100 times longer to run than the initial database call. Most of them are using the BETWEEN date range function to select the records for each day and hour part:
qryViewsPerHour (Datasource=, Time=4312ms, Records=5)
SELECT createdOn, DayOfWeek
FROM qryGetLogs
WHERE (CreatedOn BETWEEN '2012-09-03 0:00:00' AND '2012-09-03 0:59:59')
AND (DayOfWeek = 2)
You can see that query of another query took 4,312ms and was searching through a query with 2,240 records. Here are the query times for many of the next queries:
qryViewsPerHour (Datasource=, Time=4610ms, Records=5)
qryViewsPerHour (Datasource=, Time=4187ms, Records=8)
qryViewsPerHour (Datasource=, Time=5062ms, Records=6)
qryViewsPerHour (Datasource=, Time=3985ms, Records=0)
qryViewsPerHour (Datasource=, Time=4828ms, Records=2)
qryViewsPerHour (Datasource=, Time=5750ms, Records=0)
qryViewsPerHour (Datasource=, Time=3016ms, Records=4)
qryViewsPerHour (Datasource=, Time=3625ms, Records=6)
qryViewsPerHour (Datasource=, Time=6265ms, Records=11)
So you can see just by those queries about, it added 40 seconds of loading time! But notice the next query is just 78ms with more records than any of the previous query, and many after that, the times are better:
qryViewsPerHour (Datasource=, Time=78ms, Records=18)
qryViewsPerHour (Datasource=, Time=62ms, Records=7)
qryViewsPerHour (Datasource=, Time=63ms, Records=12)
qryViewsPerHour (Datasource=, Time=78ms, Records=34)
qryViewsPerHour (Datasource=, Time=78ms, Records=9)
Those go on for a while with good times, then BAM! Back to the 2-6 second queries.
qryViewsPerHour (Datasource=, Time=4891ms, Records=13)
qryViewsPerHour (Datasource=, Time=1984ms, Records=8)
qryViewsPerHour (Datasource=, Time=4875ms, Records=4)
qryViewsPerHour (Datasource=, Time=6203ms, Records=0)
All in all, what was taking several seconds to load, is taking between 100-400 seconds loads, and that is just the weekly report! We also use it for monthly reports.
I have monitored the server and made sure that I am the only person or process running the request, so it should not be that the CPU's resources are being eaten up by something else, and I also monitor the CPU request and it's solid and steady being used by JRUN.exe.
Does anyone have any advice on this problem? It is driving me nuts!
Thanks for any help.
Typically, query of queries are wonderfully fast. However, there are times it may take longer when volumes increase. Since you are well within the recommended 5,000 to 50,000 row recommendation by ColdFusion documentation, it can only be one thing. Some where some data is being translated to get the results. In this case, it's the date time that you've specified. The time value has one leading zero. I know it sounds strange.
CHANGE: WHERE (CreatedOn BETWEEN '2012-09-03 0:00:00' AND '2012-09-03 0:59:59')
TO: WHERE (CreatedOn BETWEEN '2012-09-03 00:00:00' AND '2012-09-03 00:59:59')
I tested this and found the results were better.
Although Coldfusion has the functionaility of performing a query of queries in my experience it is a last resort. Reason being how inefficient and limited it is. CF is a programming language, not a SQL language and as such its doesn't not have any of the optimization features a real DB would have.
I would strongly reccomend pushing the queries onto the database instead and if you have a DB that supports views I would also recommend creating a view based on your original query so you only need to perform the subqueries on the view not the whole original tables.
Avoid Query of queries where possible, especially when working with large datasets because it has to all be stored in memory as hefty Java objects. and as Russ pointed out, is subject to the garbage collectors influence and as your reports grow could lead to out of memory problems. (I once knocked over my PC with an oversized QoQ)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With