Query Cost vs. Execution Speed + Parallelism

My department was recently reprimanded (nicely) by our IT department for running queries with very high costs, on the premise that our queries have a real possibility of destabilizing and/or crashing the database. None of us are DBAs; we're just researchers who write and execute queries against the database, and I'm probably the only one who had ever looked at an explain plan before the reprimand.

We were told that query costs over 100 should be very rare, and queries with costs over 1000 should never be run. The problems I am running into are that cost seems to have no correlation with execution time, and I'm losing productivity while trying to optimize my queries.
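For context, all the costs I quote below come from ordinary explain plans. A typical way to see them looks like this (the table and column names here are placeholders, not our actual schema):

    -- Hypothetical query; only the mechanics of reading the cost matter.
    EXPLAIN PLAN FOR
    SELECT subject_id, COUNT(*)
      FROM study_results
     GROUP BY subject_id;

    -- DBMS_XPLAN.DISPLAY prints the plan, including the COST column.
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);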

As an example, I have a query that executes in under 5 seconds with a cost of 10844. I rewrote the query to use a view that contains most of the information I need, and got the cost down to 109, but the new query, which retrieves the same results, takes 40 seconds to run. I found a question here with a possible explanation:

Measuring Query Performance : "Execution Plan Query Cost" vs "Time Taken"

That question led me to parallelism hints. I tried using /*+ no_parallel */ in the cost-10844 query, but neither the cost nor the execution time changed, so I'm not sure that parallelism is the explanation for the faster execution time but higher cost. Then I tried the /*+ parallel(n) */ hint and found that the higher the value of n, the lower the cost of the query. In the case of the cost-10844 query, /*+ parallel(140) */ dropped the cost to 97, with a very minor increase in execution time.
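For concreteness, the two hinted variants looked roughly like this (again with placeholder table and column names; only the hints are exactly as described):

    -- Variant 1: suppress parallelism. Neither the reported cost
    -- nor the execution time changed for my query.
    SELECT /*+ no_parallel */ subject_id, COUNT(*)
      FROM study_results
     GROUP BY subject_id;

    -- Variant 2: request a very high degree of parallelism.
    -- This dropped the reported cost from 10844 to 97.
    SELECT /*+ parallel(140) */ subject_id, COUNT(*)
      FROM study_results
     GROUP BY subject_id;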

This seemed like an ideal "cheat" to meet the requirements that our IT department set forth, but then I read this:

http://www.oracle.com/technetwork/articles/datawarehouse/twp-parallel-execution-fundamentals-133639.pdf

The article contains this sentence:

Parallel execution can enable a single operation to utilize all system resources.

So, my questions are:

  1. Am I actually placing more strain on server resources by using the /*+ parallel(n) */ hint with a very high degree of parallelism, even though I am lowering the cost?

  2. Assuming no parallelism, is execution speed a better measure of resources used than cost?

asked Sep 03 '14 by anbisme

People also ask

How can reduce parallelism cost in SQL Server?

Using SQL Server Management Studio: in Object Explorer, right-click a server and select Properties. Click the Advanced node. Under Parallelism, change the Cost Threshold for Parallelism option to the value you want (type or select a value from 0 to 32767).
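The same option can be set in T-SQL via sp_configure; the threshold value 50 below is only an example:

    -- Cost Threshold for Parallelism is an advanced option,
    -- so advanced options must be made visible first.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;

    -- Example value; tune for your workload.
    EXEC sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;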

Why does a query run faster the second time?

What you are experiencing is called caching. The database doesn't have to go to disk the second time: it can get the data from its own buffer cache, or the operating system/disk array can serve the data faster from its own cache.
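One way to see this effect (on a test system only, since it requires ALTER SYSTEM privileges and slows everyone else down) is to flush the buffer cache between two runs of the same query:

    -- First run: blocks are read from disk into the buffer cache.
    SELECT COUNT(*) FROM some_large_table;

    -- Empty Oracle's buffer cache (test systems only). Note that the
    -- OS/storage cache may still serve some blocks.
    ALTER SYSTEM FLUSH BUFFER_CACHE;

    -- Slow again because the cache is cold; without the flush,
    -- this second run would normally be much faster.
    SELECT COUNT(*) FROM some_large_table;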

What is parallel query execution?

Parallel query is a method used to increase the execution speed of SQL queries by creating multiple query processes that divide the workload of a SQL statement and execute it in parallel, at the same time.


1 Answer

The rule your DBA gave you doesn't make a lot of sense. Worrying about the cost that is reported for a query is very seldom productive. First, you cannot directly compare the cost of two different queries: one query with a cost in the millions may run very quickly and consume very few system resources, while another query with a cost in the hundreds may run for hours and bring the server to its knees. Second, cost is an estimate. If the optimizer made an accurate estimate of the cost, that strongly implies it has come up with the optimal query plan, which would mean it is unlikely that you'd be able to modify the query to return the same results while using fewer resources. If the optimizer made an inaccurate estimate of the cost, that strongly implies it has come up with a poor query plan, in which case the reported cost would have no relationship to any useful metric you'd come up with. Most of the time, the queries you're trying to optimize are the ones where the optimizer generated an incorrect query plan because it incorrectly estimated the cost of various steps.

Tricking the optimizer with hints that may or may not actually change the query plan (depending on how parallelism is configured, for example) is very unlikely to solve a problem; it's much more likely to make the optimizer's estimates less accurate and make it more likely that it chooses a query plan that consumes far more resources than it needs to. A parallel hint with a high degree of parallelism, for example, tells Oracle to drastically reduce the cost of a full table scan, which makes it more likely that the optimizer will choose a full scan over an index scan. That is seldom something your DBAs would want to see.

If you're looking for a single metric that tells you whether a query plan is reasonable, I'd use the amount of logical I/O. Logical I/O correlates pretty well with actual query performance and with the amount of resources your query consumes. Looking at execution time can be problematic because it varies significantly based on what data happens to be cached (which is why queries often run much faster the second time they're executed), while logical I/O doesn't change based on what is in the cache. It also lets you scale your expectations as the number of rows your queries need to process changes. A query that aggregates data from 1 million rows, for example, should consume far more resources than a query that returns 100 rows from a table with no aggregation. If you're looking at logical I/O, you can easily scale your expectations to the size of the problem to figure out how efficient your queries could realistically be.
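If you have SQL*Plus, autotrace is a simple way to see the logical I/O for a query: "consistent gets" (plus "db block gets") is the number of logical reads. The query below is just a placeholder:

    -- Report execution statistics without printing the rows.
    SET AUTOTRACE TRACEONLY STATISTICS

    SELECT subject_id, COUNT(*)
      FROM study_results
     GROUP BY subject_id;

    SET AUTOTRACE OFF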

In Christian Antognini's "Troubleshooting Oracle Performance" (page 450), for example, he gives a rule of thumb that is pretty reasonable:

  • 5 logical reads per row that is returned/aggregated is probably very good
  • 10 logical reads per row that is returned/aggregated is probably adequate
  • 20+ logical reads per row that is returned/aggregated is probably inefficient and needs to be tuned

Different systems with different data models may merit tweaking the buckets a bit, but those are likely to be good starting points.
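As a rough sketch, you can compute that reads-per-row ratio for statements still in the shared pool from V$SQL (assuming you have access to the V$ views; the filtering is up to you):

    -- Logical reads per row processed for cached SQL statements.
    -- BUFFER_GETS counts logical I/O; ROWS_PROCESSED counts the rows
    -- returned/aggregated. NULLIF guards against division by zero.
    SELECT sql_id,
           executions,
           buffer_gets,
           rows_processed,
           ROUND(buffer_gets / NULLIF(rows_processed, 0), 1) AS reads_per_row
      FROM v$sql
     WHERE executions > 0
     ORDER BY buffer_gets DESC;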

My guess is that if you're researchers rather than developers, you're probably running queries that need to aggregate or fetch relatively large data sets, at least in comparison to the queries application developers commonly write. If you're scanning a million rows of data to generate some aggregate results, your queries are naturally going to consume far more resources than an application developer's queries that read or write a handful of rows. Your queries may be just as efficient from a logical-I/O-per-row perspective; you may simply be looking at many more rows.

If you are running queries against the live production database, you may well be in a situation where it makes sense to start segregating workload. Most organizations reach a point where running reporting queries against the live database starts to create issues for the production system. One common solution to this sort of problem is to create a separate reporting database that is fed from the production system (either via a nightly snapshot or by an ongoing replication process) where reporting queries can run without disturbing the production application. Another common solution is to use something like Oracle Resource Manager to limit the amount of resources available to one group of users (in this case, report developers) in order to minimize the impact on higher priority users (in this case, users of the production system).
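As a very rough sketch of the Resource Manager approach (all names here are placeholders, a real plan also needs consumer-group mappings and more directives, and this is a DBA task rather than something researchers would run):

    BEGIN
      DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

      -- A consumer group for ad-hoc research/reporting queries.
      DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
        consumer_group => 'REPORTING',
        comment        => 'Ad-hoc research queries');

      DBMS_RESOURCE_MANAGER.CREATE_PLAN(
        plan    => 'DAYTIME_PLAN',
        comment => 'Limit reporting impact on production users');

      -- Give production users priority at level 1...
      DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
        plan             => 'DAYTIME_PLAN',
        group_or_subplan => 'OTHER_GROUPS',
        comment          => 'Production users',
        mgmt_p1          => 100);

      -- ...and give reporting leftover CPU plus a parallelism cap.
      DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
        plan                     => 'DAYTIME_PLAN',
        group_or_subplan         => 'REPORTING',
        comment                  => 'Low priority, limited parallelism',
        mgmt_p2                  => 100,
        parallel_degree_limit_p1 => 4);

      DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
      DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
    END;
    /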

answered Nov 10 '22 by Justin Cave