 

v$sql_plan_monitor - Terribly inaccurate JOIN estimate?

I'm running into a bit of an odd phenomenon when it comes to the v$sql_plan_monitor table on Oracle 11.2.

I have two decently sized tables. One has about 25 million rows, the other about 35 million rows, both are ~99% unique with just an odd handful of duplicate records.

The explain plan is as follows (table names substituted for privacy, tables had statistics gathered immediately before explain plan):

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                      |       |       | 65611 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE            |                      |     1 |    34 |            |          |        |      |            |
|   2 |   PX COORDINATOR           |                      |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)     | :TQ10002             |     1 |    34 |            |          |  Q1,02 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE         |                      |     1 |    34 |            |          |  Q1,02 | PCWP |            |
|*  5 |      FILTER                |                      |       |       |            |          |  Q1,02 | PCWC |            |
|*  6 |       HASH JOIN OUTER      |                      |   234K|  7770K| 65611   (1)| 00:19:41 |  Q1,02 | PCWP |            |
|   7 |        PX RECEIVE          |                      |    23M|   513M| 26409   (1)| 00:07:56 |  Q1,02 | PCWP |            |
|   8 |         PX SEND HASH       | :TQ10000             |    23M|   513M| 26409   (1)| 00:07:56 |  Q1,00 | P->P | HASH       |
|   9 |          PX BLOCK ITERATOR |                      |    23M|   513M| 26409   (1)| 00:07:56 |  Q1,00 | PCWC |            |
|* 10 |           TABLE ACCESS FULL| PRETTY_BIG_TABLE     |    23M|   513M| 26409   (1)| 00:07:56 |  Q1,00 | PCWP |            |
|  11 |        PX RECEIVE          |                      |    36M|   384M| 39164   (1)| 00:11:45 |  Q1,02 | PCWP |            |
|  12 |         PX SEND HASH       | :TQ10001             |    36M|   384M| 39164   (1)| 00:11:45 |  Q1,01 | P->P | HASH       |
|  13 |          PX BLOCK ITERATOR |                      |    36M|   384M| 39164   (1)| 00:11:45 |  Q1,01 | PCWC |            |
|* 14 |           TABLE ACCESS FULL| EVEN_BIGGER_TABLE    |    36M|   384M| 39164   (1)| 00:11:45 |  Q1,01 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------

The number that's giving me grief is the Rows value for the HASH JOIN OUTER step.
Oracle estimates it will output around 234k rows, a relatively small number. I know for a fact that the query returns around 50k rows after filtering* the results, as it has been run before against the same data for testing purposes.

*: The actual query itself is an anti-join using a LEFT JOIN and a WHERE to filter the NULL records.
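
Roughly, the query has this shape (the join key name has been changed along with the table names):

-- Simplified shape of the actual query: count rows in PRETTY_BIG_TABLE
-- that have no match in EVEN_BIGGER_TABLE (join_key is a stand-in name).
SELECT COUNT(*)
FROM pretty_big_table p
LEFT JOIN even_bigger_table e
  ON p.join_key = e.join_key
WHERE e.join_key IS NULL;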

However, once the query is running, I check its sql_id in the v$sql_plan_monitor table:

SELECT
    plan_line_id,
    plan_operation,
    ROUND(MAX(plan_cardinality) / 1000) AS est_krows,
    ROUND(SUM(output_rows) / 1000) AS actual_krows
FROM v$sql_plan_monitor
WHERE sql_id = 'sql_id_goes_here'
GROUP BY sql_id, sql_exec_id, sql_exec_start, plan_line_id, plan_operation
ORDER BY sql_exec_id, plan_line_id;

PLAN_LINE_ID PLAN_OPERATION                  EST_KROWS ACTUAL_KROWS
------------ ------------------------------ ---------- ------------
           0 SELECT STATEMENT                                     0
           1 SORT                                    0            0
           2 PX COORDINATOR                                       0
           3 PX SEND                                 0            0
           4 SORT                                    0            0
           5 FILTER                                               0
           6 HASH JOIN                             234     23084866
           7 PX RECEIVE                          23402        23168
           8 PX SEND                             23402        23168
           9 PX BLOCK                            23402        23168
          10 TABLE ACCESS                        23402        23168
          11 PX RECEIVE                          36699        17772
          12 PX SEND                             36699        17748
          13 PX BLOCK                            36699        17748
          14 TABLE ACCESS                        36699        17748

Mind you, the query is still in progress, so the actual_krows value is growing.

So my questions are:

  1. Why is the actual row count from the HASH JOIN off by five orders of magnitude?
  2. What can I do to either have a more accurate estimate or a more accurate reading of the actual row output progress? It only seems to be giving me terrible results on JOINs, nothing else.
asked Sep 30 '22 by Mr. Llama


1 Answer

Why is the estimate wrong?

Because, theoretically, it is impossible to predict if a program will ever finish, much less predict how long it will take. And, practically, estimates are difficult, and Oracle only has time for satisficing; Oracle doesn't know if the query will be submitted once per day or a thousand times per second, and can't afford to spend a lot of time deciding.

How can we improve the estimate?

It might help to see the whole query and have some information about the table structures and data distribution. That's a lot of information and there's no guarantee it will help. Instead, here are a bunch of methods that may be useful for adjusting the cardinality. Depending on your query, session, environment, etc., not all of them will be helpful.

  1. ASSOCIATE STATISTICS It's hard enough to estimate declarative code; Oracle doesn't try very hard when procedural code is involved. If the query calls a custom function, the default estimate will be poor, but you can associate a custom selectivity with the function to change the estimates (see the ASSOCIATE STATISTICS sketch after this list). In some rare cases it may even be worth replacing a complicated SQL expression with a function that has associated statistics.
  2. Fake statistics DBMS_STATS.SET_COLUMN_STATS and other functions allow you to change the inputs to the estimation algorithms (see the SET_COLUMN_STATS sketch after this list). But be careful that your fix for this one query doesn't break other queries that have perfectly reasonable estimates.
  3. Extended statistics As ibre5041 alluded to, column groups or expressions can be difficult to estimate. Instead, you can have Oracle gather statistics on those groups and expressions (see the extended-statistics sketch after this list). Then, when they are used in a query, the estimate may be much better.
  4. Rewrite conditions Some types of expressions are harder to estimate than others. Try to refactor your expressions if possible. For example, some complicated NVL expressions can sometimes be better written with an OR (see the NVL example after this list).
  5. SQL Profiles "A SQL profile is a database object that contains auxiliary statistics specific to a SQL statement." For example, the table statistics may imply only 10% of the rows join, and the profile may say "multiply that by 1000".
  6. Undocumented hints The OPT_ESTIMATE and CARDINALITY hints can help make up for bad estimates (see the hint sketches after this list). OPT_ESTIMATE is what SQL profiles use, and it is a good way to say "hey, increase that cardinality by 1000%". CARDINALITY is an easy way to say "this whole query is going to return X rows." But these hints are difficult to use.
  7. Dynamic Sampling A hint like /*+ dynamic_sampling(4) */ is a way of telling the optimizer "this is an expensive query, take the time to read the existing data, try it out, and adjust the numbers". At least, that's the theory. In practice it's not always very helpful.
  8. Cardinality feedback Run the statement twice; if the cardinality is significantly wrong, Oracle may fix it the second time around.
  9. Adaptive Query Optimization 12c introduced a feature where execution plans will occasionally check the actual number of rows and fix themselves if the estimate is wrong. It doesn't fix the root cause, and it's not available to you on 11.2 yet, but it sounds cool and may be a good reason to start thinking about an upgrade.
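
To make item 1 concrete, here is a minimal sketch of associating a default selectivity with a function; my_filter_func is a hypothetical user-defined function and the 1% figure is made up for illustration:

-- Hypothetical function: tell the optimizer it keeps about 1% of rows
-- (DEFAULT SELECTIVITY is a percentage from 0 to 100) instead of the
-- built-in default guess.
ASSOCIATE STATISTICS WITH FUNCTIONS my_filter_func DEFAULT SELECTIVITY 1;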
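
For item 2, a sketch of overriding column statistics with DBMS_STATS.SET_COLUMN_STATS; the owner, table, column, and numbers are all placeholders, not recommendations:

-- Illustrative only: change what the optimizer believes about JOIN_KEY
-- on PRETTY_BIG_TABLE (APP_OWNER and JOIN_KEY are made-up names).
-- distcnt and density feed directly into the join cardinality estimate.
BEGIN
  DBMS_STATS.SET_COLUMN_STATS(
    ownname       => 'APP_OWNER',
    tabname       => 'PRETTY_BIG_TABLE',
    colname       => 'JOIN_KEY',
    distcnt       => 23000000,
    density       => 1 / 23000000,
    no_invalidate => FALSE   -- make existing cursors re-parse
  );
END;
/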
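
For item 3, a sketch of a column-group extension; COL_A and COL_B are placeholder column names:

-- Illustrative only: tell Oracle that COL_A and COL_B are correlated,
-- so their combined selectivity is not simply the product of the two.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
         ownname   => 'APP_OWNER',
         tabname   => 'EVEN_BIGGER_TABLE',
         extension => '(COL_A, COL_B)') AS extension_name
FROM dual;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'EVEN_BIGGER_TABLE',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO FOR COLUMNS (COL_A, COL_B)');
END;
/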
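
For item 4, an example of the NVL-to-OR rewrite; some_table, the status column, and the bind are placeholders:

-- Harder to estimate: the NVL hides two distinct cases in one expression.
SELECT COUNT(*)
FROM some_table t
WHERE NVL(t.status, 'NONE') = :val;

-- Logically equivalent, and often estimated better:
SELECT COUNT(*)
FROM some_table t
WHERE t.status = :val
   OR (t.status IS NULL AND :val = 'NONE');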
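
For item 6, sketches of both hints against the tables from the question; join_key is a stand-in column name, the numbers are made up, and since these hints are undocumented the exact OPT_ESTIMATE syntax may vary between versions:

-- CARDINALITY: state outright how many rows a row source will produce.
SELECT /*+ CARDINALITY(p 23000000) */ COUNT(*)
FROM pretty_big_table p
LEFT JOIN even_bigger_table e ON p.join_key = e.join_key
WHERE e.join_key IS NULL;

-- OPT_ESTIMATE: scale the optimizer's own join estimate instead of
-- replacing it; SCALE_ROWS=100 means "100 times more rows than you think".
SELECT /*+ OPT_ESTIMATE(JOIN (p e) SCALE_ROWS=100) */ COUNT(*)
FROM pretty_big_table p
LEFT JOIN even_bigger_table e ON p.join_key = e.join_key
WHERE e.join_key IS NULL;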

Do we even need to fix the estimate?

It's wise to be concerned with cardinality. Bad cardinality estimates are responsible for many performance problems. But in many cases the cardinality can be wrong by orders of magnitude and not matter.

I don't see any obvious problems with the execution plan. The two large tables are accessed the right way (a full table scan is better when most of the rows will be used), the join method is good (a hash join is best for joining many rows), the join order is good (the smaller of the two tables, PRETTY_BIG_TABLE, is hashed as the first child of the join, and the larger one, EVEN_BIGGER_TABLE, is probed as the second child), and the parallelism is good (parallelism is used in every step, there are no broadcasts of huge row sources, etc.).

If that execution plan is the whole story, I'd call it a success.

5 orders of magnitude off doesn't matter sometimes, especially when the mistake is near the end of the execution plan. And 234K is a large enough number to stop a lot of mistakes, like a bad cross join.

However, if this is only part of a larger query or a view, then the resulting cardinality may affect other execution plans.

answered Oct 02 '22 by Jon Heller