Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When does the Oracle CBO choose to execute a "merge join cartesian" operation?

From time to time, Oracle seems to prefer a MERGE JOIN CARTESIAN operation over a regular MERGE JOIN. Knowing the data and looking at concrete execution plans, I can see that this operation is usually not a problem, as one of the joined entities can return only exactly one record in the query at hand.

However, for historic reasons, our DBAs have a general distaste for cartesian products.

So I'd like to better analyse those cases and be backed up with documentation in my argumentation. Is there any official Oracle documentation about query transformation and the CBO where I can understand the cases when Oracle prefers the MERGE JOIN CARTESIAN (or similar) operation?

In this case, I'm Using Oracle 11g (11.2.0.2.0)

UPDATE:

These are similar questions, but they don't explain why or when Oracle prefers the MJC over a regular MERGE JOIN:

  • Why does this query result in a MERGE JOIN CARTESIAN in Oracle?
  • Why would this query cause a Merge Cartesian Join in Oracle
like image 931
Lukas Eder Avatar asked Nov 16 '11 16:11

Lukas Eder


People also ask

What is merge join Cartesian in oracle?

A Cartesian join is basically a sort-merge join, and it shows up as MERGE JOIN CARTESIAN in the execution plan. It is Oracle's fall-back plan: if there is no join predicate, then it has no alternative as every row in the driving row source matches each and every row in the probe row source.

How avoid merge Cartesian join in Oracle?

Tip: For testing for unnecessary Cartesian Merge Joins, try adding the RULE hint to your SQL. If you have a problem with an unnecessary Cartesian join, the RULE hint will cause the explain plan to replace the cartesian merge join with a nested loops or hash join.

How avoid Cartesian join in SQL?

To avoid a Cartesian product, you must specify how the tables should be combined. Typically, you want to pair rows based on matching values in one or more key columns of each table.

Why is it called a Cartesian join?

The Cartesian product is named after René Descartes, whose formulation of analytic geometry gave rise to the concept, which is further generalized in terms of direct product.


1 Answers

Yes, the mention of cartesian joins usually makes a DBA's heart skip a beat. The cartesian joins caused by missing join conditions are definitely a pain to deal with - these are the types of joins that can "blow up" temp space and cause all types of alarms to go off.

I didn't find anything in Oracle's official 11g documentation on this particular join method but I did find plenty of articles about issues with it in their support DB. I have chased down a few of these in the past couple of weeks and here is what I found.

The source of the MJC is a CBO optimization. MJC is an optimization that works great when the cardinality of the result set being joined is low. The problem occurs when the Optimizer is not correctly estimating the cardinality of one or more of the result sets that are inputs to the join. If the estimated rows = 1 (or is a low number) but the actual rows for the result set is large then the optimizer may still choose a MJC resulting in a sub optimal plan. And that's an understatement. I have had issues with this happening and queries running for days and not finishing. After getting the CBO back on track they have run in seconds instead of hours or days.

The best way to find out if this Estimated Rows vs Actual Rows is the case is to run the query and view its execution plan statistics. You mentioned you are on 11g - use the SQL Monitoring feature. The output of this feature will show you how much time was spent on each step of your execution plan. It will also show you Estimated Rows vs Actual rows. You are looking for large discrepancies in Estimated Rows vs Actual Rows on the inputs for the MJC.

SQL Monitoring is available through OEM/DB Control, or you can use the API (search for DBMS_SQLTUNE.REPORT_SQL_MONITOR). The same sorts of info can be gathered using the GATHER_PLAN_STATISTICS hint with the query and then generating a report with DBMS_XPLAN... details are here to do that.

So how to get rid of it? Try to resolve object statistics issues. Once the CBO knows it is really dealing with hundreds, thousands, or millions of records as inputs to the join instead of '1' it should choose a join method more appropriate for the data set and not choose MJC. Easier said than done, books have been written on this topic but at least check out the basics - make sure all tables involved in the query at least have statistics. It may be possible to leverage Supplemental Statistics as well if you have multi-column expressions being applied in your where clause.

If you need a big hammer there are some hidden parameters that allow/disallow use of MJC. They can be implemented at a database level, at a session level, or query level (using hints). I'll leave the parameters names out as an exercise for the reader as Oracle's official stance is they should only be used under the direction of Support. Don't tell them but I have had some success eliminating MJC on the query level with an OPT_PARAM hint after attempts to get object statistics to cooperate failed.

like image 81
David Mann Avatar answered Sep 28 '22 02:09

David Mann