Oracle SQL execution plan changes due to SYS_OP_C2C internal conversion

Tags:

I'm wondering why cost of this query

select * from address a
left join name n on n.adress_id=a.id
where a.street='01';

is higher than

select * from address a
left join name n on n.adress_id=a.id
where a.street=N'01';

where address table looks like this

ID              NUMBER
STREET          VARCHAR2(255 CHAR)
POSTAL_CODE     VARCHAR2(255 CHAR)

and name table looks like this

ID              NUMBER
ADDRESS_ID      NUMBER
NAME            VARCHAR2(255 CHAR)
SURNAME         VARCHAR2(255 CHAR)

These are costs returned by explain plan

Explain plan for '01'

-----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |  3591 |  1595K|    87   (0)| 00:00:02 |
|   1 |  NESTED LOOPS OUTER          |                      |  3591 |  1595K|    87   (0)| 00:00:02 |
|*  2 |   TABLE ACCESS FULL          | ADDRESS              |     3 |   207 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| NAME                 |  1157 |   436K|    47   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | NAME_HSI             |  1157 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."STREET"='01')
   4 - access("N"."ADDRESS_ID"(+)="A"."ID")

Explain plan for N'01'

-----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |   347 |   154K|    50   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER          |                      |   347 |   154K|    50   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL          | ADDRESS              |     1 |    69 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| NAME                 |  1157 |   436K|    47   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | NAME_HSI             |  1157 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(SYS_OP_C2C("A"."STREET")=U'01')
   4 - access("N"."ADDRESS_ID"(+)="A"."ID")

As you can see cost for N'01' query is lower than cost for '01'. Any idea why? N'01' needs additionally convert varchar to nvarchar so cost should be higher (SYS_OP_C2C()). The other question is why rows processed by N'01' query is lower than '01'?

[EDIT]

Table address has 30 rows.
Table name has 19669 rows.

254

asked Oct 10 '14 09:10

pepuch

2 Answers

SYS_OP_C2C is an internal function which does an implicit conversion of varchar2 to national character set using TO_NCHAR function. Thus, the filter completely changes as compared to the filter using normal comparison.

I am not sure about the reason why the number of rows are less, but I can guarantee it could be more too. Cost estimation won't be affected.

Let's try to see step-by-step in a test case.

SQL> CREATE TABLE t AS SELECT 'a'||LEVEL col FROM dual CONNECT BY LEVEL < 1000;

Table created.

SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = 'a10';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     5 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |     5 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

   1 - filter("COL"='a10')

13 rows selected.

SQL>

So far so good. Since there is only one row with value as 'a10', optimizer estimated one row.

Let's see with the national characterset conversion.

SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE col = N'a10';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |    50 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    10 |    50 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

   1 - filter(SYS_OP_C2C("COL")=U'a10')

13 rows selected.

SQL>

What happened here? We can see filter(SYS_OP_C2C("COL")=U'a10'), which means an internal function is applied and it converts the varchar2 value to nvarchar2. The filter now found 10 rows.

This will also suppress any index usage, since now a function is applied on the column. We can tune it by creating a function-based index to avoid full table scan.

SQL> create index nchar_indx on t(to_nchar(col));

Index created.

SQL>
SQL> EXPLAIN PLAN FOR SELECT * FROM t WHERE to_nchar(col) = N'a10';

Explained.

SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1400144832

--------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |            |    10 |    50 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T          |    10 |    50 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | NCHAR_INDX |     4 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------

   2 - access(SYS_OP_C2C("COL")=U'a10')

14 rows selected.

SQL>

However, will this make the execution plans similar? No. i think with two different charactersets , the filter will not be applied alike. Thus, the difference lies.

My research says,

Usually, such scenarios occur when the data coming via an application is nvarchar2 type, but the table column is varchar2. Thus, Oracle applies an internal function in the filter operation. My suggestion is, to know your data well, so that you use similar data types during design phase.

answered Sep 16 '22 13:09

Lalit Kumar B

When worrying about explain plans, it matters whether there are current statistics on the tables. If the statistics do not represent the actual data reasonably well, then the optimizer will make mistakes and estimate cardinalities incorrectly.

You can check how long ago statistics were gathered by querying the data dictionary:

select table_name, last_analyzed
  from user_tables
 where table_name in ('ADDRESS','NAME');

You can gather statistics for the optimizer to use by calling DBMS_STATS:

begin
   dbms_stats.gather_table_stats(user, 'ADDRESS');
   dbms_stats.gather_table_stats(user, 'NAME');
end;

So perhaps after gathering statistics you will get different explain plans. Perhaps not.

The difference in your explain plans is primarily because the optimizer estimates how many rows it will find in address table differently in the two cases.

In the first case you have an equality predicate with same datatype - this is good and the optimizer can often estimate cardinality (row count) reasonably well for cases like this.

In the second case a function is applied to the column - this is often bad (unless you have function based indexes) and will force the optimizer to take a wild guess. That wild quess will be different in different versions of Oracle as the developers of the optimizer tries to improve upon it. Some versions the wild guess will simply be something like "I guess 5% of the number of rows in the table."

When comparing different datatypes, it is best to avoid implicit conversions, particularly when like this case the implicit conversion makes a function on the column rather than the literal. If you have cases where you get a value as datatype NVARCHAR2 and need to use it in a predicate like above, it can be a good idea to explicitly convert the value to the datatype of the column.

select * from address a
left join name n on n.adress_id=a.id
where a.street = CAST( N'01' AS VARCHAR2(255));

In this case with a literal it does not make sense, of course. Here you would just use your first query. But if it was a variable or function parameter, maybe you could have use cases for doing something like this.

answered Sep 17 '22 13:09

Kim Berg Hansen

Related questions
                            
                                Run SQL query on MySQL service startup
                            
                                Atomically set SERIAL value when committing transaction
                            
                                refresh materialized view periodically postgres
                            
                                How can I pass array as a sql query param for cosmos DB query
                            
                                How to use Inno Setup to update a database using .sql script
                            
                                Keeping Track of User Points (Like SO)
                            
                                Public SQL database for educational purposes
                            
                                How to generate a virtual table to generate a sequence of dates in PostgreSQL?
                            
                                Speeding up a group by date query on a big table in postgres
                            
                                Managing hierarchies in SQL: MPTT/nested sets vs adjacency lists vs storing paths
                            
                                Is the PDO Library faster than the native MySQL Functions?
                            
                                How to concatenate % and column name in a LIKE statement
                            
                                Can I get the results of a stored procedure into a cursor within another stored procedure in SQL
                            
                                MS SQL Server pivot table with subquery in column clause
                            
                                SQL remove autonamed constraints
                            
                                Remove duplicate fields from a temp table that has no primary key [duplicate]
                            
                                Retrieving timestamp of MySQL row creation from meta data?
                            
                                How to create a table inside of a procedure in Oracle?
                            
                                equivalent of PostgreSQL type "timestamp without time zone" in SQL Server
                            
                                SQL one to many with inner join

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Oracle SQL execution plan changes due to SYS_OP_C2C internal conversion

Tags:

sql

oracle

oracle11g

pepuch

People also ask

2 Answers

Lalit Kumar B

Kim Berg Hansen

Recent Activity

Donate For Us