Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does numeric comparison on Oracle VARCHAR column work?

I have a table where two columns are of type VARCHAR2(3BYTE) and VARCHAR2(32BYTE). When I do a select query (where col1=10 and where col1='10') or (where col2=70001 or col2='70001') the number of records fetched are the same in each set of where clauses. How does this happen? How does Oracle treat string literals and numeric constants and compare to the data despite column data-type?

But this does not work for a column of type VARCHAR2(128BYTE). The query needed to be where col3='55555555001' to work and where col3=55555555001 throws ORA-01722 error.

like image 307
James Jithin Avatar asked Dec 06 '22 09:12

James Jithin


1 Answers

As noted in the SQL Language Reference:

  • During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.
  • ...
  • When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.

Implicit conversion is performed on the table column when the types don't match. This can be seen by tracing in SQL*Plus, with some dummy data.

create table t42 (foo varchar2(3 byte));
insert into t42 (foo) values ('10');
insert into t42 (foo) values ('2A');
set autotrace on explain

This works:

select * from t42 where foo = '10';

FOO
---
10

Execution Plan
----------------------------------------------------------
Plan hash value: 3843907281

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     3 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T42  |     1 |     3 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FOO"='10')

Note
-----
   - dynamic sampling used for this statement (level=2)

But this errors:

select * from t42 where foo = 10;

ERROR:
ORA-01722: invalid number



Execution Plan
----------------------------------------------------------
Plan hash value: 3843907281

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     3 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T42  |     1 |     3 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER("FOO")=10)

Note the difference in the filter; filter("FOO"='10') versus filter(TO_NUMBER("FOO")=10). In the latter case, comparing against a number, a to_number() is being performed against every row in the table the the result of that conversion is compared against the fixed value. So if any of the character values cannot be converted, you'll get an ORA-01722. The function being applied will also stop an index being used, if one is present on that column.

Where it gets interesting is if you have more than one filter. Oracle may evaluate them in different orders at different times, so you might not always see the ORA-01722, and it'll pop up sometimes. Say you had where foo = 10 and bar = 'X'. If Oracle thought it could filter out the non-X values first, it would only apply the to_number() to what's left, and that smaller sample might not have non-numeric values in foo. But if you has and bar = 'Y', the non-Y values might include non-numerics, or Oracle might filter on foo first, depending on how selective it thinks the values are.

The moral is to never store numeric information as a character type.


I was looking for an AskTom reference to back up the moral, and the first one I looked at conveniently refers to the effect of "a change in the order of a predicate" as well as saying "don't store numbers in varchar2's".

like image 121
Alex Poole Avatar answered Mar 16 '23 00:03

Alex Poole