
Is ora_hash deterministic?

I am working with an Oracle database and I need to be able to partition the data in a table. I understand that Oracle has an ora_hash function that can partition the data into buckets. Is the ora_hash function deterministic?

In my program I will be making several different database queries with each query asking for a different bucket number.

For example, in one query I might ask for the first two buckets:

SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (0,1);

In a subsequent query I might ask for the second and third buckets:

SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (1,2);

In the above example, will ora_hash always divide the table into the exact same 10 buckets? Assume that the data in the tables hasn't changed. Will the second bucket (bucket 1) be identical in both queries?

There is documentation that suggests that the seed value enables Oracle to return different results for the same data set. So I am assuming that if I don't use a seed value, then ora_hash will be deterministic. See the documentation.

timmy asked Feb 26 '12 03:02



1 Answer

ORA_HASH is definitely deterministic for data types that can be used for partitioning, such as NUMBER, VARCHAR, DATE, etc.

But ORA_HASH is not deterministic for at least some of the other data types, such as CLOB.


My answer is based on this Jonathan Lewis article about ORA_HASH.

Jonathan Lewis doesn't explicitly say they are deterministic, but he does mention that ORA_HASH "seems to be the function used internally – with a zero seed – to determine which partition a row belongs to in a hash partitioned table". And if it's used for hash partitioning then it must be deterministic, or else partition-wise joins wouldn't work.
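One informal way to check this on your own data is to compute the bucket distribution and run the same statement again. This is only a sketch: it assumes the sales table and cust_id column from the question, and that cust_id is one of the simple data types listed above. If the data hasn't changed, each bucket should report the same row count on every run.

```sql
-- Sketch only: sales and cust_id are taken from the question.
-- Run this twice on unchanged data and compare the two result sets.
SELECT ORA_HASH(cust_id, 9) AS bucket,     -- buckets 0 through 9
       COUNT(*)             AS row_count
FROM   sales
GROUP  BY ORA_HASH(cust_id, 9)
ORDER  BY bucket;
```

Matching counts are consistent with a deterministic function but don't prove it, so treat this as a sanity check rather than a guarantee.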

To show that ORA_HASH can be non-deterministic for some data types, run the query below. It's from a comment in the same article:

WITH src AS (
  SELECT TO_CLOB('42') AS val FROM dual CONNECT BY LEVEL <= 5
)
SELECT val, ORA_HASH(val, 7) FROM src ORDER BY 2;

Surprisingly, the same issue happens with dbms_sqlhash.gethash.
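A more compact version of the CLOB test is to count the distinct hash values produced for the five identical rows. This is a sketch built on the same query. The TO_CHAR column is my own addition as a control, on the assumption that hashing the value as a character string behaves like the simple data types above. If ORA_HASH were deterministic for CLOBs, clob_buckets would be 1.

```sql
-- Sketch: five identical CLOB values, hashed both as CLOB and as a character string.
WITH src AS (
  SELECT TO_CLOB('42') AS val FROM dual CONNECT BY LEVEL <= 5
)
SELECT COUNT(DISTINCT ORA_HASH(val, 7))          AS clob_buckets,  -- more than 1 means non-deterministic
       COUNT(DISTINCT ORA_HASH(TO_CHAR(val), 7)) AS char_buckets   -- control: same data converted to a string
FROM   src;
```

With only 8 buckets, two different hash values can still land in the same bucket, so a result of 1 for clob_buckets doesn't rule out non-determinism. Dropping the second argument (ORA_HASH(val)) uses the full range of buckets and makes differences easier to see. The result may also vary by Oracle version.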

Jon Heller answered Oct 19 '22 21:10