I am working with an Oracle database and I need to be able to partition the data in a table. I understand that Oracle has an ora_hash function that can partition the data into buckets. Is the ora_hash function deterministic?
In my program I will be making several different database queries with each query asking for a different bucket number.
For example, in one query I might ask for the first two buckets:
SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (0,1);
In a subsequent query I might ask for the second and third buckets:
SELECT * FROM sales WHERE ORA_HASH(cust_id, 9) in (1,2);
In the above example, will ora_hash always divide the table into the exact same 10 buckets? Assume that the data in the table hasn't changed. Will the second bucket (bucket 1) be identical in both queries?
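One way to check this empirically is to compute a small fingerprint per bucket and compare the output across runs. This is only a sketch: it assumes cust_id is a NUMBER column (the question does not say), and SUM(cust_id) is a rough checksum rather than proof that the rows are identical.

```sql
-- Per-bucket row count and key checksum for buckets 0..9.
-- Assumption: cust_id is numeric, so SUM(cust_id) is a valid (rough) checksum.
-- Buckets that contain no rows simply do not appear in the output.
SELECT ORA_HASH(cust_id, 9) AS bucket,
       COUNT(*)             AS row_count,
       SUM(cust_id)         AS key_sum
FROM   sales
GROUP  BY ORA_HASH(cust_id, 9)
ORDER  BY bucket;
```

If the data has not changed and the function is deterministic for the column's data type, two runs of this query should return the same rows.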
There is documentation that suggests that the seed value enables Oracle to return different results for the same data set. So I am assuming that if I don't specify a seed value, then ora_hash will be deterministic. See the documentation.
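To make the role of the seed concrete: the function takes an optional third argument, ORA_HASH(expr, max_bucket, seed_value), and the seed defaults to 0 when omitted. The query below is a minimal sketch against the sales table from the question; the seed value 5 is arbitrary and chosen only for illustration.

```sql
-- Compare bucket assignments with the default seed, an explicit seed of 0,
-- and a different seed. The first two columns should always match;
-- the third may assign the same cust_id to a different bucket.
SELECT cust_id,
       ORA_HASH(cust_id, 9)    AS bucket_default_seed,
       ORA_HASH(cust_id, 9, 0) AS bucket_seed_0,
       ORA_HASH(cust_id, 9, 5) AS bucket_seed_5   -- 5 is an arbitrary example seed
FROM   sales
WHERE  ROWNUM <= 10;
```

The seed changes which bucket a value lands in; it is a fixed input, not a source of randomness between runs.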
ORA_HASH is a function that computes a hash value for a given expression. This function is useful for operations such as analyzing a subset of data and generating a random sample. The expr argument determines the data for which you want Oracle Database to compute a hash value.
A well-known hash function in Oracle is the SQL function ORA_HASH. This is a useful function for distributing data across multiple subsets, but the generated hash keys are far from unique. Many people are impressed by the maximum number of buckets (i.e. the number of possible return values) of this hash function.
STANDARD_HASH computes a hash value for a given expression using one of several hash algorithms that are defined and standardized by the National Institute of Standards and Technology.
ORA_HASH is definitely deterministic for data types that can be used for partitioning, such as NUMBER, VARCHAR, DATE, etc.
But ORA_HASH is not deterministic for at least some of the other data types, such as CLOB.
My answer is based on this Jonathan Lewis article about ORA_HASH.
Jonathan Lewis doesn't explicitly say that it is deterministic for those data types, but he does mention that ORA_HASH "seems to be the function used internally – with a zero seed – to determine which partition a row belongs to in a hash partitioned table". And if it's used for hash partitioning then it must be deterministic, or else partition-wise joins wouldn't work.
To show that ORA_HASH can be non-deterministic for some data types, run the query below. It's from a comment in the same article:
with src as (select to_clob('42') val from dual connect by level<=5)
select val,ora_hash(val,7) from src order by 2;
Surprisingly, the same issue happens with dbms_sqlhash.gethash.
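For contrast with the CLOB query above, the same test can be run on a NUMBER, and on a VARCHAR2 prefix taken from the CLOB, which is one possible workaround when a LOB column has to be bucketed. This is a sketch, not something from the article: the 2000-character prefix length is an arbitrary assumption, and any two CLOBs sharing that prefix will land in the same bucket.

```sql
-- Same shape as the CLOB test, but hashing scalar values instead of the LOB itself.
WITH src AS (
  SELECT 42 AS num_val, TO_CLOB('42') AS clob_val
  FROM   dual
  CONNECT BY LEVEL <= 5
)
SELECT num_val,
       ORA_HASH(num_val, 7)                            AS num_bucket,
       -- DBMS_LOB.SUBSTR(lob, amount, offset) returns a VARCHAR2 for a CLOB;
       -- the amount of 2000 characters is an assumed, arbitrary prefix length.
       ORA_HASH(DBMS_LOB.SUBSTR(clob_val, 2000, 1), 7) AS clob_prefix_bucket
FROM   src
ORDER  BY 2;
```

If the reasoning above holds, each of the two bucket columns should show a single value across all five rows, unlike the direct hash of the CLOB.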