I have a query with
STANDARD_HASH(Column1|Column2|ColumnN, 'MD5') AS HashValue
for each row in a table.
Is it possible to GROUP by ColumnX and produce a Aggregate of the hash values?
I've tried LISTAGG but unfortunately that results in too large a concatenation to hash without error (it's possible I'm not applying this correctly though).
PSEUDO ORACLE presentation of what I'd like to achieve:
SELECT ColumnX, UNKNOWN_AGGREGATE_FUNC(STANDARD_HASH(Column1|Column2|ColumnN, 'MD5'))
FROM TableY
GROUP BY ColumnX
Essentially my reasoning for this, is ideally I'd like to batch up, by ColumnX,the hash values so that I can transport the batch results over the wire for comparison with an external system and therefore not necessarily have to transport every row in the event the batches match.
Convert the hashes into numbers, sum the hashes per batch, and then compare the sums.
--drop table tableY;
create table tableY(columnX number, column1 number, column2 number, columnN number);
insert into tableY values(1, 1,1,1);
insert into tableY values(1, 2,2,2);
insert into tableY values(2, 1,1,1);
insert into tableY values(2, 2,2,2);
insert into tableY values(3, 1,1,1);
The conversion gets a bit ugly. One of the main limitations is numeric precision, which is the reason for the SUBSTR(..., 1, 15).
Substringing and summing hashes like this probably loses some important cryptographic properties but this should be good enough for comparisons.
select
columnX,
trim(
upper(
to_char(hash_sum, 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
)
) hash_sum
from
(
select columnX,
sum(
to_number(
substr(
standard_hash(column1||column2||columnN, 'MD5')
, 1, 15
)
, 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
)
) hash_sum
from tableY
group by columnX
);
COLUMNX HASH_SUM
------- --------
1 1264B85078436A7F
2 1264B85078436A7F
3 698D51A19D8A121
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With