Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Check if a value belongs to a hash

Tags:

algorithm

I'm not sure if this is actually possible thus I ask here. Does anyone knows of an algorithm that would allow something like this?

const values = ['a', 'b', 'c', 'd'];
const hash = createHash(values); // => xjaks14sdffdghj23h4kjhgd9f81nkjrsdfg9aiojd
hash.includes('b'); // => true
hash.includes('v'); // => false

What this snippet does, is it first creates some sort of hash from a list of values, then checks if the certain value belongs to that hash.

like image 899
xpepermint Avatar asked Jan 20 '26 03:01

xpepermint


2 Answers

Hash functions in general

The primary idea of hash functions is to reduce the space, that is the functions are not injective as they map from a bigger domain to a smaller.

enter image description here

So they produce collisions. That is, there are different elements x and y that get mapped to the same hash value:

h(x) = h(y)

So basically you loose information of the given argument x.

However, in order to answer the question whether all values are contained you would need to keep all information (or at least all non-duplicates). This is obviously not possible for nearly all practical hash-functions.

Possible hash-functions would be identity function:

h(x) = x for all x

but this doesn't reduce the space, not practical.

A natural idea would be to compute hash values of the individual elements and then concatenate them, like

h(a, b, c) = (h(a), h(b), h(c))

But this again doesn't reduce the space, hash values are as long as the message, not practical.

Another possibility is to drop all duplicates, so given values [a, b, c, a, b] we only keep [a, b, c]. But this, in most examples, only reduces the space marginally, again not practical.

But no matter what you do, you can not reduce more than the amount of non-duplicates. Else you wouldn't be able to answer the question for some values. For example if we use [a, b, c, a] but only keep [a, b], we are unable to answer "was c contained" correctly.


Perfect hash functions

However, there is the field of perfect hash functions (Wikipedia). Those are hash-functions that are injective, they don't produce collisions.

In some areas they are of interest.

For those you may be able to answer that question, for example if computing the inverse is easy.


Cryptographic hash functions

If you talk about cryptographic hash functions, the answer is no.

Those need to have three properties (Wikipedia):

  • Pre-image resistance - Given h it should be difficult to find m : hash(m) = h
  • Second pre-image resistance - Given m it should be difficult to find m' : hash(m) = hash(m')
  • Collision resistance - It should be difficult to find (m, m') : hash(m) = hash(m')

Informally you have especially:

A small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value.

If you now would have such a hash value you would be able to easily reconstruct it by asking whether some values are contained. Using that you can easily construct collisions on purpose and stuff like that.

Details would however depend on the specific hash algorithm.

For a toy-example let's use the previous algorithm that simply removes all duplicates:

[a, b, c, a, a] -> [a, b, c]

In that case we find messages like

[a, b, c]
[a, b, c, a]
[a, a, b, b, c]
...

that all map to the same hash value.

like image 77
Zabuzard Avatar answered Jan 23 '26 20:01

Zabuzard


If the hash function produces collisions (as almost all hash function do) this cannot be possible. Think about it this way if for example h('abc') = x and h('abd') = x, how can you decide based on x if the original string contains 'd'?

You could arguably decide to use identity as a has function, which would do the job.

like image 21
SaiBot Avatar answered Jan 23 '26 19:01

SaiBot



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!