I'm not sure if this is actually possible thus I ask here. Does anyone knows of an algorithm that would allow something like this?
const values = ['a', 'b', 'c', 'd'];
const hash = createHash(values); // => xjaks14sdffdghj23h4kjhgd9f81nkjrsdfg9aiojd
hash.includes('b'); // => true
hash.includes('v'); // => false
What this snippet does, is it first creates some sort of hash from a list of values, then checks if the certain value belongs to that hash.
The primary idea of hash functions is to reduce the space, that is the functions are not injective as they map from a bigger domain to a smaller.

So they produce collisions. That is, there are different elements x and y that get mapped to the same hash value:
h(x) = h(y)
So basically you loose information of the given argument x.
However, in order to answer the question whether all values are contained you would need to keep all information (or at least all non-duplicates). This is obviously not possible for nearly all practical hash-functions.
Possible hash-functions would be identity function:
h(x) = x for all x
but this doesn't reduce the space, not practical.
A natural idea would be to compute hash values of the individual elements and then concatenate them, like
h(a, b, c) = (h(a), h(b), h(c))
But this again doesn't reduce the space, hash values are as long as the message, not practical.
Another possibility is to drop all duplicates, so given values [a, b, c, a, b] we only keep [a, b, c]. But this, in most examples, only reduces the space marginally, again not practical.
But no matter what you do, you can not reduce more than the amount of non-duplicates. Else you wouldn't be able to answer the question for some values. For example if we use [a, b, c, a] but only keep [a, b], we are unable to answer "was c contained" correctly.
However, there is the field of perfect hash functions (Wikipedia). Those are hash-functions that are injective, they don't produce collisions.
In some areas they are of interest.
For those you may be able to answer that question, for example if computing the inverse is easy.
If you talk about cryptographic hash functions, the answer is no.
Those need to have three properties (Wikipedia):
h it should be difficult to find m : hash(m) = hm it should be difficult to find m' : hash(m) = hash(m')(m, m') : hash(m) = hash(m')Informally you have especially:
A small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value.
If you now would have such a hash value you would be able to easily reconstruct it by asking whether some values are contained. Using that you can easily construct collisions on purpose and stuff like that.
Details would however depend on the specific hash algorithm.
For a toy-example let's use the previous algorithm that simply removes all duplicates:
[a, b, c, a, a] -> [a, b, c]
In that case we find messages like
[a, b, c]
[a, b, c, a]
[a, a, b, b, c]
...
that all map to the same hash value.
If the hash function produces collisions (as almost all hash function do) this cannot be possible.
Think about it this way if for example h('abc') = x and h('abd') = x, how can you decide based on x if the original string contains 'd'?
You could arguably decide to use identity as a has function, which would do the job.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With