For any two sequences a, b, where a = [a1,a2,...,an] and b = [b1,b2,...,bn] (0<=ai, bi<=m), I want to find an integer function f that f(a) = f(b) if and only if a, b have the same elements, without concerning about their orders.For example, if a = [1,1,2,3], b = [2,1,3,1], c = [3,2,1,3], then f(a) = f(b), f(a) ≠ f(b).
I know there is a naive algorithm which first sort the sequence and then map it to an integer. For example, after sorting, we have a = [1,1,2,3], b = [1,1,2,3], c = [1,2,3,3], and suppose that m = 9, using decimal conversion, we will finally have f(a) = f(b) = 1123 ≠ f(c) = 1233. But this will take O(nlog(n)) time using some kind of sorting algorithm (Don't use non comparison sorting algorithms).
Is there a better approach ? Something like hash? An O(n) algorithm?
Note that I also need the function easy to be inversed, which means that we can map an integer back to a sequence (or a set, more concisely).
Update: Forgive my poor description. Here both m, n can be very large (1 million or larger). And I also want the upper bound of f to be quite small, preferably O(m^n).
This works for sufficiently small values of m, and sufficiently small array sizes:
#include <stdio.h>
unsigned primes [] = { 2,3,5,7,11,13,17, 19, 23, 29};
unsigned value(unsigned array[], unsigned count);
int main(void)
{
unsigned one[] = { 1,2,2,3,5};
unsigned two[] = { 2,3,1,5,2};
unsigned val1, val2;
val1 = value(one, 5);
val2 = value(two, 5);
fprintf(stdout, "Val1=%u, Val2=%u\n", val1, val2 );
return 0;
}
unsigned value(unsigned array[], unsigned count)
{
unsigned val, idx;
val = 1;
for (idx = 0; idx < count; idx++) {
val *= primes [ array[idx]];
}
return val;
}
For an explanation, see my description here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With