I am a bit at a loss how to do the following efficiently in Mathematica:
a = { 1, 2, 3, 4, 5 }; (* list of integers *)
b = { 2, 4, 6, 8 }; (* another list of integers *)
filter = Table[MemberQ[b, element], {element,a}]
Expected output is:
{False, True, False, True, False}
My lists a
and b
are big, so Mathematica is doing a kazillion linear searches through b
. I want it to do faster lookups with a hashtable. But there seems to be no such structure. The closest I could find is a SparseArray, but
sa = SparseArray[{1 -> True, 2 -> True}];
MemberQ[sa, 1]
is False
.
I'm sure this must be possible in Mathematica in one line of code or less, I just can't see it for the trees, or something.
Any hero to the rescue? Meanwhile, I'm going to do this with C#.
The following question is equivalent to yours and contains an answer in Mathematica:
https://stackoverflow.com/questions/1954636/given-list-subset-of-list-return-bitmask
Here's that answer (which I'll feel free to steal since it's actually mine):
bitmask[a_, b_] := Module[{f},
f[_] = False;
(f[#] = True)& /@ b;
f /@ a]
The idea is to make a hash table, f, that can answer in constant time whether a given element is a member of list b. First f is given a default value of False (if we haven't seen it before it's not in b). Then a single pass through the list b is made, setting f[i] to True for each i in b. That's all the set up. Now a single pass of the hash function f over the list a gives us the answer. Total time is O(n+m) -- one pass through each list.
Another idea would be to use rules and Dispatch
. It seems to be faster when the length of the blist is large:
alist = Prime /@ Range[20000];
blist = alist + 2;
membQ = First@Timing[MemberQ[blist,#]&/@alist;]
sparse = First@Timing[
s = SparseArray[blist -> ConstantArray[True, Length@blist], Max[alist, blist], False];
s[[#]] & /@ alist;
]
rules = First@Timing[
bRules = Dispatch[Append[Rule[#, True] & /@ blist, _ -> False]];
(# /. bRules) & /@ alist;
]
fct = First@Timing[
f[_] = False;
(f[#] = True) & /@ blist;
f /@ alist;
]
On my laptop, rules (0.06s) < fct (0.09s) < sparse (0.9s) < membQ (40s).
Edit / Comparing fct and rules timings
@Karsten please feel free to rollback to your original answer
Tables generated with Prime[...]
Tables generated with RandomInteger[...]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With