I have, in Wolfram Mathematica 8.0, a nested list like
nList = {{a,b},{f,g},{n,o}}
and a normal list like
lList = {a,b,c,k,m,n,o,z}
and i want to check if all the sublists in nList are in lList (in the example a,b and n,o are there but not f,g)
I've done it using For[,,,]
and using index... can someone enlighten me in using functions like Map/Thread/Select to do it in one pass.
Edit: If nList
contains a,b
, lList
must contain a,b
and not a,c,b
or b,a
or b,c,a
Assuming that you don't care about element ordering, here is one way:
In[20]:= Complement[Flatten[nList],lList] ==={}
Out[20]= False
EDIT
If the order matters, then here is one way:
In[29]:= And@@(MatchQ[lList,{___,PatternSequence[##],___}]&@@@nList)
Out[29]= False
For large number of sub-lists, this may be faster:
In[34]:=
Union[ReplaceList[lList,
{___,x:Alternatives@@(PatternSequence@@@nList),___}:>{x}]]===Union[nList]
Out[34]= False
This works as follows: ReplaceList is a very nice but often ignored command which returns a list of all possible expressions which could be obtained with the pattern-matcher trying to apply the rules in all possible ways to an expression. This is in contrast with the way the pattern-matcher usually works, where it stops upon the first successful match. The PatternSequence is a relatively new addition to the Mathematica pattern language, which allows us to give an identity to a given sequence of expression, treating it as a pattern. This allowed us to construct the alternative pattern, so the resulting pattern is saying: the sequence of any sublist in any place in the main list is collected and put back to list braces, forming back the sublist. We get as many newly formed sublists as there are sequences of the original sublists in the larger list. If all sublists are present, then Union
on the resulting list should be the same as Union
of the original sublist list.
Here are the benchmarks (I took a list of integers, and overlapping sublists generated by Partition
):
In[39]:= tst = Range[1000];
In[41]:= sub = Partition[tst, 2, 1];
In[43]:=
And @@ (MatchQ[tst, {___, PatternSequence[##], ___}] & @@@ sub) // Timing
Out[43]= {3.094, True}
In[45]:=
Union[ReplaceList[tst, {___,x : Alternatives @@ (PatternSequence @@@ sub), ___}
:> {x}]] === Union[sub] // Timing
Out[45]= {0.11, True}
Conceptually, the reason why the second method is faster is that it does its work in the single run through the list (performed internally by ReplaceList
), while the first solution explicitly iterates through the big list for each sub-list.
EDIT 2 - Performance
If performance is really an issue, then the following code is yet much faster:
And @@ (With[{part = Partition[lList, Length[#[[1]]], 1]},
Complement[#, part] === {}] & /@SplitBy[SortBy[nList, Length], Length])
For example, on our benchmarks:
In[54]:= And@@(With[{part = Partition[tst,Length[#[[1]]],1]},
Complement[#,part]==={}]&/@SplitBy[SortBy[sub,Length],Length])//Timing
Out[54]= {0.,True}
EDIT 3
Per suggestion of @Mr.Wizard, the following performance improvement can be made:
Scan[
If[With[{part = Partition[lList, Length[#[[1]]], 1]},
Complement[#, part] =!= {}], Return[False]] &,
SplitBy[SortBy[nList, Length], Length]
] === Null
Here, the as soon as we get a negative answer from sub-lists of a given length, sublists of other lengths will not be checked, since we already know that the answer is negative (False
). If Scan
completes without Return
, it will return Null
, which will mean that lList
contains all of the sublists in nList
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With