Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to take a list and generate all lists of increasing length?

Any easy question for the Mathematica experts here:

Given a list, say

Clear[a, b, c];
data = {a, b, c};

and I want to get back all lists of length 1,2,3,...Length[data] starting from the start to the end, so that I get the following for the above

out = {{a}, {a, b}, {a, b, c}}

I looked at the commands in M to find a ready one to use, and I could (looked at all the Map's and Nest* functions, but not that I can see how to use for this). I am sure it is there, but I am not seeing it now.

now I do this silly Do loop to build it

m=Length[data];
First@Reap[Do[Sow[data[[1;;i]]],{i,1,m}]][[2]]

{{a},{a,b},{a,b,c}}

question is: does Mathematica have a build-in command to do the above?

update 8 am

I've deleted the tests I've done an hr ago and will be reposting them again soon. I need to run them few times and take an average as that is the better way to do this performance test.

update 9 am

Ok, I've re-run the performance tests on all solutions shown below. 8 methods. For each method, I run it 5 times and took the mean. I did this for n={1000, 5000, 10000, 15000, 25000, 30000} where n is the length of the original list to process.

can't go much over 30,000, will run out of ram. I only have 4 GB ram.

I made a small function called makeTable[n, methods] which generate performance table for specific n. test code is below (written quickly so not the most clean code, not very functional, as I have to go :), but it is below and any one can change/clean it, etc... if they want

conclusion: Kguler method was the fastest, with Thies method almost the same for large n, (30,000), so for all practical purposes, may be Thies and Kguler methods can be declared as the winners for large n? But since Kguler is also fastest for small n, so far, he gets the clear edge.

Again, test code is below for any one to check and run to see if I might made an error somewhere. As correctly predicted by Leonid, the linked list method did not fare too well for large n.

I think more tests are needed, as only taking the mean of 5 might not be enough, also other considerations I might have missed. This is not an exact test, just a rough one to get an idea.

I tried not to use the pc much while running the tests. I used AbsoluteTiming[] to measure cpu.

Here is screen shot of the tables generated

enter image description here

Here is the test code:

methods = {nasser, wizard1, wizard2, wizard3, kguler, leonid1, 
   leonid2, thies};
AppendTo[$ContextPath, "Internal`"];
ClearAll[linkedList, leonid2];
SetAttributes[linkedList, HoldAllComplete];

nasser[lst_] := Module[{m = Length[lst]},
   First@Reap[Do[Sow[lst[[1 ;; i]]], {i, 1, m}]][[2]]
   ];

wizard1[lst_] := Module[{},
   Take[lst, #] & /@ Range@Length@lst
   ];

wizard2[lst_] := Module[{},
   Table[Take[#, i], {i, Length@#}] & @lst
   ];

wizard3[lst_] := Module[{},
   Rest@FoldList[Append, {}, #] & @lst
   ];

kguler[lst_] := Module[{},
   Reverse@NestList[Most, #, Length[#] - 1] & @lst

   ];

leonid1[lst_] := Module[{b = Bag[{}]},
   Map[(StuffBag[b, #]; BagPart[b, All]) &, lst]
   ];

leonid2[lst_] := Module[{},
   Map[List @@ Flatten[#, Infinity, linkedList] &, 
    FoldList[linkedList, linkedList[First@lst], Rest@lst]]
   ];

thies[lst_] := 
  Module[{}, 
   Drop[Reverse@
     FixedPointList[If[Length[#] > 0, Most, Identity][#] &, lst], 2]
   ];

makeTable[n_, methods_] := 
  Module[{nTests = Length[methods], nTries = 5, i, j, tests, lst},
   lst = Table[RandomReal[], {n}];

   tests = Table[0, {nTests}, {nTries}];

   For[i = 1, i <= nTests, i++,
    For[j = 1, j <= nTries, j++,
      tests[[i, j]] = First@AbsoluteTiming[methods[[i]][lst] ]
     ]
    ];

   tbl = Table[{ToString[methods[[i]]], Mean[ tests[[i, All]] ]}, {i, 
      nTests}] ;

   Grid[Join[{{"method", "cpu"}}, tbl],
    Frame -> All, FrameStyle -> Directive[Thickness[.005], Gray], 
    Spacings -> {0.5, 1}
    ]
   ];

Now to run, do

makeTable[1000, methods]

Warning, do not try something over 30,000 unless you have zillion GB, else M might not return. It happened to me, and had to reboot the PC.

update 12/26/11 3:30PM

I see that Thies has a newer version of this algorithm (I called it thies2 in the methods table), so I re-run everything again, here is the updated table, I removed the linked list version since it is known in advance not to be fast for large n, and this time, I run them each for 10 times (not 5 as above) and then took the mean). I also started M fresh using factory setting (restarted it holding alt-shift keys so that all setting are back to original settings just in case)

conclusion so far

Kugler is fastest for smaller n, i.e. n<20,000. For larger n, now Thies second version is faster than Thies version 1 and now it edges ahead ever so slightly ahead of Kugler method for large n. Congratulation to Thies, the current lead in this performance test. But for all practical purposes, I would say both Thies and Kugler methods are the fastest for large n, and Kugler remain the fastest for smaller n.

Below are tables and the updated test code below them. Any one is free to run the tests for themselves, just in case I might overlooked something.

enter image description here

The current test code:

$MinPrecision = $MachinePrecision;
$MaxPrecision = $MachinePrecision;
methods = {nasser, wizard1, wizard2, wizard3, kguler, leonid, thies1, 
   thies2};
AppendTo[$ContextPath, "Internal`"];

nasser[lst_] := Module[{m = Length[lst]},
   First@Reap[Do[Sow[lst[[1 ;; i]]], {i, 1, m}]][[2]]
   ];

wizard1[lst_] := Module[{},
   Take[lst, #] & /@ Range@Length@lst
   ];

wizard2[lst_] := Module[{},
   Table[Take[#, i], {i, Length@#}] & @lst
   ];

wizard3[lst_] := Module[{},
   Rest@FoldList[Append, {}, #] & @lst
   ];

kguler[lst_] := Module[{},
   Reverse@NestList[Most, #, Length[#] - 1] & @lst

   ];

leonid[lst_] := Module[{b = Bag[{}]},
   Map[(StuffBag[b, #]; BagPart[b, All]) &, lst]
   ];

thies1[lst_] := 
  Module[{}, 
   Drop[Reverse@
     FixedPointList[If[Length[#] > 0, Most, Identity][#] &, lst], 2]
   ];

thies2[lst_] := 
  Module[{}, 
   Drop[Reverse@
     FixedPointList[If[# =!= {}, Most, Identity][#] &, lst], 2]
   ];

makeTable[n_Integer, methods_List] := 
  Module[{nTests = Length[methods], nTries = 10, i, j, tests, lst},
   lst = Table[RandomReal[], {n}];

   tests = Table[0, {nTests}, {nTries}];

   For[i = 1, i <= nTests, i++,
    For[j = 1, j <= nTries, j++,
      tests[[i, j]] = First@AbsoluteTiming[methods[[i]][lst] ]
     ]
    ];

   tbl = Table[{ToString[methods[[i]]], Mean[ tests[[i, All]] ]}, {i, 
      nTests}] ;

   Grid[Join[{{"method", "cpu"}}, tbl],
    Frame -> All, FrameStyle -> Directive[Thickness[.005], Gray], 
    Spacings -> {0.5, 1}
    ]
   ];

To run type

n=1000
makeTable[n, methods]

Thanks for everyone for their answers, I learned from all of them.

like image 545
Nasser Avatar asked Dec 26 '11 04:12

Nasser


People also ask

How do I get a list of list sizes?

To get the length of a list in Python, you can use the built-in len() function. Apart from the len() function, you can also use a for loop and the length_hint() function to get the length of a list. In this article, I will show you how to get the length of a list in 3 different ways.

How do you increase the length of a list in Python?

Python list extend() is an inbuilt function that adds the specified list elements to the end of the current list. The extend() extends the list by adding all items of the list to an end. For example: list1 = [1,2,3,4] , list2 = [5,6,7]; print(list2. extend(list1)).

How do I create an n list size?

To create a list of n placeholder elements, multiply the list of a single placeholder element with n . For example, use [None] * 5 to create a list [None, None, None, None, None] with five elements None . You can then overwrite some elements with index assignments.

How do you count the length of a string in a list?

You can use the len() to get the length of the given string, array, list, tuple, dictionary, etc.


4 Answers

You can use

f = Reverse@NestList[Most, #, Length[#] - 1] &

f@{a,b,c,d,e} gives {{a}, {a, b}, {a, b, c}, {a, b, c, d}, {a, b, c, d, e}}.

An alternative using ReplaceList -- much slower than f, but ... why not?:

g = ReplaceList[#, {x__, ___} -> {x}] &
like image 121
kglr Avatar answered Dec 10 '22 15:12

kglr


I propose this:

runs[lst_] := Take[lst, #] & /@ Range@Length@lst

Or this:

runs2 = Table[Take[#, i], {i, Length@#}] &;

kguler's answer inspired me to write this:

Rest@FoldList[Append, {}, #] &

But this is slower than his method because of Mathematica's slow appends.

like image 42
Mr.Wizard Avatar answered Dec 10 '22 17:12

Mr.Wizard


Another idea:

Inits[l_] := Drop[Reverse@FixedPointList[
               If[Length[#] > 0, Most, Identity][#] &,
               l
             ], 2];

Update:

This version is a bit faster by omitting computing the length each time:

Inits2[l_] := Drop[Reverse@FixedPointList[
                If[# =!= {}, Most, Identity][#] &,
                l
              ], 2];
like image 32
Thies Heidecke Avatar answered Dec 10 '22 16:12

Thies Heidecke


Here is another method which is roughly as efficient as the one involving Take, but uses the Internal`Bag functionality:

AppendTo[$ContextPath, "Internal`"];
runsB[lst_] :=
   Module[{b = Bag[{}]}, Map[(StuffBag[b, #]; BagPart[b, All]) &, lst]];

I don't claim that it is simpler than the one based on Take, but it seems to be a simple example of Internal`Bag at work - since this is exactly the type of problem for which these can be successfully used (and there might be cases where lists of explicit positions would either not be available or expensive to compute).

Just to compare, the solution based on linked lists:

ClearAll[linkedList, runsLL];
SetAttributes[linkedList, HoldAllComplete];
runsLL[lst_] :=
  Map[List @@ Flatten[#, Infinity, linkedList] &,
    FoldList[linkedList, linkedList[First@lst], Rest@lst]]

will be an order of magnitude slower on large lists.

like image 21
Leonid Shifrin Avatar answered Dec 10 '22 15:12

Leonid Shifrin