Aggregating Tally counters

I often count occurrences with Tally[] and then, after discarding the original list, need to merge the counts from another list into that existing tally.

This typically happens when I am counting configurations, occurrences, doing some discrete statistics, etc.

So I defined a very simple but handy function for Tally aggregation:

aggTally[listUnTallied__List:{}, 
         listUnTallied1_List,
         listTallied_List] := 
 Join[Tally@Join[listUnTallied, listUnTallied1], listTallied] //. 
     {a___, {x_, p_}, b___, {x_, q_}, c___} -> {a, {x, p + q}, b, c};

Such that

l = {x, y, z}; lt = Tally@l;
n = {x};
m = {x, y, t};

aggTally[n, {}]
  {{x, 1}}

aggTally[m, n, {}]
  {{x, 2}, {y, 1}, {t, 1}}

aggTally[m, n, lt]
  {{x, 3}, {y, 2}, {t, 1}, {z, 1}}
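
To make the merging step concrete, here is a small sketch that applies the replacement rule from aggTally on its own. The name mergeRule and the sample list are only for illustration. Each application of the rule combines a single pair of entries that share a key, so //. has to rescan the list after every merge.

```mathematica
(* The rule used inside aggTally, given a hypothetical name for this sketch *)
mergeRule = {a___, {x_, p_}, b___, {x_, q_}, c___} -> {a, {x, p + q}, b, c};

(* A joined list in which x and y each appear twice *)
joined = {{x, 2}, {y, 1}, {x, 1}, {y, 1}};

(* //. reapplies the rule until the list stops changing *)
merged = joined //. mergeRule
(* {{x, 3}, {y, 2}} *)
```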

This function has two problems:

1) Performance

Timing[Fold[aggTally[Range@#2, #1] &, {}, Range[100]];]
  {23.656, Null}
(* roughly equivalent loop; note that it runs j = 1..99, one step fewer than the Fold *)
Timing[s = {}; j = 1; While[j < 100, s = aggTally[Range@j, s]; j++]]
  {23.047, Null}

2) It does not validate that the last argument is a properly tallied list or an empty list (less important for me, though)

Is there a simple, elegant, faster and more effective solution? (I understand that these are too many requirements, but wishing is free)

Dr. belisarius asked Feb 28 '11 14:02


1 Answer

Perhaps this will suit your needs?

aggTallyAlt[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
{#[[1, 1]], Total@#[[All, 2]]} & /@ 
       GatherBy[Join[Tally@Join[listUnTallied, listUnTallied1], listTallied], First]

The timings are much better, and the pattern on the last argument checks that it is a list of {item, integer count} pairs.
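
As a quick illustration of that check, the sketch below repeats the definition so that it runs on its own and then calls it with a valid and an invalid last argument. The sample inputs and the names good and bad are made up for the example.

```mathematica
(* aggTallyAlt as defined above, repeated so the sketch is self-contained *)
aggTallyAlt[listUnTallied__List : {}, listUnTallied1_List, 
   listTallied : {{_, _Integer} ...}] := 
  {#[[1, 1]], Total@#[[All, 2]]} & /@ 
   GatherBy[Join[Tally@Join[listUnTallied, listUnTallied1], listTallied], First]

(* A well-formed tally as the last argument: the definition applies *)
good = aggTallyAlt[{x, y, t}, {x}, {{x, 1}, {y, 1}, {z, 1}}]
(* {{x, 3}, {y, 2}, {t, 1}, {z, 1}} *)

(* A non-integer count does not match the pattern, so the call stays unevaluated *)
bad = aggTallyAlt[{x, y, t}, {x}, {{x, 1.5}}];
Head[bad] === aggTallyAlt
(* True *)
```

Because a mismatched call simply stays unevaluated, a malformed tally shows up as a leftover aggTallyAlt[...] expression rather than as a wrong count.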

EDIT:

Here is a faster version:

aggTallyAlt1[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
Transpose[{#[[All, 1, 1]], Total[#[[All, All, 2]], {2}]}] &@
   GatherBy[Join[Tally@Join[listUnTallied, listUnTallied1], listTallied], First]
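
To show where the speed-up comes from, here is a step-by-step sketch of what happens after GatherBy, using a small made-up input. The names joined, gathered, keys and counts are only for illustration. Instead of mapping a function over every group, the items and the counts are each extracted with a single Part call over all groups at once.

```mathematica
(* The joined list that reaches GatherBy for the inputs {x, y, t}, {x} and {{x, 1}, {y, 1}, {z, 1}} *)
joined = {{x, 2}, {y, 1}, {t, 1}, {x, 1}, {y, 1}, {z, 1}};

(* Group the {item, count} pairs by item *)
gathered = GatherBy[joined, First]
(* {{{x, 2}, {x, 1}}, {{y, 1}, {y, 1}}, {{t, 1}}, {{z, 1}}} *)

(* One Part call pulls out the item of every group ... *)
keys = gathered[[All, 1, 1]]
(* {x, y, t, z} *)

(* ... and one Total call sums the counts within each group *)
counts = Total[gathered[[All, All, 2]], {2}]
(* {3, 2, 1, 1} *)

Transpose[{keys, counts}]
(* {{x, 3}, {y, 2}, {t, 1}, {z, 1}} *)
```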

The timings for it:

In[39]:= Timing[Fold[aggTallyAlt1[Range@#2, #1] &, {}, Range[100]];]

Out[39]= {0.015, Null}

In[40]:= Timing[s = {}; j = 1; While[j < 100, s = aggTallyAlt1[Range@j, s]; j++]]

Out[40]= {0.016, Null}

Leonid Shifrin answered Nov 10 '22 02:11