Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DeleteDuplicates while retaining sublist structure

I want to DeleteDuplicates from a list of lists keeping the sublist structure intact.

E.g.

{{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}

becomes

{{1, 7, 8}, {4, 3}, {9}, {2}}

This is the extent of nesting, and empty sublists are ok.

like image 398
stackexchange.michael Avatar asked Feb 16 '12 21:02

stackexchange.michael


3 Answers

This is a classic trick:

list = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}

Module[{f},
 f[x_] := (f[x] = Sequence[]; x);
 Map[f, list, {2}]
]

To understand how it works, consider what happens when f[1] is evaluated for the first time. f[1] will evaluate to (f[1]=Sequence[]); 1, then to just 1. So now a definitions has been made for f[1]. The next time f[1] is evaluated, it will simply evaluate to Sequence[].

So, the first time f is evaluated for an argument, it returns that argument. The second (or third, etc.) time it is evaluated, it returns Sequence[]. Sequence[] has the property that it will get stripped out of expressions completely, i.e. {1, Sequence[], 3} will evaluate to {1, 3}.

To summarize, what the function does is: the first time it sees an element, it replaces the element with itself (leaves it alone). The second time it sees the same element, it removes it. I mapped this function at "level 2", so it will act only on the elements of sublists.

like image 126
Szabolcs Avatar answered Nov 03 '22 18:11

Szabolcs


You can use Complement for this and the only thing you have to do is, you have to store all elements you have already seen. At the beginning the list of all seen elements a is empty and then you add step by step all numbers which are in the sublists. This cumulated list is used in every step to delete all numbers which have already been seen:

DeleteDuplicatesList[l_List] :=
  Block[{a = {}},
   Table[With[{b = a},
     a = Union[a, c];
     Complement[c, b]], {c, l}]
   ];

l = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}};
DeleteDuplicatesList[l]
like image 36
halirutan Avatar answered Nov 03 '22 18:11

halirutan


It's not elegant but it gets the job done:

t = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}; 

Delete[t, Flatten[Rest[Position[t, #]] & /@ (DeleteDuplicates[Flatten[t]]), 1]]

DeleteDuplicates...will return the set, not multi-set, of numbers in t. Rest[Position[...will return the positions of duplicates. Delete removes said duplicates.

like image 29
DavidC Avatar answered Nov 03 '22 18:11

DavidC