In Mathematica I have a list:
x = {1,2,3,3,4,5,5,6}
How will I make a list with the duplicates? Like:
{3,5}
I have been looking at Lists as Sets, if there is something like Except[] for lists, so I could do:
unique = Union[x]
duplicates = MyExcept[x,unique]
(Of course, if the x would have more than two duplicates - say, {1,2,2,2,3,4,4}, there the output would be {2,2,4}, but additional Union[] would solve this.)
But there wasn't anything like that (if I did understand all the functions there well).
So, how to do that?
Here are several faster variations of the Tally method.
f4
uses "tricks" given by Carl Woll and Oliver Ruebenkoenig on MathGroup.
f2 = Tally@# /. {{_, 1} :> Sequence[], {a_, _} :> a} &;
f3 = Pick[#, Unitize[#2 - 1], 1] & @@ Transpose@Tally@# &;
f4 = # ~Extract~ SparseArray[Unitize[#2 - 1]]["NonzeroPositions"] & @@ Transpose@Tally@# &;
Speed comparison (f1
included for reference)
a = RandomInteger[100000, 25000];
f1 = Part[Select[Tally@#, Part[#, 2] > 1 &], All, 1] &;
First@Timing@Do[#@a, {50}] & /@ {f1, f2, f3, f4, Tally}
SameQ @@ (#@a &) /@ {f1, f2, f3, f4}
Out[]= {3.188, 1.296, 0.719, 0.375, 0.36}
Out[]= True
It is amazing to me that f4
has almost no overhead relative to a pure Tally
!
Lots of ways to do list extraction like this; here's the first thing that came to my mind:
Part[Select[Tally@x, Part[#, 2] > 1 &], All, 1]
Or, more readably in pieces:
Tally@x Select[%, Part[#, 2] > 1 &] Part[%, All, 1]
which gives, respectively,
{{1, 1}, {2, 1}, {3, 2}, {4, 1}, {5, 2}, {6, 1}} {{3, 2}, {5, 2}} {3, 5}
Perhaps you can think of a more efficient (in time or code space) way :)
By the way, if the list is unsorted then you need run Sort
on it first before this will work.
Here's a way to do it in a single pass through the list:
collectDups[l_] := Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]
For example:
collectDups[{1, 1, 6, 1, 3, 4, 4, 5, 4, 4, 2, 2}] --> {1, 1, 4, 4, 4, 2}
If you want the list of unique duplicates -- {1, 4, 2}
-- then wrap the above in DeleteDuplicates
, which is another single pass through the list (Union
is less efficient as it also sorts the result).
collectDups[l_] :=
DeleteDuplicates@Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]
Will Robertson's solution is probably better just because it's more straightforward, but I think if you wanted to eek out more speed, this should win. But if you cared about that, you wouldn't be programming in Mathematica! :)
Using a solution like dreeves, but only returning a single instance of each duplicated element, is a bit on the tricky side. One way of doing it is as follows:
collectDups1[l_] :=
Module[{i, j},
i[n_] := (i[n] := j[n]; Unevaluated@Sequence[]);
j[n_] := (j[n] = Unevaluated@Sequence[]; n);
i /@ l];
This doesn't precisely match the output produced by Will Robertson's (IMO superior) solution, because elements will appear in the returned list in the order that it can be determined that they're duplicates. I'm not sure if it really can be done in a single pass, all the ways I can think of involve, in effect, at least two passes, although one might only be over the duplicated elements.
Here is a version of Robertson's answer that uses 100% "postfix notation" for function calls.
identifyDuplicates[list_List, test_:SameQ] :=
list //
Tally[#, test] & //
Select[#, #[[2]] > 1 &] & //
Map[#[[1]] &, #] &
Mathematica's //
is similar to the dot for method calls in other languages. For instance, if this were written in C# / LINQ style, it would resemble
list.Tally(test).Where(x => x[2] > 1).Select(x => x[1])
Note that C#'s Where
is like MMA's Select
, and C#'s Select
is like MMA's Map
.
EDIT: added optional test function argument, defaulting to SameQ
.
EDIT: here is a version that addresses my comment below & reports all the equivalents in a group given a projector function that produces a value such that elements of the list are considered equivalent if the value is equal. This essentially finds equivalence classes longer than a given size:
reportDuplicateClusters[list_List, projector_: (# &),
minimumClusterSize_: 2] :=
GatherBy[list, projector] //
Select[#, Length@# >= minimumClusterSize &] &
Here is a sample that checks pairs of integers on their first elements, considering two pairs equivalent if their first elements are equal
reportDuplicateClusters[RandomInteger[10, {10, 2}], #[[1]] &]
This thread seems old, but I've had to solve this myself.
This is kind of crude, but does this do it?
Union[Select[Table[If[tt[[n]] == tt[[n + 1]], tt[[n]], ""], {n, Length[tt] - 1}], IntegerQ]]
Given a list A,
get the non-duplicate values in B
B = DeleteDuplicates[A]
get the duplicate values in C
C = Complement[A,B]
get the non-duplicate values from the duplicate list in D
D = DeleteDuplicates[C]
So for your example:
A = 1, 2, 2, 2, 3, 4, 4
B = 1, 2, 3, 4
C = 2, 2, 4
D = 2, 4
so your answer would be DeleteDuplicates[Complement[x,DeleteDuplicates[x]]] where x is your list. I don't know mathematica, so the syntax may or may not be perfect here. Just going by the docs on the page you linked to.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With