
Python "set" with duplicate/repeated elements

Is there a standard way to represent a "set" that can contain duplicate elements?

As I understand it, a set contains each element either zero times or exactly once. I want a collection that can hold any number of each element.

I am currently using a dictionary with elements as keys, and quantity as values, but this seems wrong for many reasons.

Motivation: I believe there are many applications for such a collection. For example, a survey of favourite colours could be represented by: survey = ['blue', 'red', 'blue', 'green']

Here, I do not care about the order, but I do care about the quantities. I want to do things like:

survey.add('blue') # would give survey == ['blue', 'red', 'blue', 'green', 'blue'] 

...and maybe even

survey.remove('blue') # would give survey == ['blue', 'red', 'green'] 

Notes: Yes, set is not the correct term for this kind of collection. Is there a more correct one?

A list of course would work, but the collection required is unordered. Not to mention that the method naming for sets seems to me to be more appropriate.

cammil asked Apr 16 '12 14:04





1 Answer

You are looking for a multiset.

Python's closest datatype is collections.Counter:

A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.
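As a rough sketch of how the survey from the question might look with Counter: the variable names (other, combined, common, same) are only illustrative, and "removing" an element here is done with subtract() followed by unary plus, since subtract() on its own can leave zero or negative counts behind.

```python
from collections import Counter

# Build the multiset from the survey responses in the question.
survey = Counter(['blue', 'red', 'blue', 'green'])

# "Add" one occurrence: update() takes an iterable of elements.
survey.update(['blue'])

# "Remove" one occurrence: subtract() lowers the count, then unary plus
# drops any entries whose count is zero or negative.
survey.subtract(['blue'])
survey = +survey

# Order is irrelevant when comparing two Counters.
same = survey == Counter(['green', 'blue', 'red', 'blue'])

# Multiset-style arithmetic: + adds counts, & takes the minimum count.
other = Counter(['blue', 'yellow'])
combined = survey + other
common = survey & other

# elements() expands the counts back into individual items.
print(sorted(survey.elements()))
```

Because Counter is a dict subclass, survey['blue'] reads a count directly, and a missing element simply reports a count of 0 rather than raising KeyError.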

For an actual implementation of a multiset, use the bag class from the data-structures package on PyPI. Note that this is for Python 3 only. If you need Python 2, here is a recipe for a bag written for Python 2.4.
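If a third-party package is not an option, a thin wrapper around Counter can provide the set-style add/remove naming the question asks for. This is only a minimal sketch: the Bag class and its methods below are hypothetical names made up for illustration, not the API of the data-structures package or of the Python 2.4 recipe.

```python
from collections import Counter


class Bag:
    """Hypothetical minimal multiset with set-style method names."""

    def __init__(self, iterable=()):
        self._counts = Counter(iterable)

    def add(self, item):
        """Add one occurrence of item."""
        self._counts[item] += 1

    def remove(self, item):
        """Remove one occurrence; raise KeyError if absent, like set.remove."""
        # Counter returns 0 for a missing key without inserting it.
        if self._counts[item] <= 0:
            raise KeyError(item)
        self._counts[item] -= 1
        if self._counts[item] == 0:
            del self._counts[item]

    def count(self, item):
        """Return how many times item occurs (0 if absent)."""
        return self._counts[item]

    def __len__(self):
        # Total number of occurrences, not number of distinct items.
        return sum(self._counts.values())

    def __eq__(self, other):
        if not isinstance(other, Bag):
            return NotImplemented
        return self._counts == other._counts


survey = Bag(['blue', 'red', 'blue', 'green'])
survey.add('blue')     # three occurrences of 'blue'
survey.remove('blue')  # back to two
```

Keeping the counts in a Counter means equality ignores order, which matches the unordered behaviour the question wants.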

Steven Rumbalski answered Oct 13 '22 00:10
