I have a list of tuples, each containing one string and two integers. The list looks like this:
x = [('a',1,2), ('b',3,4), ('x',5,6), ('a',2,1)]
The list contains thousands of such tuples. To get the unique combinations, I can apply frozenset to each tuple in the list as follows:
y = set(map(frozenset, x))
This gives me the following result:
{frozenset({'a', 2, 1}), frozenset({'x', 5, 6}), frozenset({3, 'b', 4})}
I know that a set is an unordered data structure, so this is expected, but I want to preserve the order of the elements so that I can then insert them into a pandas DataFrame. The DataFrame should look like this:
  Name  Marks1  Marks2
0    a       1       2
1    b       3       4
2    x       5       6
A frozenset is an immutable counterpart of Python's set: elements can be added to or removed from a set at any time, but the contents of a frozenset are fixed once it is created. The frozenset() constructor accepts any iterable and returns an immutable set built from its elements. Because a frozenset is immutable and hashable, it can be used as a dictionary key or as an element of another set.
Like a set, a frozenset is unordered and unindexed, so the order of its elements is not guaranteed to be preserved. To get a list from a frozenset, pass it to list(); the order of the resulting list is likewise arbitrary.
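The properties described above can be checked with a short sketch. The variable names are illustrative only and are not part of the question:

```python
# Two tuples holding the same values in a different order
# produce equal frozensets with equal hashes.
a = frozenset(('a', 1, 2))
b = frozenset(('a', 2, 1))
print(a == b)              # True
print(hash(a) == hash(b))  # True

# Being hashable, a frozenset can serve as a dictionary key.
# (lookup is a hypothetical name used only for this illustration)
lookup = {a: ('a', 1, 2)}
print(lookup[b])           # ('a', 1, 2)

# A plain set is mutable and therefore cannot be used as a key.
try:
    {set(('a', 1, 2)): None}
except TypeError as exc:
    print("set cannot be a key:", exc)
```

This is why a frozenset works as a "have I seen this combination?" key even though it cannot carry the original order of the values.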
Instead of operating on the set of frozensets directly, you could use it only as a helper data structure, as in the unique_everseen recipe from the itertools documentation (copied verbatim):
from itertools import filterfalse

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Basically, this solves the issue when you use key=frozenset:
>>> x = [('a',1,2), ('b',3,4), ('x',5,6), ('a',2,1)]
>>> list(unique_everseen(x, key=frozenset))
[('a', 1, 2), ('b', 3, 4), ('x', 5, 6)]
This returns the elements as-is and it also maintains the relative order between the elements.
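As an alternative sketch (not part of the recipe above), the same first-occurrence behaviour can be obtained with a plain dictionary keyed by frozenset. This assumes Python 3.7 or later, where dictionaries preserve insertion order; the names first_by_key and unique_rows are hypothetical:

```python
x = [('a', 1, 2), ('b', 3, 4), ('x', 5, 6), ('a', 2, 1)]

# Map each order-insensitive key to the first tuple that produced it.
# setdefault only stores a value when the key is not already present,
# so later duplicates such as ('a', 2, 1) are ignored.
first_by_key = {}
for row in x:
    first_by_key.setdefault(frozenset(row), row)

# Dict insertion order (Python 3.7+) keeps the rows in their original order.
unique_rows = list(first_by_key.values())
print(unique_rows)
# [('a', 1, 2), ('b', 3, 4), ('x', 5, 6)]
```

Either way, the result is an ordered list of tuples, which can then be passed to pandas.DataFrame together with columns=['Name', 'Marks1', 'Marks2'] to build the table from the question.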
Frozensets have no ordering. Instead, you can build a sorted tuple from each item to use as the membership key, appending the original tuple to the result if its key is not yet in the set:
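y = set()   # canonical keys seen so far
lst = []    # unique tuples, in their original order
for i in x:
    # key=str allows the mixed str/int values to be sorted together
    t = tuple(sorted(i, key=str))
    if t not in y:
        y.add(t)
        lst.append(i)
print(lst)
# [('a', 1, 2), ('b', 3, 4), ('x', 5, 6)]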
The first occurrence of each combination is preserved.
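The key=str argument in the loop above matters because Python 3 cannot compare a str with an int directly. A minimal sketch of why, using a hypothetical helper named canonical:

```python
row = ('a', 2, 1)

# Sorting mixed str/int values without a key raises TypeError in Python 3.
try:
    sorted(row)
except TypeError as exc:
    print("cannot sort mixed types:", exc)

def canonical(row):
    """Return an order-insensitive key by sorting on each value's string form."""
    return tuple(sorted(row, key=str))

print(canonical(('a', 1, 2)))  # (1, 2, 'a')
print(canonical(('a', 2, 1)))  # (1, 2, 'a')
```

Sorting by string form orders integers lexicographically (so 10 sorts before 2), but that is harmless here: the key only has to be the same for every permutation of the same values, not numerically sorted.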