Possible Duplicate:
Python: Retrieve items from a set
Consider the following code:
>>> item1 = (1,)
>>> item2 = (2,)
>>> s = set([item1, item2])
>>> s
set([(2,), (1,)])
>>> new_item = (1,)
>>> new_item in s
True
>>> new_item == item1
True
>>> new_item is item1
False
So new_item
is in s
because it is equivalent to one of its items, but it is a different object.
What I want is to get item1
from s
given new_item
is in s
.
One solution I have come up with is straightforward but not very efficient:
def get_item(s, new_item):
for item in s:
if item == new_item:
return item
>>> get_item(s, new_item) is new_item
False
>>> get_item(s, new_item) is item1
True
Another solution seems more efficient but actually does not work:
def get_item_using_intersection1(s, new_item):
return set([new_item]).intersection(s).pop()
Nor this one:
def get_item_using_intersection2(s, new_item):
return s.intersection(set([new_item])).pop()
Because intersection works in an undefined way:
>>> get_item_using_intersection1(s, new_item) is new_item
True
>>> get_item_using_intersection1(s, new_item) is item1
False
>>> get_item_using_intersection2(s, new_item) is new_item
True
>>> get_item_using_intersection2(s, new_item) is item1
False
If this matters, I am using Python 2.7 x64 on Windows 7, but I need a cross-platform solution.
Thanks to everyone. I came up with the following temporary solution:
class SearchableSet(set):
def find(self, item):
for e in self:
if e == item:
return e
which will be replaced in future with the following solution (which is very incomplete right now):
class SearchableSet(object):
def __init__(self, iterable=None):
self.__data = {}
if iterable is not None:
for e in iterable:
self.__data[e] = e
def __iter__(self):
return iter(self.__data)
def __len__(self):
return len(self.__data)
def __sub__(self, other):
return SearchableSet(set(self).__sub__(set(other)))
def add(self, item):
if not item in self:
self.__data[item] = item
def find(self, item):
return self.__data.get(item)
A Set is a Collection that cannot contain duplicate elements.
Don't use a set
, then. Just use a dict
that maps some value to itself. In your case, it maps:
d[item1] = item1
d[item2] = item2
So anything that's equal to item1
will be found in d
, but the value is item1
itself. And it's much better than linear time ;-)
P.S. I hope I understood the intention of your question correctly. If not, please clarify it.
If you absolutely need the O(1) lookup and object identity (not just equality) and fast set operations (without having to create new sets each time you want to do set operations), then one fairly straightforward approach is to use both a dict
and a set
. You would have to maintain both structures to keep them in sync, but this would allow you to keep O(1) access (just with a bigger constant factor). (And maybe this is what you are heading toward with your "future solution which is very incomplete right now" in your edit.)
However, you haven't mentioned the volume of data you're working with, or what kind of performance problems you're having, if any. So I'm not convinced you really need to do this. It could be that dict
with as-needed set
creation, or set
with linear lookup, is already fast enough.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With