Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to get an item from a set in O(1) time? [duplicate]

Tags:

python

lookup

set

Possible Duplicate:
Python: Retrieve items from a set

Consider the following code:

>>> item1 = (1,)
>>> item2 = (2,)
>>> s = set([item1, item2])
>>> s
set([(2,), (1,)])
>>> new_item = (1,)
>>> new_item in s
True
>>> new_item == item1
True
>>> new_item is item1
False

So new_item is in s because it is equivalent to one of its items, but it is a different object.

What I want is to get item1 from s given new_item is in s.

One solution I have come up with is straightforward but not very efficient:

def get_item(s, new_item):
    for item in s:
        if item == new_item:
            return item

>>> get_item(s, new_item) is new_item
False
>>> get_item(s, new_item) is item1
True

Another solution seems more efficient but actually does not work:

 def get_item_using_intersection1(s, new_item):
     return set([new_item]).intersection(s).pop()

Nor this one:

 def get_item_using_intersection2(s, new_item):
     return s.intersection(set([new_item])).pop()

Because intersection works in an undefined way:

>>> get_item_using_intersection1(s, new_item) is new_item
True
>>> get_item_using_intersection1(s, new_item) is item1
False

>>> get_item_using_intersection2(s, new_item) is new_item
True
>>> get_item_using_intersection2(s, new_item) is item1
False

If this matters, I am using Python 2.7 x64 on Windows 7, but I need a cross-platform solution.


Thanks to everyone. I came up with the following temporary solution:

class SearchableSet(set):

    def find(self, item):
        for e in self:
            if e == item:
                return e

which will be replaced in future with the following solution (which is very incomplete right now):

class SearchableSet(object):

    def __init__(self, iterable=None):
        self.__data = {}
        if iterable is not None:
            for e in iterable:
                self.__data[e] = e

    def __iter__(self):
        return iter(self.__data)

    def __len__(self):
        return len(self.__data)

    def __sub__(self, other):
        return SearchableSet(set(self).__sub__(set(other)))

    def add(self, item):
        if not item in self:
            self.__data[item] = item

    def find(self, item):
        return self.__data.get(item)
like image 954
utapyngo Avatar asked Apr 30 '12 12:04

utapyngo


People also ask

Can there be duplicate values in a set?

A Set is a Collection that cannot contain duplicate elements.


2 Answers

Don't use a set, then. Just use a dict that maps some value to itself. In your case, it maps:

d[item1] = item1
d[item2] = item2

So anything that's equal to item1 will be found in d, but the value is item1 itself. And it's much better than linear time ;-)

P.S. I hope I understood the intention of your question correctly. If not, please clarify it.

like image 186
Eli Bendersky Avatar answered Oct 19 '22 00:10

Eli Bendersky


If you absolutely need the O(1) lookup and object identity (not just equality) and fast set operations (without having to create new sets each time you want to do set operations), then one fairly straightforward approach is to use both a dict and a set. You would have to maintain both structures to keep them in sync, but this would allow you to keep O(1) access (just with a bigger constant factor). (And maybe this is what you are heading toward with your "future solution which is very incomplete right now" in your edit.)

However, you haven't mentioned the volume of data you're working with, or what kind of performance problems you're having, if any. So I'm not convinced you really need to do this. It could be that dict with as-needed set creation, or set with linear lookup, is already fast enough.

like image 40
John Y Avatar answered Oct 18 '22 23:10

John Y