Is there a way to get an item from a set in O(1) time? [duplicate]

Tags: python, lookup, set

Possible Duplicate:
Python: Retrieve items from a set

Consider the following code:

>>> item1 = (1,)
>>> item2 = (2,)
>>> s = set([item1, item2])
>>> s
set([(2,), (1,)])
>>> new_item = (1,)
>>> new_item in s
True
>>> new_item == item1
True
>>> new_item is item1
False

So new_item is in s because it is equal to one of its items, but it is a different object.

What I want is to get item1 from s, given a new_item that is equal to it.

One solution I have come up with is straightforward but runs in linear time:

def get_item(s, new_item):
    for item in s:
        if item == new_item:
            return item

>>> get_item(s, new_item) is new_item
False
>>> get_item(s, new_item) is item1
True

Another solution seems more efficient but does not actually work:

def get_item_using_intersection1(s, new_item):
    return set([new_item]).intersection(s).pop()

Nor does this one:

def get_item_using_intersection2(s, new_item):
    return s.intersection(set([new_item])).pop()

Because intersection does not specify which of two equal objects it returns:

>>> get_item_using_intersection1(s, new_item) is new_item
True
>>> get_item_using_intersection1(s, new_item) is item1
False

>>> get_item_using_intersection2(s, new_item) is new_item
True
>>> get_item_using_intersection2(s, new_item) is item1
False

In case it matters, I am using Python 2.7 x64 on Windows 7, but I need a cross-platform solution.

Thanks to everyone. I came up with the following temporary solution:

class SearchableSet(set):

    def find(self, item):
        for e in self:
            if e == item:
                return e

which I plan to replace with the following dict-backed solution (still very incomplete):

class SearchableSet(object):

    def __init__(self, iterable=None):
        self.__data = {}
        if iterable is not None:
            for e in iterable:
                self.__data[e] = e

    def __iter__(self):
        return iter(self.__data)

    def __len__(self):
        return len(self.__data)

    def __sub__(self, other):
        return SearchableSet(set(self).__sub__(set(other)))

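    def add(self, item):
        # Check the dict directly: without __contains__, "item in self"
        # falls back to iterating, which is linear time.
        if item not in self.__data:
            self.__data[item] = item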

    def find(self, item):
        return self.__data.get(item)

asked Apr 30 '12 12:04

utapyngo

2 Answers

Don't use a set, then. Just use a dict that maps some value to itself. In your case, it maps:

d[item1] = item1
d[item2] = item2

So anything that's equal to item1 will be found in d, but the value is item1 itself. And it's much better than linear time ;-)
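A minimal sketch of that idea, written for Python 3 (the question uses 2.7, where the same dict operations should behave the same way). The helper name canonical is hypothetical, and new_item is built at runtime with tuple([1]) only to make sure it is a distinct object from item1:

```python
item1 = (1,)
item2 = (2,)

# Map every stored object to itself, so a lookup by an equal key
# hands back the object that was originally stored.
d = {item1: item1, item2: item2}


def canonical(d, new_item):
    """Return the stored object equal to new_item, or None if absent."""
    return d.get(new_item)


new_item = tuple([1])  # equal to item1, but a separate object
found = canonical(d, new_item)
print(found is item1)     # identity check against the stored object
print(found is new_item)  # identity check against the lookup key
```

The lookup is a single hash probe, so it takes constant time on average instead of scanning every element.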

P.S. I hope I understood the intention of your question correctly. If not, please clarify it.

answered Oct 19 '22 00:10

Eli Bendersky

If you absolutely need O(1) lookup, object identity (not just equality), and fast set operations (without creating a new set each time you want to perform one), then one fairly straightforward approach is to use both a dict and a set. You would have to keep the two structures in sync, but that preserves O(1) access, just with a bigger constant factor. (Maybe this is where you are heading with the "very incomplete" future solution in your edit.)
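One way the dict-plus-set approach could look, as a rough sketch rather than a complete set replacement. The class name IdentitySet and its attribute names are hypothetical, and only a few operations are shown:

```python
class IdentitySet:
    """Hypothetical container that keeps a set and a dict in sync."""

    def __init__(self, iterable=()):
        self._members = set()  # used for set operations
        self._lookup = {}      # maps each member to the stored object
        for element in iterable:
            self.add(element)

    def add(self, item):
        # Keep the first object stored for any group of equal items.
        if item not in self._lookup:
            self._members.add(item)
            self._lookup[item] = item

    def discard(self, item):
        # Remove from both structures so they never drift apart.
        self._members.discard(item)
        self._lookup.pop(item, None)

    def find(self, item):
        """Return the stored object equal to item, or None."""
        return self._lookup.get(item)

    def __contains__(self, item):
        return item in self._lookup

    def __len__(self):
        return len(self._members)

    def difference(self, other):
        # Operates on the existing set; no temporary copy of self is built first.
        return self._members.difference(other)
```

Every mutating method has to touch both structures, which is the extra constant factor mentioned above.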

However, you haven't mentioned the volume of data you're working with, or what kind of performance problems you're having, if any. So I'm not convinced you really need to do this. It could be that dict with as-needed set creation, or set with linear lookup, is already fast enough.

answered Oct 18 '22 23:10

John Y
