I am performing multiple iterations of the type:
masterSet=masterSet.union(setA)
As the set grows the length of time taken to perform these operations is growing (as one would expect, I guess).
I expect that the time is taken up checking whether each element of setA is already in masterSet?
My question is: if I KNOW that masterSet does not already contain any of the elements in setA, can I do this quicker?
[UPDATE]
Given that this question is still attracting views I thought I would clear up a few of the things from the comments and answers below:
When iterating through, there were many iterations where I knew setA would be distinct from masterSet because of how it was constructed (without having to process any checks), but for a few iterations I needed the uniqueness check.
I wondered if there was a way to 'tell' the masterSet.union() procedure not to bother with the uniqueness check this time around, as I know this one is distinct from masterSet, and just add these elements quickly, trusting the programmer's assertion that they were definitely distinct. Perhaps through calling some different ".unionWithDistinctSet()" procedure or something.
I think the responses have suggested that this isn't possible (and that really set operations should be quick enough anyway), but to use masterSet.update(setA) instead of union as it's slightly quicker still.
I have accepted the clearest response along those lines, resolved the issue I was having at the time and got on with my life, but would still love to hear if my hypothesised .unionWithDistinctSet() could ever exist?
Unlike a list, a Python set has no extend() method. However, you can combine sets with the union, intersection, difference, or symmetric difference methods; of these, union (or the in-place update) is the one that adds elements to a set.
To add multiple elements at once, use the set update() method. It takes one or more iterables (list, tuple, dictionary, and so on) as arguments; for a dictionary, the keys are added.
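As a small illustration of passing several iterables to update() in one call, here is a minimal sketch. The variable name master and the sample values are made up for this example.

```python
# Hypothetical sample data, chosen only to illustrate update().
master = {1, 2}

# update() accepts any number of iterables and mutates the set in place.
# Iterating a dict yields its keys, so only "a" is added from the dict.
master.update([3, 4], (5, 6), {"a": 100})

print(master)
```

Because update() changes the set in place and returns None, its result should not be assigned back to the variable.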
A set is an unordered collection, so added elements can appear in any order. The add() method can add a tuple object as an element of the set. Note that you can add the same tuple only once, as is the case with other elements. Lists and dictionaries cannot be added to a set because they are unhashable.
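The behaviour described above can be checked with a short sketch. The names elements and error_message are illustrative only.

```python
elements = set()

# A tuple of hashable items is itself hashable, so it can be a set element.
elements.add((1, 2))
elements.add((1, 2))  # adding the same tuple again has no effect
print(len(elements))  # 1

# A list is unhashable, so add() raises TypeError.
try:
    elements.add([1, 2])
except TypeError as exc:
    error_message = str(exc)
    print("rejected:", error_message)
```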
To create an empty set in Python, call the set() function without any arguments; empty curly braces {} create an empty dictionary instead. For x = set(), printing type(x) gives <class 'set'>.
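A quick check of the difference between the two spellings, with variable names chosen only for this sketch:

```python
empty_set = set()   # the only literal-free way to get an empty set
empty_dict = {}     # empty braces always mean a dictionary

print(type(empty_set))   # <class 'set'>
print(type(empty_dict))  # <class 'dict'>
```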
If you know your elements are unique, a set is not necessarily the best structure. A simple list is way faster to extend.
For sure, forgoing this check could be a big saving when the __eq__(..) method is very expensive.
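One way to read the list suggestion is to accumulate batches in a list and build the set once at the end. The sketch below assumes the batches are already known to be disjoint; batches, collected, and master are hypothetical names and the data is invented.

```python
# Hypothetical batches that the caller knows to be disjoint.
batches = [[1, 2, 3], [4, 5], [6]]

collected = []
for batch in batches:
    # list.extend() just appends; no hashing or membership check happens here.
    collected.extend(batch)

# Convert once at the end if set semantics are needed afterwards.
master = set(collected)
print(len(collected), len(master))
```

The trade-off is that a list gives no fast membership test and no deduplication while it is being built, so this only helps when neither is needed until the final conversion.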
As mgilson points out, you can use update to update a set in-place from another set. That actually works out slightly quicker.
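The timing from that answer is not reproduced here. A rough way to measure the difference yourself is sketched below; the helper names grow_with_union and grow_with_update, the chunk sizes, and the repeat count are all assumptions made for this example, and actual numbers will vary by machine and Python version.

```python
import timeit


def grow_with_union(chunks):
    """Rebind the name to a freshly allocated set on every iteration."""
    master = set()
    for chunk in chunks:
        master = master.union(chunk)
    return master


def grow_with_update(chunks):
    """Mutate one set in place on every iteration."""
    master = set()
    for chunk in chunks:
        master.update(chunk)
    return master


# 200 disjoint chunks of 100 integers each (illustrative sizes).
chunks = [set(range(i * 100, (i + 1) * 100)) for i in range(200)]

union_time = timeit.timeit(lambda: grow_with_union(chunks), number=5)
update_time = timeit.timeit(lambda: grow_with_update(chunks), number=5)

print(f"union:  {union_time:.4f}s")
print(f"update: {update_time:.4f}s")
```

Both functions produce the same set; the difference is that union copies the ever-growing master set each time, while update only inserts the new elements.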
The idea is to traverse the given array from left to right and keep track of visited elements in a hash table. Finally, print the element with count 1. The hashing-based solution requires O(n) extra space.
We can use bitwise AND to find the unique element in O(n) time and constant extra space. Create an array count[] of size equal to the number of bits in the binary representation of the numbers. Fill the count array such that count[i] stores the number of array elements with the i-th bit set.
You can use set.update to update your master set in place. This saves allocating a new set all the time, so it should be a little faster than set.union...
>>> s = set(range(3))
>>> s.update(range(4))
>>> s
set([0, 1, 2, 3])
Of course, if you're doing this in a loop:
masterSet = set()
for setA in iterable:
    masterSet = masterSet.union(setA)
You might get a performance boost by doing something like:
masterSet = set().union(*iterable)
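To confirm that the one-line form gives the same result as the loop, here is a minimal sketch with invented sample data; looped and combined are names introduced only for this comparison.

```python
# Hypothetical input: any finite iterable of iterables works.
iterable = [{1, 2}, {2, 3}, {4}]

# Loop form: a new set is allocated on every pass.
looped = set()
for set_a in iterable:
    looped = looped.union(set_a)

# Single-call form: union() accepts any number of iterables.
combined = set().union(*iterable)

print(looped == combined)  # True
```

Note that the * unpacking consumes the whole iterable up front, so this form suits cases where all the sets are available before the union is taken.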
Ultimately, membership testing of a set is O(1) (in the average case), so testing if the element is already contained in the set isn't really a big performance hit.