From the python docs, "set.pop() remove and return an arbitrary element from s". While generating some random data to test a program, I noticed strange behavior of this pop() function. Here is my code (python 2.7.3):
testCases = 10
numberRange = 500
poppedValues = []
greaterPercentages = []
for i in range (testCases):
s = Set()
""" inserting 100 random values in the set, in the range [0, numberRange) """
for j in range (100):
s.add(random.randrange(numberRange))
poppedValue = s.pop()
greaterCount = 0
""" counting how many numbers in the set are smaller then the popped value """
for number in s:
if poppedValue > number:
greaterCount += 1
poppedValues.append(poppedValue)
greaterPercentages.append(float(greaterCount) / len(s) * 100)
for poppedValue in poppedValues:
print poppedValue, '\t',
print
for percentage in greaterPercentages:
print "{:2.2f}".format(percentage), '\t',
What I'm doing here is,
s
where each element is in the range [0, numberRange
)I expected that the popped value should be a random one and about 50% of the numbers in the set will be greater then the popped value. But seems that pop()
almost always returns the lowest number in the set. Here are the result for numberRange = 500
. First row denotes the values of the popped element. Second row is the percentage of elements which are smaller then the popped value.
9 0 3 1 409 0 1 2 4 0
0 % 0 % 0 % 0 % 87 % 0 % 0 % 0 % 0 % 0 %
I've conducted this test with different values of numberRange
. It seems that for lower values of the set elements, pop()
almost always returns the lowest element. But for higher values it returns a random element. For numberRange = 1000
, the result is:
518 3586 3594 4103 2560 3087 4095 3079 3076 1622
7 % 72 % 73 % 84 % 54 % 51 % 79 % 63 % 67 % 32 %
which I think is pretty random. Why this strange behavior? Am I doing something wrong?
EDIT: Thanks for everyone's answer and comment, seems that by "arbitrarily", it isn't guaranteed that it will be "random".
Python Set pop() The pop() method randomly removes an item from a set and returns the removed item.
No, it is not random. It is "arbitrarily ordered", which means that you cannot depend on it being either ordered or random.
This is because you append 6 to the end or top of the stack / list / array . A pop command follows the last in first out rule which means when you pop an item from the stack / list /array you will be taking out the last item.
Python set pop() method removes a random element from the set when it is called. The pop() method removes any item anywhere from the set.
It's an implementation detail - set
is implemented as a HashMap (similar to dict
but without a slot for a value), set.pop
removes the first entry in the HashMap, and an int
s hash value is the same int.
Combined, this means that your set
, which is ordered by the hash values, is actually ordered by the entries modulo hashtable size as well; this should be close to natural ordering in your case as you are only inserting numbers from a small range - if you take random numbers from randrange(10**10)
instead of randrange(500)
you should see a different behaviour. Also, depending on your insertion order, you can get some values out of their original hashing order due to hash collisions.
When the doc says:
remove and return an arbitrary element from s; raises KeyError if empty
that means the behaviour isn't defined and the implementation can do whatever it's possible. In this case, it seems that the implemented behaviour is to remove the smallest value. That's all.
In fact, set.pop()
is based on a HashMap
and remove the first element of this (which is the smaller hashcode). In the case of set
of ints, it's the smallest int
.
On other implementation of Python could return a random value or the first pushed... You can't know.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With