Presumably dict_keys is supposed to behave as a set-like object, but it lacks the difference
method, and its subtraction behaviour seems to diverge from that of a set.
>>> d = {0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
>>> d.keys() - [0, 2]
{1, 3}
>>> d.keys() - (0, 2)
TypeError: 'int' object is not iterable
Why does the dict_keys class try to iterate over an integer here? Doesn't that violate duck-typing?
>>> dict.fromkeys(['0', '1', '01']).keys() - ('01',)
{'01'}
>>> dict.fromkeys(['0', '1', '01']).keys() - ['01',]
{'1', '0'}
This looks to be a bug. The implementation is to convert the dict_keys to a set, then call .difference_update(arg) on it.
It looks like they misused _PyObject_CallMethodId (an optimized variant of PyObject_CallMethod), by passing a format string of just "O". Thing is, PyObject_CallMethod and friends are documented to require a Py_BuildValue format string that "should produce a tuple". With more than one format code, it wraps the values in a tuple automatically, but with only one format code, it doesn't tuple, it just creates the value (in this case, because it's already PyObject*, all it does is increment the reference count).
While I haven't tracked down where it might be doing this, I suspect somewhere in the internals it's identifying CallMethod calls that don't produce a tuple and wrapping them to make a one element tuple so the called function can actually receive the arguments in the expected format. When subtracting a tuple, it's already a tuple, and this fix up code never activates; when passing a list, it does, becoming a one element tuple containing the list.
difference_update takes varargs (as if it were declared def difference_update(self, *args)). So when it receives the unwrapped tuple, it thinks it's supposed to subtract away the elements from each entry in the tuple, not treat said entries as values to subtract away themselves. To illustrate, when you do:
mydict.keys() - (1, 2)
the bug is causing it to do (roughly):
result = set(mydict)
# We've got a tuple to pass, so all's well...
result.difference_update(*(1, 2)) # Unpack behaves like difference_update(1, 2)
# OH NO!
While:
mydict.keys() - [1, 2]
does:
result = set(mydict)
# [1, 2] isn't a tuple, so wrap
result.difference_update(*([1, 2],)) # Behaves like difference_update([1, 2])
# All's well
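The two cases above can also be reproduced on an interpreter where the bug is already fixed, using a small pure-Python model. This is only a sketch of the behaviour described here, not CPython's actual code: the helper name buggy_keys_sub is made up for illustration, and the isinstance check stands in for the suspected "wrap it unless it is already a tuple" step.

```python
def buggy_keys_sub(mapping, other):
    """Model of the buggy subtraction (hypothetical helper, not CPython code)."""
    result = set(mapping)
    # Assumed fix-up step: anything that is not already a tuple is wrapped
    # in a one-element tuple before being used as the argument list.
    args = other if isinstance(other, tuple) else (other,)
    result.difference_update(*args)
    return result


d = {0: 'zero', 1: 'one', 2: 'two', 3: 'three'}

# A list is wrapped, so this behaves like difference_update([0, 2]).
list_result = buggy_keys_sub(d, [0, 2])

# A tuple is passed through, so this behaves like difference_update(0, 2).
try:
    buggy_keys_sub(d, (0, 2))
    tuple_error = None
except TypeError as exc:
    tuple_error = str(exc)

print(list_result)
print(tuple_error)
```

The list case gives the expected difference, while the tuple case fails because each int is treated as an iterable to subtract.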
That's why a tuple of str works (incorrectly), - ('abc', '123') is performing a call equivalent to:
result.difference_update(*('abc', '123'))
# or without unpacking:
result.difference_update('abc', '123')
and since strs are iterables of their characters, it just blithely removes entries for 'a', 'b', 'c', etc. instead of 'abc' and '123' like you expected.
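The difference between the two call shapes is easy to check directly on a plain set, independent of the bug. The sample keys below are made up for illustration.

```python
keys = {'abc', '123', 'a', 'b', '1'}

# Several arguments: each string is iterated, so single characters are removed.
varargs = set(keys)
varargs.difference_update('abc', '123')

# One iterable argument: its elements (the whole strings) are removed.
single = set(keys)
single.difference_update(['abc', '123'])

print(sorted(varargs))
print(sorted(single))
```

With varargs the whole strings 'abc' and '123' survive and the one-character entries disappear; with a single list argument it is the other way round.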
Basically, this is a bug; it was filed against CPython and fixed in 3.6.0 (as well as later releases of 2.7, 3.4, and 3.5).
The correct behavior probably should have been to call (assuming this Id variant exists for this API):
_PyObject_CallMethodObjArgsId(result, &PyId_difference_update, other, NULL);
which wouldn't have the packing issues at all, and would run faster to boot; the smallest change would be to change the format string to "(O)" to force tuple creation even for a single item, but since the format string gains nothing, _PyObject_CallMethodObjArgsId is better.
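If you are stuck on an affected interpreter, one way to sidestep the problem from Python code is to make sure the right-hand operand is never a tuple. This is a workaround sketch based on the explanation above, not part of the CPython fix; the variable names are illustrative.

```python
d = {0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
to_remove = (0, 2)

# A set is never a tuple, so it reaches difference_update as a single argument.
via_set = d.keys() - set(to_remove)

# Plain set arithmetic avoids the dict_keys operator entirely.
via_plain_sets = set(d) - set(to_remove)

print(via_set)
print(via_plain_sets)
```

Both forms give the same result on fixed versions too, so they are safe to leave in place after upgrading.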