I recently compared the performance of collections.Counter to sorted for equality checks (i.e. whether two iterables contain the same elements with the same multiplicities). While Counter generally beats sorted on large iterables, it is much slower for short ones. Using line_profiler, the bottleneck seems to be the isinstance(iterable, collections.Mapping) check in Counter.update:
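For context, the two approaches being compared look roughly like this (the function names are mine, just for illustration):

```python
from collections import Counter

def same_elements_counter(a, b):
    # Multiset comparison via Counter: builds two hash-based
    # count tables, O(n) on average.
    return Counter(a) == Counter(b)

def same_elements_sorted(a, b):
    # Multiset comparison via sorting: O(n log n), but with very
    # little per-call overhead, so it tends to win on short inputs.
    return sorted(a) == sorted(b)
```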
%load_ext line_profiler # IPython
lst = list(range(1000))
%lprun -f Counter.update Counter(lst)
gives me:
Timer unit: 5.58547e-07 s
Total time: 0.000244643 s
File: ...\lib\collections\__init__.py
Function: update at line 581
Line # Hits Time Per Hit % Time Line Contents
==============================================================
581 def update(*args, **kwds):
601 1 8 8.0 1.8 if not args:
602 raise TypeError("descriptor 'update' of 'Counter' object "
603 "needs an argument")
604 1 12 12.0 2.7 self, *args = args
605 1 6 6.0 1.4 if len(args) > 1:
606 raise TypeError('expected at most 1 arguments, got %d' % len(args))
607 1 5 5.0 1.1 iterable = args[0] if args else None
608 1 4 4.0 0.9 if iterable is not None:
609 1 72 72.0 16.4 if isinstance(iterable, Mapping):
610 if self:
611 self_get = self.get
612 for elem, count in iterable.items():
613 self[elem] = count + self_get(elem, 0)
614 else:
615 super(Counter, self).update(iterable) # fast path when counter is empty
616 else:
617 1 326 326.0 74.4 _count_elements(self, iterable)
618 1 5 5.0 1.1 if kwds:
619 self.update(kwds)
So even for iterables of length 1000 the check takes more than 15% of the total time, and for even shorter iterables (for example, 20 items) that share increases to about 60%.
I first thought it had something to do with how collections.Mapping uses __subclasshook__, but that method isn't called after the first isinstance check anymore. So why is checking isinstance(iterable, Mapping) so slow?
The performance is really just tied to a collection of checks in ABCMeta's __instancecheck__, which is called by isinstance.
The bottom line is that the poor performance witnessed here isn't the result of some missing optimization, but rather of isinstance with abstract base classes being a Python-level operation, as mentioned by Jim. Positive and negative results are cached, but even with cached results you're looking at a few microseconds per loop simply to traverse the conditionals in the __instancecheck__ method of the ABCMeta class.
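You can see this overhead directly by timing an isinstance check against the concrete type (handled entirely in C) versus one against the ABC. This is a sketch; the absolute numbers will vary by machine and Python version:

```python
import timeit
import collections.abc

d = {}

# Concrete type check: a fast C-level type comparison.
t_concrete = timeit.timeit(lambda: isinstance(d, dict), number=100_000)

# ABC check: routed through ABCMeta.__instancecheck__, even when
# the positive cache already contains dict.
t_abc = timeit.timeit(lambda: isinstance(d, collections.abc.Mapping),
                      number=100_000)

print(f"dict:    {t_concrete:.4f}s")
print(f"Mapping: {t_abc:.4f}s")
```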
Consider some different empty structures.
>>> import collections.abc
>>> import pandas as pd
>>> d = dict(); l = list(); s = pd.Series()
>>> %timeit isinstance(d, collections.abc.Mapping)
100000 loops, best of 3: 1.99 µs per loop
>>> %timeit isinstance(l, collections.abc.Mapping)
100000 loops, best of 3: 3.16 µs per loop # caching happening
>>> %timeit isinstance(s, collections.abc.Mapping)
100000 loops, best of 3: 3.26 µs per loop # caching happening
We can see the performance discrepancy; what accounts for it?
For a dict:
>>> %lprun -f abc.ABCMeta.__instancecheck__ isinstance(dict(), collections.abc.Mapping)
Timer unit: 6.84247e-07 s
Total time: 1.71062e-05 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
178 def __instancecheck__(cls, instance):
179 """Override for isinstance(instance, cls)."""
180 # Inline the cache checking
181 1 7 7.0 28.0 subclass = instance.__class__
182 1 16 16.0 64.0 if subclass in cls._abc_cache:
183 1 2 2.0 8.0 return True
184 subtype = type(instance)
185 if subtype is subclass:
186 if (cls._abc_negative_cache_version ==
187 ABCMeta._abc_invalidation_counter and
188 subclass in cls._abc_negative_cache):
189 return False
190 # Fall back to the subclass check.
191 return cls.__subclasscheck__(subclass)
192 return any(cls.__subclasscheck__(c) for c in {subclass, subtype})
For a list:
>>> %lprun -f abc.ABCMeta.__instancecheck__ isinstance(list(), collections.abc.Mapping)
Timer unit: 6.84247e-07 s
Total time: 3.07911e-05 s
Line # Hits Time Per Hit % Time Line Contents
==============================================================
178 def __instancecheck__(cls, instance):
179 """Override for isinstance(instance, cls)."""
180 # Inline the cache checking
181 1 7 7.0 15.6 subclass = instance.__class__
182 1 17 17.0 37.8 if subclass in cls._abc_cache:
183 return True
184 1 2 2.0 4.4 subtype = type(instance)
185 1 2 2.0 4.4 if subtype is subclass:
186 1 3 3.0 6.7 if (cls._abc_negative_cache_version ==
187 1 2 2.0 4.4 ABCMeta._abc_invalidation_counter and
188 1 10 10.0 22.2 subclass in cls._abc_negative_cache):
189 1 2 2.0 4.4 return False
190 # Fall back to the subclass check.
191 return cls.__subclasscheck__(subclass)
192 return any(cls.__subclasscheck__(c) for c in {subclass, subtype})
We can see that for a dict, the Mapping abstract class's _abc_cache
>>> list(collections.abc.Mapping._abc_cache)
[dict]
includes our dict, so the check short-circuits early. For a list the positive cache evidently won't be hit; however, the Mapping's _abc_negative_cache contains the list type
>>> list(collections.abc.Mapping._abc_negative_cache)
[type,
list,
generator,
pandas.core.series.Series,
itertools.chain,
int,
map]
as well as, by now, the pd.Series type, as a result of isinstance being called more than once by %timeit. When we don't hit the negative cache (as on the first iteration for a Series), Python falls back to the regular subclass check with
cls.__subclasscheck__(subclass)
which can be far slower, resorting to the subclass hook and recursive subclass checks seen here, and then caches the result for subsequent speedups.
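The caching machinery can also be observed from the other direction: ABCMeta.register bumps _abc_invalidation_counter, which invalidates the negative caches, so a class that previously failed the check can start passing it. A small sketch (FakeMapping is a throwaway class purely for illustration):

```python
import collections.abc

class FakeMapping:
    pass

# The first check walks the full __subclasscheck__ machinery and
# records FakeMapping in Mapping's negative cache.
assert not isinstance(FakeMapping(), collections.abc.Mapping)

# Registering FakeMapping as a virtual subclass invalidates the
# negative cache, so the very same check now succeeds.
collections.abc.Mapping.register(FakeMapping)
assert isinstance(FakeMapping(), collections.abc.Mapping)
```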