
What is the purpose of collections.ChainMap?

In Python 3.3 a ChainMap class was added to the collections module:

A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.

Example:

    >>> from collections import ChainMap
    >>> x = {'a': 1, 'b': 2}
    >>> y = {'b': 10, 'c': 11}
    >>> z = ChainMap(y, x)
    >>> for k, v in z.items():
    ...     print(k, v)
    ...
    a 1
    c 11
    b 10

It was motivated by this issue and made public by this one (no PEP was created).

As far as I understand, it is an alternative to having an extra dictionary and maintaining it with update()s.
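To make that concrete, here is a minimal sketch of the two approaches as I understand them. The names `defaults`, `overrides`, `merged` and `chained` are only illustrative:

```python
from collections import ChainMap

defaults = {'a': 1, 'b': 2}
overrides = {'b': 10, 'c': 11}

# Approach 1: copy everything into an extra dictionary.
merged = {}
merged.update(defaults)
merged.update(overrides)  # later update() calls win

# Approach 2: link the mappings; no keys or values are copied.
chained = ChainMap(overrides, defaults)  # earlier mappings win

# Both approaches resolve every key to the same value.
same_content = dict(chained) == merged

# The ChainMap is a view over the underlying dicts, so a later change
# shows through; the merged dictionary is a snapshot and needs another update().
defaults['d'] = 4
in_chain = 'd' in chained
in_merged = 'd' in merged
```

Here `same_content` is True, and after the later change `in_chain` is True while `in_merged` is False, which is the "maintaining it with update()s" part that ChainMap seems to avoid.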

The questions are:

  • What use cases does ChainMap cover?
  • Are there any real world examples of ChainMap?
  • Is it used in third-party libraries that switched to Python 3?

Bonus question: is there a way to use it on Python 2.x?


I heard about it in Raymond Hettinger's PyCon talk "Transforming Code into Beautiful, Idiomatic Python" and I'd like to add it to my toolkit, but I don't yet understand when I should use it.

alecxe asked Apr 30 '14 16:04





1 Answer

I like @b4hand's examples, and indeed I have used ChainMap-like structures in the past (but not ChainMap itself) for the two purposes he mentions: multi-layered configuration overrides and variable stack/scope emulation.
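As a rough sketch of those two purposes (the layer names and values below are made up for illustration and are not taken from @b4hand's answer):

```python
from collections import ChainMap

# Hypothetical configuration layers, highest priority first.
command_line = {'color': 'red'}
user_config = {'color': 'blue', 'user': 'guest'}
defaults = {'color': 'green', 'user': 'nobody', 'debug': False}

config = ChainMap(command_line, user_config, defaults)
color = config['color']  # found in command_line
user = config['user']    # falls through to user_config
debug = config['debug']  # falls through to defaults

# Scope emulation: new_child() pushes a fresh inner scope,
# and .parents gives a view of the chain without the innermost scope.
global_scope = ChainMap({'x': 1, 'y': 2})
local_scope = global_scope.new_child()
local_scope['x'] = 100   # writes only touch the first mapping
inner_x = local_scope['x']
outer_x = global_scope['x']
inherited_y = local_scope['y']
```

With these values, `color` is 'red', `user` is 'guest' and `debug` is False; `inner_x` is 100 while `outer_x` is still 1, because the assignment landed in the new child mapping rather than in the outer scope.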

I'd like to point out two other motivations/advantages/differences of ChainMap compared to using a dict-update loop and thus storing only the "final" version:

  1. More information: since a ChainMap structure is "layered", it can answer questions like: Am I getting the "default" value, or an overridden one? What is the original ("default") value? At what level did the value get overridden (borrowing @b4hand's config example: user-config or command-line-overrides)? With a simple dict, the information needed to answer these questions is already lost.

  2. Speed tradeoff: suppose you have N layers and at most M keys in each. Constructing a ChainMap takes O(N) and each lookup is O(N) in the worst case[*], while constructing a dict with an update loop takes O(NM) and each lookup is O(1). This means that if you construct often and perform only a few lookups each time, or if M is big, ChainMap's lazy-construction approach works in your favor.

[*] The analysis in (2) assumes dict-access is O(1), when in fact it is O(1) on average, and O(M) worst case. See more details here.
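A minimal sketch of both points, assuming three hypothetical layers; the helpers `where_defined` and `default_value` are illustrative names of my own, not part of the collections API:

```python
from collections import ChainMap

# Hypothetical named layers, highest priority first.
layers = [
    ('command-line', {'color': 'red'}),
    ('user-config', {'color': 'blue', 'user': 'guest'}),
    ('defaults', {'color': 'green', 'user': 'nobody', 'debug': False}),
]

# Construction only stores references to the N mappings; no keys are copied.
config = ChainMap(*(mapping for _, mapping in layers))
shares_first_layer = config.maps[0] is layers[0][1]

def where_defined(key):
    """Return the names of all layers that define key, highest priority first."""
    return [name for name, mapping in layers if key in mapping]

def default_value(key):
    """Return the value stored in the last (lowest-priority) layer."""
    return config.maps[-1][key]

effective_color = config['color']          # lookup walks the layers in order
color_layers = where_defined('color')
debug_layers = where_defined('debug')
original_color = default_value('color')
```

Here `effective_color` is 'red', `color_layers` lists all three layers, `debug_layers` is only ['defaults'], and `original_color` is 'green'. A dict built with an update loop would only retain 'red'.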

shx2 answered Oct 04 '22 16:10
