
OrderedDict comprehensions

Can I extend Python's dict comprehension syntax to other dict types, such as OrderedDict in the collections module or my own types that inherit from dict?

Just rebinding the dict name obviously doesn't work, the {key: value} comprehension syntax still gives you a plain old dict for comprehensions and literals.

>>> from collections import OrderedDict
>>> olddict, dict = dict, OrderedDict
>>> {i: i*i for i in range(3)}.__class__
<type 'dict'>

So, if it's possible how would I go about doing that? It's OK if it only works in CPython. For syntax I guess I would try it with a O{k: v} prefix like we have on the r'various' u'string' b'objects'.

note: Of course we can use a generator expression instead, but I'm more interested in seeing how hackable Python is in terms of the grammar.

wim asked Jan 13 '14 23:01



2 Answers

Sorry, not possible. Dict literals and dict comprehensions map to the built-in dict type, in a way that's hardcoded at the C level. That can't be overridden.

You can use this as an alternative, though:

OrderedDict((i, i * i) for i in range(3)) 
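As a small sketch of that alternative: the same generator-of-pairs form should work for any dict subclass whose constructor accepts an iterable of key-value pairs, which covers the "my own types" part of the question. `LoudDict` here is a made-up class used only for illustration.

```python
from collections import OrderedDict


class LoudDict(dict):
    """Hypothetical dict subclass, used only for illustration."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return "<missing %r>" % (key,)


# A generator expression yielding (key, value) pairs stands in
# for the comprehension body.
squares = OrderedDict((i, i * i) for i in range(3))
loud = LoudDict((i, i * i) for i in range(3))

print(type(squares).__name__, list(squares.items()))
print(type(loud).__name__, loud[99])
```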

Addendum: as of CPython 3.6, the built-in dict preserves insertion order, though only as an implementation detail. As of Python 3.7, that ordering is part of the language spec. If you're using those versions and only need insertion order, there's no need for OrderedDict: the dict comprehension will Just Work (TM).
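A quick check of that, assuming Python 3.7 or later. It also shows two places where OrderedDict still behaves differently from a plain dict (order-sensitive equality and move_to_end), in case those matter to you:

```python
from collections import OrderedDict

# A plain dict comprehension keeps insertion order on Python 3.7+.
plain = {i: i * i for i in range(3, 0, -1)}
plain_keys = list(plain)

# Equality: plain dicts ignore order, OrderedDicts compare it.
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
plain_equal = (a == b)
ordered_equal = (OrderedDict(a) == OrderedDict(b))

# OrderedDict also has reordering methods that dict lacks.
od = OrderedDict(a)
od.move_to_end("x")
reordered_keys = list(od)

print(plain_keys, plain_equal, ordered_equal, reordered_keys)
```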

Max Noel answered Sep 20 '22 00:09


There is no direct way to change Python's syntax from within the language. A dictionary comprehension (or plain display) is always going to create a dict, and there's nothing you can do about that. If you're using CPython, it's using special bytecodes that generate a dict directly, which ultimately call the PyDict API functions and/or the same underlying functions used by that API. If you're using PyPy, those bytecodes are instead implemented on top of an RPython dict object which in turn is implemented on top of a compiled-and-optimized Python dict. And so on.
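If you want to see those bytecodes for yourself on CPython, here is a rough sketch using the dis module. The `opnames` helper is my own, not part of any library; it walks nested code objects because, depending on the CPython version, the comprehension body is either compiled into its own code object or inlined into the enclosing one.

```python
import dis
import types


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


code = compile("{i: i * i for i in range(3)}", "<example>", "eval")
names = opnames(code)

# On the CPython 3.x versions I'm aware of, this includes BUILD_MAP
# (create the dict) and MAP_ADD (store one key/value pair). Neither
# looks up the name `dict`, which is why rebinding it has no effect.
print(sorted(n for n in names if "MAP" in n))
```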

There is an indirect way to do it, but you're not going to like it. If you read the docs on the import system, you'll see that it's the importer that searches for cached compiled code or calls the compiler, and the compiler that calls the parser, and so on. In Python 3.3+, almost everything in this chain either is written in pure Python, or has an alternate pure Python implementation, meaning you can fork the code and do your own thing. Which includes parsing source with your own PyParsing code that builds ASTs, or compiling a dict comprehension AST node into your own custom bytecode instead of the default, or post-processing the bytecode, or…

In many cases, an import hook is sufficient; if not, you can always write a custom finder and loader.
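To make the loader idea concrete, here is a minimal sketch, assuming Python 3.8+. It subclasses importlib.machinery.SourceFileLoader and overrides source_to_code so that every dict comprehension in the loaded module is rewritten into an OrderedDict call before compiling. The class names, `load_with_ordered_comps`, and the `demo_mod` file are all made up for illustration, and the loader is invoked by hand rather than installed on sys.meta_path. One caveat: the stock loader may read or write cached bytecode, so a real version would need to think about .pyc files.

```python
import ast
import importlib.util
import os
import tempfile
from importlib.machinery import SourceFileLoader


class DictCompToOrderedDict(ast.NodeTransformer):
    """Rewrite {k: v for ...} into OrderedDict((k, v) for ...)."""

    def visit_DictComp(self, node):
        self.generic_visit(node)  # handle nested comprehensions first
        pair = ast.Tuple(elts=[node.key, node.value], ctx=ast.Load())
        gen = ast.GeneratorExp(elt=pair, generators=node.generators)
        # __import__("collections").OrderedDict, so the target module
        # does not need to import collections itself.
        importer = ast.Call(
            func=ast.Name(id="__import__", ctx=ast.Load()),
            args=[ast.Constant(value="collections")],
            keywords=[],
        )
        factory = ast.Attribute(value=importer, attr="OrderedDict",
                                ctx=ast.Load())
        call = ast.Call(func=factory, args=[gen], keywords=[])
        return ast.copy_location(call, node)


class OrderedCompLoader(SourceFileLoader):
    """Hypothetical loader that transforms the AST before compiling."""

    def source_to_code(self, data, path, *args, **kwargs):
        # Extra arguments passed by the import machinery are ignored here.
        tree = ast.parse(data, filename=path)
        tree = ast.fix_missing_locations(DictCompToOrderedDict().visit(tree))
        return compile(tree, path, "exec", dont_inherit=True)


def load_with_ordered_comps(name, path):
    """Load the file at `path` as module `name` using the custom loader."""
    loader = OrderedCompLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_mod.py")
    with open(path, "w") as f:
        f.write("squares = {i: i * i for i in range(3)}\n")
    demo = load_with_ordered_comps("demo_mod", path)

print(type(demo.squares).__name__, list(demo.squares.items()))
```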

If you're not already using Python 3.3 or later, I'd strongly suggest migrating before playing with this stuff. In older versions, it's harder, and less well documented, and you'll ultimately be putting in 10x the effort to learn something that will be obsolete whenever you do migrate.

Anyway, if this approach sounds interesting to you, you might want to take a look at MacroPy. You could borrow some code from it—and, maybe more importantly, learn how some of these features (that have no good examples in the docs) are used.

Or, if you're willing to settle for something less cool, you can just use MacroPy to build an "odict comprehension macro" and use that. (Note that, as of this writing, MacroPy only works in Python 2.7, not 3.x.) You can't quite get o{…}, but you can get, say, od[{…}], which isn't too bad. Download od.py, realmain.py, and main.py, and run python main.py to see it working. The key is this code, which takes a DictComp AST node, converts it to an equivalent GeneratorExp over key-value Tuples, and wraps it in a Call to collections.OrderedDict:

def od(tree, **kw):
    pair = ast.Tuple(elts=[tree.key, tree.value])
    gx = ast.GeneratorExp(elt=pair, generators=tree.generators)
    odict = ast.Attribute(value=ast.Name(id='collections'),
                          attr='OrderedDict')
    call = ast.Call(func=odict, args=[gx], keywords=[])
    return call
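For comparison, roughly the same rewrite can be sketched without MacroPy using only the standard ast module. This assumes Python 3.9+ (where a Subscript's slice is the expression itself); the `OdSubscript` class and the `<od-demo>` filename are made up, and this is not how MacroPy itself is implemented.

```python
import ast
import collections


class OdSubscript(ast.NodeTransformer):
    """Rewrite od[{k: v for ...}] into collections.OrderedDict((k, v) for ...)."""

    def visit_Subscript(self, node):
        self.generic_visit(node)
        target = node.slice  # Python 3.9+: the subscripted expression
        is_od = isinstance(node.value, ast.Name) and node.value.id == "od"
        if not (is_od and isinstance(target, ast.DictComp)):
            return node  # leave ordinary subscripts alone
        pair = ast.Tuple(elts=[target.key, target.value], ctx=ast.Load())
        gx = ast.GeneratorExp(elt=pair, generators=target.generators)
        odict = ast.Attribute(
            value=ast.Name(id="collections", ctx=ast.Load()),
            attr="OrderedDict",
            ctx=ast.Load(),
        )
        call = ast.Call(func=odict, args=[gx], keywords=[])
        return ast.copy_location(call, node)


source = "result = od[{i: i * i for i in range(3)}]"
tree = ast.fix_missing_locations(OdSubscript().visit(ast.parse(source)))

# The rewritten code refers to `collections`, so it has to be in scope.
# The name `od` never needs to exist: it is gone after the rewrite.
namespace = {"collections": collections}
exec(compile(tree, "<od-demo>", "exec"), namespace)
result = namespace["result"]

print(type(result).__name__, list(result.items()))
```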

A different alternative is, of course, to modify the Python interpreter.

I would suggest dropping the O{…} syntax idea for your first go, and just making normal dict comprehensions compile to odicts. The good news is, you don't really need to change the grammar (which is beyond hairy…), just any one of:

  • the bytecodes that dictcomps compile to,
  • the way the interpreter runs those bytecodes, or
  • the implementation of the PyDict type

The bad news: while all of those are a lot easier than changing the grammar, none of them can be done from an extension module. (Well, you can do the first one by doing basically the same thing you'd do from pure Python… and you can do any of them by hooking the .so/.dll/.dylib to patch in your own functions, but that's the exact same work as hacking on Python plus the extra work of hooking at runtime.)

If you want to hack on CPython source, the code you want is in Python/compile.c, Python/ceval.c, and Objects/dictobject.c, and the dev guide tells you how to find everything you need. But you might want to consider hacking on PyPy source instead, since it's mostly written in (a subset of) Python rather than C.


As a side note, your attempt wouldn't have worked even if everything were done at the Python language level. olddict, dict = dict, OrderedDict creates a binding named dict in your module's globals, which shadows the name in builtins, but doesn't replace it. You can replace things in builtins (well, Python doesn't guarantee this, but there are implementation/version-specific things-that-happen-to-work for every implementation/version I've tried…), but what you did isn't the way to do it.
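A small sketch of the shadowing point. It runs the question's rebinding inside a scratch namespace (so the real module globals are left alone) and then checks what actually changed:

```python
import builtins
from collections import OrderedDict

# A scratch globals dict standing in for "your module's globals".
namespace = {"OrderedDict": OrderedDict}
exec(
    "olddict, dict = dict, OrderedDict\n"
    "comp = {i: i * i for i in range(3)}\n",
    namespace,
)

# The module-level name `dict` now refers to OrderedDict...
shadowed = namespace["dict"] is OrderedDict
# ...but the builtin itself is untouched and still reachable.
builtin_untouched = builtins.dict is namespace["olddict"]
# And the comprehension never looked the name up in the first place.
comp_is_plain_dict = type(namespace["comp"]) is builtins.dict

print(shadowed, builtin_untouched, comp_is_plain_dict)
```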

abarnert answered Sep 19 '22 00:09