Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Nest a flat list based on an arbitrary criterion

Tags:

python

list

I have a flat list of unique objects, some of which may share a given attribute with others. I wish to create a nested list-of-lists, with objects grouped by the given attribute. As a minimal example, given the following list:

>>> flat = ["Shoes", "pants", "shirt", "tie", "jacket", "hat"]

I might want to group it by length, eg:

>>> nest_by_length(flat)
[['tie', 'hat'], ['shoes', 'pants', 'shirt'], ['jacket']]

I've seen a couple of similar questions and suggestions. However, in all of these cases, the nesting is based on the ordering of the input list. In my case, the ordering of the input list is completely unpredictable, as is the number of sub-lists for the output and the number of items per sub-list.

Is there a standard function or idiomatic way to accomplish this?

like image 452
Joe Avatar asked May 12 '15 17:05

Joe


1 Answers

A common idiom for an existing list is to use groupby in itertools:

from itertools import groupby

flat = ["Shoes", "pants", "shirt", "tie", "jacket", "hat"]

result=[]
for k, g in groupby(sorted(flat, key=len), key=len):
    result.append(list(g))

print result   

Or, more tersely:

[list(g) for _,g in groupby(sorted(flat, key=len), key=len)]

Prints:

[['tie', 'hat'], ['Shoes', 'pants', 'shirt'], ['jacket']]

Input to groupby is grouped into groups based on the changing value of output of the key function, in this case len. Generally, you need to preorder the list based on the same key function, so the sorted function is called first.

If your source list is not complete yet, or not sortable based on the criteria (or you would just prefer another option), create a dict that maps your criteria to a unique key value:

groups={}
for e in flat:
    groups.setdefault(len(e), []).append(e)

print groups    
# {5: ['Shoes', 'pants', 'shirt'], 3: ['tie', 'hat'], 6: ['jacket']}

You can also use defaultdict rather than setdefault with the arbitrary key value:

from collections import defaultdict
groups=defaultdict(list)
for e in flat:
    groups[len(e)].append(e)  
# groups=defaultdict(<type 'list'>, {5: ['Shoes', 'pants', 'shirt'], 3: ['tie', 'hat'], 6: ['jacket']})

In either case, then you can create the nested list from that:

>>> [groups[k] for k in sorted(groups.keys())] 
[['tie', 'hat'], ['Shoes', 'pants', 'shirt'], ['jacket']]
like image 53
dawg Avatar answered Nov 12 '22 13:11

dawg