Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pythonic way to create a dictionary from a list where the keys are the elements that are found in another list and values are elements between keys

Considering that I have two lists like:

l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = [
    'x','q','we','da','po',
    'a', 'el1', 'el2', 'el3', 'el4',
    'b', 'some_other_el_1', 'some_other_el_2',
    'c', 'another_element_1', 'another_element_2',
    'd', '', '', 'another_element_3', 'd4'
]

and I need to create a dictionary where the keys are those element from second list that are found in the first and values are lists of elements found between "keys" like:

result = {
    'a': ['el1', 'el2', 'el3', 'el4'],
    'b': ['some_other_el_1', 'some_other_el_2'],
    'c': ['another_element_1', 'another_element_2'],
    'd': ['', '', 'another_element_3', 'd4']
}

What's a more pythonic way to do this?

Currently I'm doing this :

# I'm not sure that the first element in the second list
# will also be in the first so I have to create a key
k = ''
d[k] = []
for x in l2:
    if x in l1:
        k = x
        d[k] = []
    else:
        d[k].append(x)

But I'm quite positive that this is not the best way to do it and it also doesn't looks nice :)

Edit: I also have to mention that no list is necessary ordered and neither the second list must start with an element from the first one.

like image 319
Cristian Harangus Avatar asked May 15 '18 12:05

Cristian Harangus


3 Answers

I don't think you'll do much better if this is the most specific statement of the problem. I mean I'd do it this way, but it's not much better.

import collections


d = collections.defaultdict(list)
s = set(l1)
k = ''

for x in l2:
    if x in s:
        k = x
    else:
        d[k].append(x)
like image 107
FHTMitchell Avatar answered Oct 12 '22 07:10

FHTMitchell


For fun, you can also do this with itertools and 3rd party numpy:

import numpy as np
from itertools import zip_longest, islice

arr = np.where(np.in1d(l2, l1))[0]
res = {l2[i]: l2[i+1: j] for i, j in zip_longest(arr, islice(arr, 1, None))}

print(res)

{'a': ['el1', 'el2', 'el3', 'el4'],
 'b': ['some_other_el_1', 'some_other_el_2'],
 'c': ['another_element_1', 'another_element_2'],
 'd': ['', '', 'another_element_3', 'd4']}
like image 39
jpp Avatar answered Oct 12 '22 08:10

jpp


Here is a version using itertools.groupby. It may or may not be more efficient than the plain version from your post, depending on how groupby is implemented, because the for loop has fewer iterations.

from itertools import groupby
from collections import defaultdict, deque

def group_by_keys(keys, values):
    """
    >>> sorted(group_by_keys('abcdef', [
    ...          1, 2, 3,
    ...     'b', 4, 5,
    ...     'd',
    ...     'a', 6, 7,
    ...     'c', 8, 9,
    ...     'a', 10, 11, 12
    ... ]).items())
    [('a', [6, 7, 10, 11, 12]), ('b', [4, 5]), ('c', [8, 9])]
    """
    keys = set(keys)
    result = defaultdict(list)
    current_key = None
    for is_key, items in groupby(values, key=lambda x: x in keys):
        if is_key:
            current_key = deque(items, maxlen=1).pop()  # last of items
        elif current_key is not None:
            result[current_key].extend(items)
    return result

This doesn't distinguish between keys that don't occur in values at all (like e and f), and keys for which there are no corresponding values (like d). If this information is needed, one of the other solutions might be better suited.

like image 30
mkrieger1 Avatar answered Oct 12 '22 09:10

mkrieger1