Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pythonic way to group a list using a dictionary that has lists as values

I'm looking for a Pythonic way or more efficient way to solve this problem. I have a dictionary which has sets as values (duplicates allowed across keys). Given a list I must create a dictionary which maps each category to the element using the key from the master dictionary. I'll give an example to illustrate.

Master Dictionary

{
    "KeyA": ['Aron', 'Ranom Value', 'Abhishek'],
    "KeyB": ['Ball', 'Foo', 'Bar', 'Badge', 'Dog'],
    "KeyZ": ['Random Value', 'Foo', 'Bar']
}

Input

['Foo', 'Bar', 'Dog', 'Aron']

Output

{
    "KeyA": ['Aron'],
    "KeyB": ['Bar', 'Foo', 'Dog'],
    "KeyZ": ['Foo', 'Bar']
}

My Current Thoughts

Invert individual items in the sets as keys and then do a lookup.

{
     'Aron'         : ['KeyA'],
     'Foo'          : ['KeyB', 'KeyZ'],
     'Bar'          : ['KeyB', 'KeyZ'],
     'Random Value' : ['KeyA', 'KeyZ']
}

I'd initialise the inverted dictionary by going through every item in every set. Approx time to create such a dictionary is O(n). Look for an item in the list in the inverted dictionary so created. Say the value Bar. Create a new dictionary using the information 'Bar': ['KeyB', 'KeyZ']. Resultant dictionary would be {'KeyB': ['Bar'], 'KeyZ': ['Bar']}. For the next item I'd have to do some bookkeeping on the existing dictionary like key exists or not, if yes then append to the existing list and so on.

Use the in operator on the set mapped (check for membership) to every key

The master dictionary and input list are going to be pretty small most of the times. (less than 500 unique items in all the sets together). So I could check for membership in the set returned by each key and create a dictionary. This is obviously less efficient but works for most cases.

I have a few more manipulations that are similar to the example given above. I don't want to do manual bookkeeping for all of them because they are error prone and slower than inbuilt functions.

What I need?

  • Better approaches (faster algorithm)
  • Inbuilt functions in itertools because these are faster
  • 3rd Party Library
  • Some esoteric comprehensions that a normal Python user wouldn't think of?
like image 959
Abhirath Mahipal Avatar asked Dec 27 '17 06:12

Abhirath Mahipal


Video Answer


3 Answers

How about you convert the list to a set before you start converting? Set-lookups are faster than linear search in lists.

input_set = set(input)

Once you have it, you can use regular dict-comprehension, in my opinion:

output = {key: [x for x in value if x in input_set] for key, value in master_dict.items()}

Result:

output == {'KeyB': ['Foo', 'Bar', 'Dog'], 'KeyA': ['Aron'], 'KeyZ': ['Foo', 'Bar']}
like image 196
UltraInstinct Avatar answered Sep 22 '22 08:09

UltraInstinct


one way is using intersection in python as follows:

x={
    "KeyA": ['Aron', 'Ranom Value', 'Abhishek'],
    "KeyB": ['Ball', 'Foo', 'Bar', 'Badge', 'Dog'],
    "KeyZ": ['Random Value', 'Foo', 'Bar']
   }
items = ['Foo', 'Bar', 'Dog', 'Aron']

{k:  set(items).intersection(set(v)) for k, v in x.items()}
like image 25
Amir Naimi Avatar answered Sep 26 '22 08:09

Amir Naimi


How about with defaultdict and list comprehension.

from collections import defaultdict

result = defaultdict(list)

d = {
    "KeyA": ['Aron', 'Ranom Value', 'Abhishek'],
    "KeyB": ['Ball', 'Foo', 'Bar', 'Badge', 'Dog'],
    "KeyZ": ['Random Value', 'Foo', 'Bar']
   }
items = ['Foo', 'Bar', 'Dog', 'Aron']

[result[k].append(e) for k,v in d.items() for e in v if e in items]

print(result) # defaultdict(<type 'list'>, {'KeyB': ['Foo', 'Bar', 'Dog'], 'KeyA': ['Aron'], 'KeyZ': ['Foo', 'Bar']})

print(dict(result)) # {'KeyB': ['Foo', 'Bar', 'Dog'], 'KeyA': ['Aron'], 'KeyZ': ['Foo', 'Bar']}
like image 40
Vivek Harikrishnan Avatar answered Sep 23 '22 08:09

Vivek Harikrishnan