
Convert list of objects to a list of integers and a lookup table

Tags: python, lookup

To illustrate what I mean by this, here is an example:

messages = [
  ('Ricky',  'Steve',  'SMS'),
  ('Steve',  'Karl',   'SMS'),
  ('Karl',   'Nora',   'Email')
]

I want to convert this list, together with a definition of groups, into a list of integers and a lookup dictionary, so that each element within a group gets a unique id. That id should map back to the element in the lookup table, like this:

messages_int, lookup_table = create_lookup_list(
              messages, ('person', 'person', 'medium'))

print messages_int
[ (0, 1, 0),
  (1, 2, 0),
  (2, 3, 1) ]

print lookup_table
{ 'person': ['Ricky', 'Steve', 'Karl', 'Nora'],
  'medium': ['SMS', 'Email']
}
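To make the intended mapping concrete, the way back from ids to the original values could look like the sketch below. The helper name decode_messages is hypothetical (it is not part of the function I am asking for), and the literal values are copied from the expected output above.

```python
def decode_messages(messages_int, lookup_table, groups):
    """Map each integer id back to its original value via the lookup table."""
    return [
        tuple(lookup_table[group][idx] for idx, group in zip(row, groups))
        for row in messages_int
    ]

# Values copied from the expected output of create_lookup_list.
messages_int = [(0, 1, 0), (1, 2, 0), (2, 3, 1)]
lookup_table = {
    'person': ['Ricky', 'Steve', 'Karl', 'Nora'],
    'medium': ['SMS', 'Email'],
}

decoded = decode_messages(messages_int, lookup_table,
                          ('person', 'person', 'medium'))
print(decoded)
```

Decoding should give back the original messages list, which is a handy check for any implementation.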

I wonder if there is an elegant and Pythonic solution to this problem.

I am also open to better terminology than create_lookup_list, etc.

Otto Allmendinger asked Jan 24 '23 05:01



2 Answers

defaultdict combined with the itertools.count().next method is a good way to assign identifiers to unique items. Here's an example of how to apply this in your case:

from itertools import count
from collections import defaultdict

def create_lookup_list(data, domains):
    domain_keys = defaultdict(lambda:defaultdict(count().next))
    out = []
    for row in data:
        out.append(tuple(domain_keys[dom][val] for val, dom in zip(row, domains)))
    lookup_table = dict((k, sorted(d, key=d.get)) for k, d in domain_keys.items())
    return out, lookup_table

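Edit: note that count().next becomes count().__next__ in Python 3. Writing lambda: next(count()) instead would not work, because it builds a fresh counter on every call and so always returns 0; the counter has to be created once and then reused, e.g. c = count() followed by lambda: next(c).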
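For reference, here is a sketch of the same approach written for Python 3. It assumes the messages list from the question; apart from count().__next__ and the comprehensions, it follows the function above.

```python
from collections import defaultdict
from itertools import count

def create_lookup_list(data, domains):
    # The outer factory runs once per new domain, so every domain gets its
    # own counter and its ids start at 0.
    domain_keys = defaultdict(lambda: defaultdict(count().__next__))
    out = [
        tuple(domain_keys[dom][val] for val, dom in zip(row, domains))
        for row in data
    ]
    # List each domain's values in the order of their assigned ids.
    lookup_table = {k: sorted(d, key=d.get) for k, d in domain_keys.items()}
    return out, lookup_table

messages = [
    ('Ricky', 'Steve', 'SMS'),
    ('Steve', 'Karl', 'SMS'),
    ('Karl', 'Nora', 'Email'),
]
messages_int, lookup_table = create_lookup_list(
    messages, ('person', 'person', 'medium'))
print(messages_int)
print(lookup_table)
```

Because ids are handed out on first access, the values in each lookup list appear in order of first appearance in the data.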

Ants Aasma answered Jan 30 '23 22:01



Mine's about the same length and complexity:

import collections

def create_lookup_list(messages, labels):

    # Collect all the values
    lookup = collections.defaultdict(set)
    for msg in messages:
        for l, v in zip(labels, msg):
            lookup[l].add(v)

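    # Turn each value set into a list so values can be found by index.
    # Set order is arbitrary, so ids need not follow order of first appearance.
    for k, v in lookup.items():
        lookup[k] = list(v)

    # Build the list of id tuples, one tuple per message
    lookup_list = []
    for msg in messages:
        lookup_list.append(tuple(lookup[l].index(v) for l, v in zip(labels, msg)))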

    return lookup_list, lookup
Ned Batchelder answered Jan 30 '23 22:01
