Replacing Python list elements with a key

Tags: python, list, key

I have a list of non-unique strings:

list = ["a", "b", "c", "a", "a", "d", "b"]

I would like to replace each element with an integer key which uniquely identifies each string:

list = [0, 1, 2, 0, 0, 3, 1]

The number does not matter, as long as it is a unique identifier.

So far all I can think to do is copy the list to a set, and use the index of the set to reference the list. I'm sure there's a better way though.

Rachie asked Nov 29 '22 23:11


2 Answers

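This guarantees that each distinct string gets a unique ID and that the IDs are contiguous, starting from 0. Because a set is unordered, the IDs will not necessarily follow the order of first appearance: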

id_s = {c: i for i, c in enumerate(set(list))}
li = [id_s[c] for c in list]

On a different note, you should not use list as a variable name, because it shadows the built-in list type.
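As a minimal sketch of the same idea without shadowing the built-in, and assuming Python 3.7+ (where dicts preserve insertion order), dict.fromkeys can stand in for set so that the IDs follow the order of first appearance. The names items, ids and encoded are illustrative and do not come from the question.

```python
items = ["a", "b", "c", "a", "a", "d", "b"]

# dict.fromkeys keeps one entry per distinct string, in first-seen order
ids = {s: i for i, s in enumerate(dict.fromkeys(items))}

# Look up each original element's ID
encoded = [ids[s] for s in items]
print(encoded)  # [0, 1, 2, 0, 0, 3, 1]
```

With set instead of dict.fromkeys the IDs are still unique and contiguous, but which string gets which number can vary between runs.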

user2390182 answered Dec 07 '22 23:12


Here's a single-pass solution with defaultdict:

from collections import defaultdict
seen = defaultdict()
seen.default_factory = lambda: len(seen)  # you could instead bind to seen.__len__

In [11]: [seen[c] for c in list]
Out[11]: [0, 1, 2, 0, 0, 3, 1]

It's kind of a trick but worth mentioning!


An alternative, suggested by @user2357112 in a related question/answer, is to increment with itertools.count. This lets you set everything up in the constructor:

from collections import defaultdict
from itertools import count
seen = defaultdict(count().__next__)  # use .next in Python 2

This may be preferable because the default_factory no longer has to look up seen in the global scope each time it is called.
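For completeness, here is a hedged, self-contained sketch of the itertools.count variant wrapped in a function, so the mapping is local and can be returned alongside the encoded list. The function name encode and the variable names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import count


def encode(items):
    """Return (codes, mapping) with IDs assigned in first-seen order."""
    # Each unseen key calls count().__next__, which yields 0, 1, 2, ...
    seen = defaultdict(count().__next__)
    codes = [seen[s] for s in items]
    # Convert to a plain dict so later lookups cannot create new keys
    return codes, dict(seen)


codes, mapping = encode(["a", "b", "c", "a", "a", "d", "b"])
print(codes)    # [0, 1, 2, 0, 0, 3, 1]
print(mapping)  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

Returning a plain dict is a design choice: with the defaultdict itself, merely looking up an unknown string later would silently assign it a new ID.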

Andy Hayden answered Dec 08 '22 00:12