
Map elements to list of unique indexes

Suppose I have a list of elements:

my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']

and I want to convert this list to a list of indexes of unique elements.

So CatA is assigned to index 0, CatB to index 1 and CatC to index 2.

My desired result would be:

result = [0, 1, 2, 0, 0, 2]

Currently I'm doing this by creating a dictionary that assigns each element its unique id, and then using a list comprehension to create the final list of indexes:

import numpy as np

unique_classes = np.unique(my_list)
conversion_dict = dict(zip(unique_classes, range(len(unique_classes))))
result = [conversion_dict[i] for i in my_list]

My question is: Is there an easier, more straightforward way of doing this?

The real list of categories will be large, so the solution needs to be efficient, but I would like to avoid manually creating the unique list, the dictionary and the list comprehension.


2 Answers

As suggested by @mikey, you can use np.unique with return_inverse=True, as below:

import numpy as np

my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']

res = np.unique(my_list, return_inverse=True)[1]

Result:

[0 1 2 0 0 2]
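One thing worth checking before relying on this: np.unique returns the unique values sorted, so the ids follow sorted order rather than order of first appearance. A small sketch of that, where the names items, labels and codes are only illustrative:

```python
import numpy as np

# 'CatC' appears first here, but it still gets id 1 because
# np.unique sorts the unique values before numbering them.
items = ['CatC', 'CatA', 'CatC']

# labels: sorted unique values; codes: index into labels for each item
labels, codes = np.unique(items, return_inverse=True)

print(labels)         # sorted unique values
print(codes)          # ids in sorted-label order
print(labels[codes])  # indexing labels with codes rebuilds the input
```

Keeping labels around is handy if the ids need to be mapped back to category names later.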
jpp answered Sep 07 '25 21:09


This will do the trick:

my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']
first_occurances = dict()
result = []

for i, v in enumerate(my_list):
    try:
        index = first_occurances[v]
    except KeyError:
        index = i
        first_occurances[v] = i
    result.append(index)

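Complexity is O(n), since each dict lookup and insertion takes O(1) on average.

Basically, the dict stores the index of the first occurrence of each value. If first_occurances does not contain the value v yet, the current index i is saved for it. Note that these are positions in the list, so they are consecutive (0, 1, 2, ...) only when every new value shows up before any repeat, as in the example; for ['a', 'a', 'b'] the result is [0, 0, 2].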
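To get consecutive ids (0, 1, 2, ...) in order of first appearance for any input, one possible variant (a sketch, not part of the original answer) uses the current size of the dict as the id of each newly seen value. The names ids and encoded are made up for illustration:

```python
my_list = ['CatA', 'CatB', 'CatC', 'CatA', 'CatA', 'CatC']

ids = {}  # maps each value to the id it was given when first seen

# setdefault returns the stored id if v is already known; otherwise it
# stores len(ids), i.e. the next unused id, and returns that.
encoded = [ids.setdefault(v, len(ids)) for v in my_list]

print(encoded)  # [0, 1, 2, 0, 0, 2]

# Ids stay consecutive even when a repeat comes before a new value
other_ids = {}
other = [other_ids.setdefault(v, len(other_ids)) for v in ['a', 'a', 'b']]
print(other)  # [0, 0, 1]
```

This is still a single O(n) pass and needs no sorting, unlike the np.unique approach.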

vishes_shell answered Sep 07 '25 21:09