Assign a number to each unique value in a list

Tags: python, list

I have a list of strings. I want to assign a unique number to each string (the exact number is not important), and create a list of the same length using these numbers, in order. Below is my best attempt at it, but I am not happy for two reasons:

  1. It assumes that the same values are next to each other

  2. I had to start the list with a 0, otherwise the output would be incorrect

My code:

names = ['ll', 'll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'LL', 'HL', 'HL', 'HL']
numbers = [0]
num = 0
for item in range(len(names)):
    if item == len(names) - 1:
        break
    elif names[item] == names[item+1]:
        numbers.append(num)
    else:
        num = num + 1
        numbers.append(num)
print(numbers)

I want to make the code more generic, so it will work with an unknown list. Any ideas?

asked Feb 20 '17 at 16:02 by millsy


1 Answer

Without using an external library (see the EDIT for a Pandas solution), you can do it as follows:

d = {ni: indi for indi, ni in enumerate(set(names))}
numbers = [d[ni] for ni in names]

Brief explanation:

The first line assigns a number to each unique element of your list and stores the mapping in the dictionary d, which is built with a dictionary comprehension; set(names) returns the unique elements of names.

The second line uses a list comprehension to look up each name in d and stores the resulting numbers in the list numbers.
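
Put together, the two lines can be run as a small self-contained sketch. The sample list is the one from the question; the assertions at the end are only an illustration added here to show what the mapping guarantees, not part of the original answer.

```python
names = ['ll', 'll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'LL', 'HL', 'HL', 'HL']

# Map each unique name to an integer. The iteration order of a set is
# arbitrary, so which name gets which number can change between runs.
d = {ni: indi for indi, ni in enumerate(set(names))}

# Look up the number for every name, keeping the original order of names.
numbers = [d[ni] for ni in names]

# Same length as the input, and exactly one number per distinct name.
assert len(numbers) == len(names)
assert len(set(numbers)) == len(set(names))
print(numbers)
```

Only the grouping is stable here: equal strings always share a number, but the concrete numbers are not guaranteed.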

One example to illustrate that it also works fine for unsorted lists:

# 'll' appears all over the place
names = ['ll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'll', 'LL', 'HL', 'HL', 'HL', 'll']

This is the output for numbers (the exact numbers can differ between runs, because the iteration order of a set is arbitrary):

[1, 1, 3, 3, 3, 2, 2, 1, 2, 0, 0, 0, 1]

As you can see, the number 1 associated with ll appears in the correct places.
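
If reproducible numbers matter, one possible variant (an alternative sketch, not what the code above does) is to number the strings in order of first appearance instead of relying on set order. The dictionary name first_seen is made up for this illustration.

```python
names = ['ll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'll', 'LL', 'HL', 'HL', 'HL', 'll']

first_seen = {}  # hypothetical name: maps each string to its number
for name in names:
    # setdefault stores a value only the first time a key is seen.
    # len(first_seen) is evaluated before the insertion, so new names
    # receive 0, 1, 2, ... in order of first appearance.
    first_seen.setdefault(name, len(first_seen))

numbers = [first_seen[name] for name in names]
print(numbers)  # [0, 0, 1, 1, 1, 2, 2, 0, 2, 3, 3, 3, 0]
```

This still makes a single pass over the list plus one lookup per element, and the result is the same on every run.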

EDIT

If you have Pandas available, you can also use pandas.factorize (which seems to be quite efficient for huge lists and also works fine for lists of tuples as explained here):

import pandas as pd

pd.factorize(names)

will then return

(array([0, 0, 1, 1, 1, 2, 2, 0, 2, 3, 3, 3, 0]),
 array(['ll', 'hl', 'LL', 'HL'], dtype=object))

The first element of that tuple holds the numbers (as a NumPy array). Therefore,

numbers = pd.factorize(names)[0]

answered Sep 17 '22 at 13:09 by Cleb