Dictionary to map strings in list to numbers in Python

Tags: python, list

I have a list of strings, some of which are repeated, e.g. (not the actual list):

["hello", "goodbye", "hi", "how are you", "hi"]

I want to create a list of integers where each integer corresponds to a string, e.g. for the example above:

[0, 1, 2, 3, 2]

where 0 = "hello", 1 = "goodbye", etc.

I looked at the example here: Convert a list of integer to a list of predefined strings in Python

I want to do basically the same thing but the other way around, strings to integers. That part shouldn't be too hard.

However, they seem to just create the dictionary in their code like this:

trans = {0: 'abc', 1: 'f', 2: 'z'}

Creating the dictionary yourself is fine when you know the exact contents of your list. My list of strings is extremely long, and I don't know what the strings are because they come from input. So I'd need to build the dictionary from my list of strings some other way, maybe with a for loop.

I can't figure out how to make a dictionary that will map the strings in my list to numbers. I looked up how to make a dictionary with list comprehensions but I couldn't figure out how it deals with duplicates.

In other words, I'd like to know how to go through a list like my list of strings above and create a dictionary like:

{"hello": 0, "goodbye": 1, "hi": 2, "how are you": 3}

EDIT: I've had a lot of answers, thanks everyone for all your help. What I am now confused about is all the different ways of doing this. There have been a lot of suggestions, using enumerate(), set() and other functions. There was also one answer (@ChristianIacobs) that did it very simply with just a for loop. What I am wondering is whether there is any reason to use one of the slightly less simple answers? For instance, are they faster, or are there some situations where they are the only way that works?

IceWarrior42 asked Mar 03 '23 21:03


1 Answer

To create a dictionary from your list, you first need to get rid of duplicate values. A set does that, though it does not keep the strings in their original order:

my_list = ["hello", "goodbye", "hi", "how are you", "hi"]
unique_list = list(set(my_list))

['hi', 'hello', 'goodbye', 'how are you']
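If the numbers should follow the order of first appearance, as in the [0, 1, 2, 3, 2] example from the question, a set will not guarantee that. One possible alternative is dict.fromkeys, sketched here under the assumption of Python 3.7 or later, where dictionaries preserve insertion order. The name unique_in_order is only illustrative.

```python
my_list = ["hello", "goodbye", "hi", "how are you", "hi"]

# dict.fromkeys keeps the first occurrence of each key, in insertion order
# (guaranteed for dicts from Python 3.7 onward), so duplicates drop out
# without reshuffling the remaining strings.
unique_in_order = list(dict.fromkeys(my_list))

print(unique_in_order)  # ['hello', 'goodbye', 'hi', 'how are you']
```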

Now you can create your dictionary by zipping the unique_list with a range of numbers:

my_dict = dict(zip(unique_list, range(len(unique_list))))

{'hi': 0, 'hello': 1, 'goodbye': 2, 'how are you': 3}
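The question also asks for the list of integers itself, and for numbers assigned in order of first appearance. A plain for loop can do both in a single pass. The following is a minimal sketch rather than a definitive implementation; the function name encode_strings and the variable names are hypothetical, chosen only for this example.

```python
def encode_strings(strings):
    """Return (mapping, codes): a dict from each distinct string to an
    integer, and the input translated into those integers."""
    mapping = {}
    codes = []
    for s in strings:
        # A string seen for the first time gets the next free integer.
        if s not in mapping:
            mapping[s] = len(mapping)
        codes.append(mapping[s])
    return mapping, codes


my_list = ["hello", "goodbye", "hi", "how are you", "hi"]
mapping, codes = encode_strings(my_list)

print(mapping)  # {'hello': 0, 'goodbye': 1, 'hi': 2, 'how are you': 3}
print(codes)    # [0, 1, 2, 3, 2]
```

The loop does not depend on set ordering, so the same input always yields the same numbering. With the set-based dictionary above, the integer list can be obtained with [my_dict[s] for s in my_list], but the specific numbers may differ between runs.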
Peter answered Mar 12 '23 11:03