I have a lot (>100,000) lowercase strings in a list, where a subset might look like this:
str_list = ["hello i am from denmark", "that was in the united states", "nothing here"]
I further have a dict like this (in reality this is going to have a length of around ~1000):
dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}
For all strings in the list which contain any of the dict's keys, I want to replace the entire string with the corresponding dict value. The expected result should thus be:
str_list = ["dk", "us", "nothing here"]
What is the most efficient way to do this given the number of strings I have and the length of the dict?
Extra info: There is never more than one dict key in a string.
This seems to be a good way:
input_strings = ["hello i am from denmark",
"that was in the united states",
"nothing here"]
dict_x = {"denmark" : "dk", "germany" : "ger", "norway" : "no", "united states" : "us"}
output_strings = []
for string in input_strings:
for key, value in dict_x.items():
if key in string:
output_strings.append(value)
break
else:
output_strings.append(string)
print(output_strings)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With