
How to replace words in a string using a dictionary mapping

I have the following sentence

a = "you don't need a dog"

and a dictionary

dict =  {"don't": "do not" }

But I can't get the dictionary to map the words in the sentence with the code below:

''.join(str(dict.get(word, word)) for word in a)

Output:

"you don't need a dog"

What am I doing wrong?

A.Papa asked Apr 01 '18 17:04


2 Answers

Here is one way.

a = "you don't need a dog"

d =  {"don't": "do not" }

res = ' '.join([d.get(i, i) for i in a.split()])

# 'you do not need a dog'

Explanation

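  • Your code iterates over a, and iterating over a string yields single characters, so "don't" is never looked up as a whole word.
  • Avoid naming a variable after a built-in class, since that shadows it, e.g. use d instead of dict.
  • Use str.split to split by whitespace, and join with ' ' rather than '' so the words stay separated.
  • There is no need to wrap str around values which are already strings.
  • str.join is marginally faster with a list comprehension than with a generator expression.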
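
To make the first points concrete, here is a minimal sketch comparing the two iterations. The names chars, per_char and per_word are only for illustration and do not appear in the question.

```python
a = "you don't need a dog"
d = {"don't": "do not"}

# Iterating over a str yields one character at a time.
chars = [ch for ch in a]

# No single character is a key in d, so every lookup falls back to the
# character itself and the sentence comes back unchanged.
per_char = ''.join(d.get(ch, ch) for ch in a)

# Splitting first yields whole words, so "don't" is found in d.
per_word = ' '.join([d.get(w, w) for w in a.split()])

print(chars[:5])
print(per_char)
print(per_word)
```

Keep in mind that str.split leaves punctuation attached to the word, so a token such as "don't," would not match the key "don't".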
jpp answered Sep 27 '22 22:09


All answers are correct, but if your sentence is quite long and the mapping dictionary rather small, consider iterating over the items (key-value pairs) of the dictionary and applying str.replace to the original sentence.

First, the split-and-join code as suggested by the others. It takes 6.35 µs per loop.

%%timeit

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = ' '.join([mapping.get(i, i) for i in search.split()])

Let's try using str.replace instead. It takes 633 ns per loop.

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

for key, value in mapping.items():
    search = search.replace(key, value)

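And here is the same idea written as a Python 3 list comprehension. It takes 1.09 µs per loop, so in these measurements the plain loop above (633 ns) remains the fastest. Note that the comprehension computes every replacement from the original string and keeps only the first result, so it matches the loop only when the mapping has a single entry.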

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = [search.replace(key, value) for key, value in mapping.items()][0]

Do you see the difference? For a short sentence like yours, the first and the third versions run at about the same speed. But the longer the sentence (search string) gets, the more pronounced the difference in performance becomes.
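
One caveat about the list-comprehension version is worth spelling out: each element is built from the original string, so taking element [0] keeps only the first replacement. A small sketch, using a second dictionary entry and a longer sentence that I made up for illustration (and assuming Python 3.7+, where dicts keep insertion order):

```python
search = "you don't need a dog, and I can't have one"
mapping = {"don't": "do not", "can't": "cannot"}

# Comprehension variant: every element starts again from the original
# string, and [0] keeps only the result for the first key.
from_comprehension = [search.replace(k, v) for k, v in mapping.items()][0]

# Loop variant: each replacement is applied to the result of the previous one.
from_loop = search
for k, v in mapping.items():
    from_loop = from_loop.replace(k, v)

print(from_comprehension)
print(from_loop)
```

So with more than one key, the loop is the version to use.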

Result string is:

'you do not need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?'
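
If you want to repeat the comparison outside IPython, the standard timeit module can be used in a similar way. This is only a rough sketch: the function names split_join and replace_loop are my own, and the numbers you get will depend on your machine and Python version.

```python
import timeit

text = ("you don't need a dog. but if you like dogs, you should think of "
        "getting one for your own. Or a cat?")
mapping = {"don't": "do not"}


def split_join():
    # Look up each whitespace-separated word in the mapping.
    return ' '.join([mapping.get(w, w) for w in text.split()])


def replace_loop():
    # Apply str.replace once per dictionary entry.
    result = text
    for key, value in mapping.items():
        result = result.replace(key, value)
    return result


for fn in (split_join, replace_loop):
    seconds = timeit.timeit(fn, number=10_000)
    print(f"{fn.__name__}: {seconds / 10_000 * 1e6:.2f} µs per call")
```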

Remark: str.replace also replaces occurrences inside longer words, so you need to make sure that only full words are replaced. str.replace itself has no option for that (its only optional argument is a maximum count). Another idea is to use regular expressions, as explained in this posting, since they can also handle lower and upper case. Padding the keys in your lookup dictionary with spaces won't work, since you would miss occurrences at the beginning or at the end of a sentence.
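
A possible sketch of the regular-expression idea, using \b word boundaries from the standard re module. The helper name replace_whole_words and its ignore_case parameter are hypothetical, and the sketch assumes that the keys start and end with word characters and, when ignore_case is set, that they are stored in lower case.

```python
import re


def replace_whole_words(text, mapping, ignore_case=False):
    """Replace dictionary keys in text, matching whole words only."""
    if not mapping:
        return text
    # Try longer keys first so that they win over shorter overlapping keys.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE if ignore_case else 0,
    )

    def lookup(match):
        found = match.group(0)
        # With ignore_case, keys are assumed to be lower case.
        return mapping[found.lower() if ignore_case else found]

    return pattern.sub(lookup, text)


print(replace_whole_words("a dog in the dogma", {"dog": "cat"}))
print(replace_whole_words("Don't go", {"don't": "do not"}, ignore_case=True))
```

Note that the replacement text is inserted as written in the dictionary, so the capital letter of "Don't" is lost in the second call.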

Matthias answered Sep 27 '22 23:09