Replacing string with id using dictionary in python

Question

I have a dictionary file that contains a word in each line.

titles-sorted.txt

 a&a    
 a&b    
 a&c_bus    
 a&e    
 a&f    
 a&m    
 ....

For each word, its line number is the word's id.

Then I have another file in which each line contains a set of words separated by tabs.

a.txt

 a_15   a_15_highway_(sri_lanka)    a_15_motorway   a_15_motorway_(germany) a_15_road_(sri_lanka)

I'd like to replace each word with its id if the word exists in the dictionary, so that the output looks like:

    3454    2345    123   5436     322 ....

So I wrote this Python code to do it:

 f = open("titles-sorted.txt")
 lines = f.readlines()
 titlemap = {}
 nr = 1
 for l in lines:
     l = l.replace("\n", "")
     titlemap[l.lower()] = nr
     nr+=1

 fw = open("a.index", "w")
 f = open("a.txt")
 lines = f.readlines()
 for l in lines:
     tokens = l.split("\t")
     if tokens[0] in titlemap.keys():
            fw.write(str(titlemap[tokens[0]]) + "\t")
            for t in tokens[1:]:
                    if t in titlemap.keys():
                            fw.write(str(titlemap[t]) + "\t")
            fw.write("\n")

 fw.close()
 f.close()

But this code is ridiculously slow, which makes me suspect that I have done something wrong.

Is this an efficient way to do this?

njzk2 · Accepted Answer

The write loop makes a lot of calls to write, which is usually inefficient. You can probably speed things up by writing only once per line (or once per file, if the file is small enough):

tokens = l.split("\t")
# Build the whole output line first, then write it with a single call
fw.write('\t'.join(str(titlemap[t]) for t in tokens if t in titlemap))
fw.write("\n")

or even:

lines = []
for l in f:
    # Collect one tab-joined string of ids per input line
    lines.append('\t'.join(str(titlemap[t]) for t in l.split('\t') if t in titlemap))
# Write the whole output in one call
fw.write('\n'.join(lines))
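
A side note on the snippets above: they test `t in titlemap` rather than `t in titlemap.keys()` as in the question. If the code runs under Python 2, `keys()` builds a new list on every call and `in` then scans that list linearly, so this change alone may account for much of the slowdown. In Python 3, `keys()` returns a view and the difference mostly disappears. The sketch below illustrates the gap with a made-up dictionary; the word names, the dictionary size, and the helper function names are assumptions for illustration, not part of the original data.

```python
import timeit

# Hypothetical stand-in for the real title dictionary: 100,000 made-up words.
titlemap = {"word_%d" % i: str(i + 1) for i in range(100000)}

# In Python 2, titlemap.keys() returns a list like this one.
keys_as_list = list(titlemap)


def lookup_in_list(token):
    # Membership test on a list: scans the elements one by one.
    return token in keys_as_list


def lookup_in_dict(token):
    # Membership test on the dict itself: a single hash lookup.
    return token in titlemap


if __name__ == "__main__":
    probe = "word_99999"
    # Print the time taken by 100 lookups of each kind.
    print(timeit.timeit(lambda: lookup_in_list(probe), number=100))
    print(timeit.timeit(lambda: lookup_in_dict(probe), number=100))
```

Both functions return the same answer for any token; only the cost per lookup differs.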

Also, if your tokens are used more than once, you can save time by converting the ids to strings once, when you read them:

titlemap = {l.strip().lower(): str(index) for index, l in enumerate(f, start=1)}
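
Putting these suggestions together, a minimal end-to-end sketch might look like the following. The function names (`build_titlemap`, `index_lines`, `convert`) are hypothetical. Two choices here are assumptions rather than something taken from the question: the trailing newline is stripped before splitting so the last token on a line can match, and every line is written even when its first token is not in the dictionary.

```python
def build_titlemap(title_lines):
    """Map each lowercased title to its 1-based line number, stored as a string."""
    return {line.strip().lower(): str(index)
            for index, line in enumerate(title_lines, start=1)}


def index_lines(lines, titlemap):
    """Yield one tab-separated string of ids per input line.

    Tokens that are not in titlemap are skipped.
    """
    for line in lines:
        # Assumption: drop the trailing newline so the last token can match.
        tokens = line.rstrip("\n").split("\t")
        yield "\t".join(titlemap[t] for t in tokens if t in titlemap)


def convert(titles_path, input_path, output_path):
    """Read the dictionary and the input file, then write the ids in one call."""
    with open(titles_path) as f:
        titlemap = build_titlemap(f)
    with open(input_path) as f, open(output_path, "w") as fw:
        fw.write("\n".join(index_lines(f, titlemap)))
        fw.write("\n")
```

With the file names from the question, this would be called as `convert("titles-sorted.txt", "a.txt", "a.index")`. For very large inputs, writing each joined line as it is produced may be preferable to joining the whole file in memory.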

Tags:

python

dictionary

pandagrammer

1 Answers

njzk2
