Fastest way to substitute a set of characters in a string

Question

I'm working with a string of bytes (anywhere between 10 KB and 3 MB) and I need to filter out approximately 16 distinct byte values, replacing each with another byte.

At the moment I have a function a bit like this:

BYTE_REPLACE = {
    52: 7,   # key: the byte I want to replace
    53: 12,  # value: the byte I want to replace it WITH
}

def filter(st):  # note: this name shadows the built-in filter()
    # One full pass over the string per entry in BYTE_REPLACE.
    for old, new in BYTE_REPLACE.items():
        st = st.replace(chr(old), chr(new))
    return st

(Byte list shortened for the sake of this question.)

Using map resulted in an execution time of ~0.33 s, while this version is about 10x faster at ~0.03 s (both measured on a HUGE string, larger than 1.5 MB compressed).

While any further performance gain would probably be negligible, is there a better way of doing this?

(I am aware that it would be far better to store the filtered string. This isn't an option, though. I'm fooling with a Minecraft Classic server's level format and have to filter out bytes that certain clients don't support.)

falsetru · Accepted Answer

Use str.translate:

Python 3.x

def subs(st):
    return st.translate(BYTE_REPLACE)

Example usage:

>>> subs('4567')
'\x07\x0c67'
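
If the data is a bytes object rather than a str (as a level file read in binary mode would be in Python 3), the dict form does not apply directly, because bytes.translate expects a 256-byte table. A minimal sketch, assuming the same BYTE_REPLACE mapping as in the question; the names TABLE and subs_bytes are illustrative and not from the original answer:

```python
# Mapping from the question: byte to replace -> replacement byte.
BYTE_REPLACE = {
    52: 7,
    53: 12,
}

# bytes.maketrans takes two equal-length bytes objects and returns
# a 256-byte lookup table suitable for bytes.translate.
TABLE = bytes.maketrans(bytes(BYTE_REPLACE.keys()), bytes(BYTE_REPLACE.values()))

def subs_bytes(data):
    """Return a copy of data with every mapped byte substituted in one pass."""
    return data.translate(TABLE)

print(subs_bytes(b'4567'))  # b'\x07\x0c67'
```

Building the table once at module level keeps the per-call cost down to the single translate pass.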

Python 2.x

Use str.translate with a table built by string.maketrans:

import string
k, v = zip(*BYTE_REPLACE.iteritems())
k, v = ''.join(map(chr, k)), ''.join(map(chr, v))
tbl = string.maketrans(k, v)
def subs(st):
    return st.translate(tbl)

Tim Peters · Answer

Look up the translate() method on strings. That allows you to do any number of 1-byte transformations in a single pass over the string. Use the string.maketrans() function (in Python 3, bytes.maketrans() or str.maketrans()) to build the translation table. If you usually have 16 pairs, this should run about 16 times faster than doing 1-byte replacements 16 times.
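
The single-pass claim can be checked with a rough timing sketch like the one below. It assumes Python 3 and bytes input; the 16-entry MAPPING and the helper names replace_loop and translate_once are made up for illustration, and the actual ratio will depend on the data and the interpreter.

```python
import os
import timeit

# Hypothetical 16-entry mapping: bytes 0..15 are replaced by bytes 100..115.
# Sources and targets do not overlap, so both approaches give the same result.
MAPPING = {i: 100 + i for i in range(16)}
TABLE = bytes.maketrans(bytes(MAPPING.keys()), bytes(MAPPING.values()))

def replace_loop(data):
    # One full pass over the data per mapping entry.
    for src, dst in MAPPING.items():
        data = data.replace(bytes([src]), bytes([dst]))
    return data

def translate_once(data):
    # A single pass using the precomputed 256-byte table.
    return data.translate(TABLE)

if __name__ == "__main__":
    sample = os.urandom(1_000_000)  # about 1 MB of random bytes
    assert replace_loop(sample) == translate_once(sample)
    for fn in (replace_loop, translate_once):
        seconds = timeit.timeit(lambda: fn(sample), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s for 20 runs")
```

The two approaches only agree when no replacement byte is itself a key in the mapping: the loop can re-replace a byte it has just written, while translate maps each input byte exactly once.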

Tags: performance, python, string, replace

MoJi
