Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is str.replace(..).replace(..) ad nauseam a standard idiom in Python?

For instance, say I wanted a function to escape a string for use in HTML (as in Django's escape filter):

    def escape(string):         """         Returns the given string with ampersands, quotes and angle          brackets encoded.         """         return string.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;').replace("'", '&#39;').replace('"', '&quot;') 

This works, but it gets ugly quickly and appears to have poor algorithmic performance (in this example, the string is repeatedly traversed 5 times). What would be better is something like this:

    def escape(string):         """         Returns the given string with ampersands, quotes and angle          brackets encoded.         """         # Note that ampersands must be escaped first; the rest can be escaped in          # any order.         return replace_multi(string.replace('&', '&amp;'),                              {'<': '&lt;', '>': '&gt;',                                "'": '&#39;', '"': '&quot;'}) 

Does such a function exist, or is the standard Python idiom to use what I wrote before?

like image 637
Michael Avatar asked Mar 20 '10 18:03

Michael


People also ask

What does str replace do in Python?

The replace() method returns a copy of the string where the old substring is replaced with the new substring.

Is the a replace string method in Python?

What is the string replace() method in Python? The replace() method is a built-in functionality offered in Python programming. It replaces all the occurrences of the old substring with the new substring. Replace() returns a new string in which old substring is replaced with the new substring.

What to use instead of replace in Python?

Replace by regex: re.sub() , specify a regex pattern in the first argument, a new string in the second, and a string to be processed in the third. As with replace() , you can specify the maximum count of replacements in the fourth parameter, count .

What does .replace mean Python?

The replace() method replaces a specified phrase with another specified phrase. Note: All occurrences of the specified phrase will be replaced, if nothing else is specified.


1 Answers

Do you have an application that is running too slow and you profiled it to find that a line like this snippet is causing it to be slow? Bottlenecks occur at unexpected places.

The current snippet traverses the string 5 times, doing one thing each time. You are suggesting traversing it once, probably doing doing five things each time (or at least doing something each time). It isn't clear that this will automatically do a better job to me. Currently the algorithm used is O(n*m) (assuming the length of the string is longer than the stuff in the rules), where n is the length of the string and m is the number of substitution rules. You could, I think, reduce the algorithmic complexity to something like O(n*log(m)) and in the specific case we're in—where the original things are all only one character (but not in the case of multiple calls to replace in general)—O(n), but this doesn't matter since m is 5 but n is unbounded.

If m is held constant, then, the complexity of both solutions really goes to O(n). It is not clear to me that it is going to be a worthy task to try to turn five simple passes into one complex one, the actual time of which I cannot guess at the current moment. If there was something about it that could make it scale better, I would have thought it was much more worthwhile task.

Doing everything on one pass rather than consecutive passes also demands questions be answered about what to do about conflicting rules and how they are applied. The resolution to these questions is clear with a chain of replace.

like image 168
Mike Graham Avatar answered Oct 20 '22 09:10

Mike Graham