Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A faster way of doing multiple string replacements

Tags:

I need to do the following:

    static string[] pats = { "å", "Å", "æ", "Æ", "ä", "Ä", "ö", "Ö", "ø", "Ø" ,"è", "È", "à", "À", "ì", "Ì", "õ", "Õ", "ï", "Ï" };     static string[] repl = { "a", "A", "a", "A", "a", "A", "o", "O", "o", "O", "e", "E", "a", "A", "i", "I", "o", "O", "i", "I" };     static int i = pats.Length;     int j;       // function for the replacement(s)      public string DoRepl(string Inp) {       string tmp = Inp;         for( j = 0; j < i; j++ ) {             tmp = Regex.Replace(tmp,pats[j],repl[j]);         }         return tmp.ToString();                 }     /* Main flow processes about 45000 lines of input */ 

Each line has 6 elements that go through DoRepl. Approximately 300,000 function calls. Each does 20 Regex.Replace, totalling ~6 million replaces.

Is there any more elegant way to do this in fewer passes?

like image 406
cairnz Avatar asked Nov 11 '10 14:11

cairnz


People also ask

How do I replace multiple items in a string?

Replace multiple Substrings in a String using replace() It replaces all the occurrences of a sub-string to a new string by passing the old and new strings as parameters to the replace() function. Multiple calls to replace() function are required to replace multiple substrings in a string.

Which method can be used to replace parts of a string?

Python String replace() Method The replace() method replaces a specified phrase with another specified phrase.

How do you replace multiple strings in Python?

Replace multiple different substrings There is no method to replace multiple different strings with different ones, but you can apply replace() repeatedly. It just calls replace() in order, so if the first new contains the following old , the first new is also replaced.

How do you replace multiple occurrences of a string in Java?

You can replace all occurrence of a single character, or a substring of a given String in Java using the replaceAll() method of java. lang. String class. This method also allows you to specify the target substring using the regular expression, which means you can use this to remove all white space from String.


2 Answers

static Dictionary<char, char> repl = new Dictionary<char, char>() { { 'å', 'a' }, { 'ø', 'o' } }; // etc... public string DoRepl(string Inp) {     var tmp = Inp.Select(c =>     {         char r;         if (repl.TryGetValue(c, out r))             return r;         return c;     });     return new string(tmp.ToArray()); } 

Each char is checked only once against a dictionary and replaced if found in the dictionary.

like image 148
Jesper Larsen-Ledet Avatar answered Oct 21 '22 16:10

Jesper Larsen-Ledet


How about this "trick"?

string conv = Encoding.ASCII.GetString(Encoding.GetEncoding("Cyrillic").GetBytes(input)); 
like image 27
Jonas Elfström Avatar answered Oct 21 '22 16:10

Jonas Elfström