Replacing multiple patterns in a block of data

I need to find the most efficient way of matching multiple regular expressions on a single block of text. To give an example of what I need, consider a block of text:

"Hello World what a beautiful day"

I want to replace "Hello" with "Bye" and "World" with "Universe". I can always do this in a loop, of course, using something like the String.replace functions available in various languages.

However, I could have a huge block of text with many string patterns that I need to match and replace.

I was wondering whether I can use regular expressions to do this efficiently, or whether I have to use a parser such as an LALR parser.

I need to do this in JavaScript, so if anyone knows tools that can get it done, it would be appreciated.

VikrantY asked Mar 23 '10 15:03


2 Answers

Edit

Six years after my original answer (below), I would solve this problem differently:

function mreplace (replacements, str) {
  let result = str;
  for (let [x, y] of replacements)
    result = result.replace(x, y);
  return result;
}

let input = 'Hello World what a beautiful day';

let output = mreplace ([
  [/Hello/, 'Bye'],
  [/World/, 'Universe']
], input);

console.log(output);
// "Bye Universe what a beautiful day"

This has a tremendous advantage over the previous answer, which required you to write each match twice. It also gives you individual control over each match. For example:

function mreplace (replacements, str) {
  let result = str;
  for (let [x, y] of replacements)
    result = result.replace(x, y);
  return result;
}

let input = 'Hello World what a beautiful day';

let output = mreplace ([
  // replace static strings (a string pattern replaces only the first occurrence)
  ['day', 'night'],
  // use regexp and flags where you want them: replace all vowels with nothing
  [/[aeiou]/g, ''],
  // use callbacks! replace the first capital letter with lowercase ($0 is the full match)
  [/[A-Z]/, $0 => $0.toLowerCase()]
], input);

console.log(output);
// "hll Wrld wht  btfl nght"
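
One thing to keep in mind with this approach: each replacement runs on the output of the previous one, so the order of the pairs matters. The sketch below illustrates this with a made-up example (the "cat"/"dog" input is only an assumption for illustration) in which a naive attempt to swap two words does not do what one might expect:

```javascript
function mreplace(replacements, str) {
  let result = str;
  for (const [x, y] of replacements)
    result = result.replace(x, y);
  return result;
}

// Attempt to swap "cat" and "dog" with two sequential replacements.
// The second pair also rewrites the "dog" produced by the first pair.
const swapped = mreplace([
  [/cat/g, 'dog'],
  [/dog/g, 'cat']
], 'cat chases dog');

console.log(swapped);
// "cat chases cat"
```

If the replacements can overlap like this, a single pass over the text with one combined pattern is probably the safer choice.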

Original answer

Andy E's answer can be modified to make adding replacement definitions easier.

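var text = "Hello World what a beautiful day";
// Lookup table: matched text -> replacement
var index = {
  'Hello': 'Bye',
  'World': 'Universe'
};
// replace() returns a new string, so keep the result
var result = text.replace(/(Hello|World)/g, function ($0) {
  return index[$0] !== undefined ? index[$0] : $0;
});

// result: "Bye Universe what a beautiful day"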
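
With a large number of patterns, writing the alternation by hand gets tedious. A minimal sketch of one way to build it from the lookup table, assuming all patterns are literal strings rather than regular expressions; `escapeRegExp` and `replaceAllFromMap` are hypothetical helper names, not built-ins:

```javascript
// Escape characters that have special meaning in a regular expression,
// so that each key is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace every key of `map` found in `str` with its value, in one pass.
function replaceAllFromMap(map, str) {
  const keys = Object.keys(map);
  if (keys.length === 0) return str;
  // Put longer keys first so a longer key wins over a key that is its prefix.
  const pattern = new RegExp(
    keys.sort((a, b) => b.length - a.length).map(escapeRegExp).join('|'),
    'g'
  );
  // Every match is one of the keys, so the lookup always succeeds.
  return str.replace(pattern, match => map[match]);
}

const result = replaceAllFromMap(
  { Hello: 'Bye', World: 'Universe' },
  'Hello World what a beautiful day'
);

console.log(result);
// "Bye Universe what a beautiful day"
```

Because the text is scanned once and replaced text is never rescanned, one replacement cannot feed into another.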

maček answered Sep 30 '22 17:09


You can pass a function to replace:

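var hello = "Hello World what a beautiful day";
// replace() returns a new string; the callback receives the full match as $0
// and any capture groups as $1, $2, ... $n (this pattern has none)
var result = hello.replace(/Hello|World/g, function ($0) {
    if ($0 === "Hello")
        return "Bye";
    else if ($0 === "World")
        return "Universe";
    return $0; // fallback: leave any other match unchanged
});

// result: "Bye Universe what a beautiful day"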

Andy E answered Sep 30 '22 18:09