Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String Find/Replace Algorithm

I would like to be able to search a string for various words, when I find one, i want to split the string at that point into 3 parts (left, match, right), the matched text would be excluded, and the process would continue with the new string left+right.

Now, once i have all my matches done, i need to reverse the process by reinserting the matched words (or a replacement for them) at the point they were removed. I have never really found what i wanted in any of my searches, so I thought I would ask for input here on SO.

Please let me know if this question needs further description.

BTW - at the moment, i have a very poor algorithm that replaces matched text with a unique string token, and then replaces the tokens with the replacement text for the appropriate match after all the matches have been done.

This is the goal:

one two three four five six 

match "three" replace with foo (remember we found three, and where we found it)

one two four five six
       |
     three

match "two four" and prevent it from being matched by anything (edited for clarity)

one five six
   |
 two four 
       |
     three

at this point, you cannot match for example "one two"

all the matches have been found, now put their replacements back in (in reverse order)

one two four five six
       |
     three


one two foo four five six

What's the point? Preventing one match's replacement text from being matched by another pattern. (all the patterns are run at the same time and in the same order for every string that is processed)

I'm not sure the language matters, but I'm using Lua in this case.

I'll try rephrasing, I have a list of patterns i want to find in a given string, if I find one, I want to remove that part of the string so it isnt matched by anything else, but I want to keep track of where i found it so I can insert the replacement text there once I am done trying to match my list of patterns

Here's a related question:

Shell script - search and replace text in multiple files using a list of strings

like image 480
sylvanaar Avatar asked Oct 30 '09 19:10

sylvanaar


1 Answers

Your algorithm description is unclear. There's no exact rule where the extracted tokens should be re-inserted.

Here's an example:

  1. Find 'three' in 'one two three four five six'
  2. Choose one of these two to get 'foo bar' as result:

    a. replace 'one two' with 'foo' and 'four five six' with 'bar'

    b. replace 'one two four five six' with 'foo bar'

  3. Insert 'three' back in the step 2 resulting string 'foo bar'

At step 3 does 'three' goes before 'bar' or after it?

Once you've come up with clear rules for reinserting, you can easily implement the algorithm as a recursive method or as an iterative method with a replacements stack.

like image 77
Franci Penov Avatar answered Sep 21 '22 08:09

Franci Penov