I would like to be able to search a string for various words, when I find one, i want to split the string at that point into 3 parts (left, match, right), the matched text would be excluded, and the process would continue with the new string left+right.
Now, once i have all my matches done, i need to reverse the process by reinserting the matched words (or a replacement for them) at the point they were removed. I have never really found what i wanted in any of my searches, so I thought I would ask for input here on SO.
Please let me know if this question needs further description.
BTW - at the moment, i have a very poor algorithm that replaces matched text with a unique string token, and then replaces the tokens with the replacement text for the appropriate match after all the matches have been done.
This is the goal:
one two three four five six
match "three" replace with foo (remember we found three, and where we found it)
one two four five six
|
three
match "two four" and prevent it from being matched by anything (edited for clarity)
one five six
|
two four
|
three
at this point, you cannot match for example "one two"
all the matches have been found, now put their replacements back in (in reverse order)
one two four five six
|
three
one two foo four five six
What's the point? Preventing one match's replacement text from being matched by another pattern. (all the patterns are run at the same time and in the same order for every string that is processed)
I'm not sure the language matters, but I'm using Lua in this case.
I'll try rephrasing, I have a list of patterns i want to find in a given string, if I find one, I want to remove that part of the string so it isnt matched by anything else, but I want to keep track of where i found it so I can insert the replacement text there once I am done trying to match my list of patterns
Here's a related question:
Shell script - search and replace text in multiple files using a list of strings
Your algorithm description is unclear. There's no exact rule where the extracted tokens should be re-inserted.
Here's an example:
Choose one of these two to get 'foo bar' as result:
a. replace 'one two' with 'foo' and 'four five six' with 'bar'
b. replace 'one two four five six' with 'foo bar'
Insert 'three' back in the step 2 resulting string 'foo bar'
At step 3 does 'three' goes before 'bar' or after it?
Once you've come up with clear rules for reinserting, you can easily implement the algorithm as a recursive method or as an iterative method with a replacements stack.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With