
Replace a regular expression submatch using a function

Tags:

regex

go

Let's say I have strings like

input := `bla bla b:foo="hop" blablabla b:bar="hu?"`

and I want to replace the parts between quotes in b:foo="hop" or b:bar="hu?" using a function.

It's easy to build a regular expression to get the match and submatch, for example

r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)

and then to call ReplaceAllStringFunc, but the callback receives the whole match, not the submatch:

fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
    // m is the whole match here (e.g. `b:foo="hop"`), not the submatch. Damn.
    return m
}))

How can I replace the submatch?

Right now, I haven't found a better solution than to decompose m myself inside the callback with a regex, and to rebuild the string after processing the submatch.

I would have used an alternative approach with a positive lookbehind were they available in Go, but that's not the case (and they shouldn't be necessary anyway).

What can I do here?


EDIT: here's my current solution, which I would like to simplify:

package main

import (
        "fmt"
        "regexp"
)

func complexFunc(s string) string {
        return "dbvalue(" + s + ")" // this could be more complex
}

func main() {
        input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
        r := regexp.MustCompile(`(\bb:\w+=")([^"]+)`)
        fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
                // Second pass over the same match, only to recover the groups.
                parts := r.FindStringSubmatch(m)
                return parts[1] + complexFunc(parts[2])
        }))
}

(playground link)

What bothers me is that I have to apply the regex twice. This doesn't sound right.

Denys Séguret asked Jun 12 '13 12:06




2 Answers

I don't like the code below, but it seems to do what you want:

package main

import (
        "fmt"
        "regexp"
)

func main() {
        input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
        r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)
        r2 := regexp.MustCompile(`"([^"]+)"`)
        fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
                // r2 has only one capture group, so ${2} expands to the empty string.
                return r2.ReplaceAllString(m, `"${2}whatever"`)
        }))
}

Playground


Output

bla bla b:foo="whatever" blablabla b:bar="whatever"

EDIT: Take II.


package main

import (
        "fmt"
        "regexp"
)

func computedFrom(s string) string {
        return fmt.Sprintf("computedFrom(%s)", s)
}

func main() {
        input := `bla bla b:foo="hop" blablabla b:bar="hu?"`
        r := regexp.MustCompile(`\bb:\w+="([^"]+)"`)
        r2 := regexp.MustCompile(`"([^"]+)"`)
        fmt.Println(r.ReplaceAllStringFunc(input, func(m string) string {
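                // match is the quoted value including its quotes, e.g. `"hop"`.
                match := r2.FindString(m)
                // Literal replacement, so a `$` in the computed text is not expanded.
                return r2.ReplaceAllLiteralString(m, computedFrom(match))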
        }))
}

Playground


Output:

bla bla b:foo=computedFrom("hop") blablabla b:bar=computedFrom("hu?")
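
One thing to keep in mind when a computed string is passed as the replacement: ReplaceAllString treats `$` in that argument as a template reference ($1, ${name}), whereas ReplaceAllLiteralString inserts it verbatim. A minimal sketch of the difference follows; the names quoted, withTemplate and withLiteral and the sample replacement value are made up for illustration and are not part of the code above.

```go
package main

import (
	"fmt"
	"regexp"
)

// quoted matches a double-quoted value, like r2 in the answer above.
var quoted = regexp.MustCompile(`"([^"]+)"`)

// withTemplate passes repl to ReplaceAllString, which expands $1, ${name}, etc.
func withTemplate(in, repl string) string {
	return quoted.ReplaceAllString(in, repl)
}

// withLiteral passes repl to ReplaceAllLiteralString, which inserts it unchanged.
func withLiteral(in, repl string) string {
	return quoted.ReplaceAllLiteralString(in, repl)
}

func main() {
	in := `b:foo="hop"`
	repl := `cost($1)` // hypothetical computed value that happens to contain a dollar sign

	fmt.Println(withTemplate(in, repl)) // b:foo=cost(hop)
	fmt.Println(withLiteral(in, repl))  // b:foo=cost($1)
}
```

If the computed text can come from user data, the literal variant is probably the safer default.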
zzzz answered Oct 09 '22 18:10


It looks like the OP created an issue for this, but as of this post it still isn't implemented: https://github.com/golang/go/issues/5690

Fortunately, someone else has published a function that does this: https://gist.github.com/elliotchance/d419395aa776d632d897

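// ReplaceAllStringSubmatchFunc replaces every match of re in str with the
// result of repl, which receives the whole match followed by its submatches.
func ReplaceAllStringSubmatchFunc(re *regexp.Regexp, str string, repl func([]string) string) string {
    result := ""
    lastIndex := 0

    for _, v := range re.FindAllStringSubmatchIndex(str, -1) {
        groups := []string{}
        for i := 0; i < len(v); i += 2 {
            if v[i] < 0 {
                // A group that did not participate in the match has index -1.
                groups = append(groups, "")
                continue
            }
            groups = append(groups, str[v[i]:v[i+1]])
        }

        // Copy the text between the previous match and this one, then the replacement.
        result += str[lastIndex:v[0]] + repl(groups)
        lastIndex = v[1]
    }

    return result + str[lastIndex:]
}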
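
As a usage sketch, here is how that helper could be applied to the input from the question. The helper is repeated so the block runs on its own, written with strings.Builder and a guard for groups that did not participate in a match; complexFunc is the question's placeholder, and rewrite is a hypothetical wrapper name introduced only for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ReplaceAllStringSubmatchFunc calls repl once per match with the whole match
// at index 0 and the submatches after it, and substitutes the returned string.
func ReplaceAllStringSubmatchFunc(re *regexp.Regexp, str string, repl func([]string) string) string {
	var b strings.Builder
	lastIndex := 0
	for _, v := range re.FindAllStringSubmatchIndex(str, -1) {
		groups := make([]string, 0, len(v)/2)
		for i := 0; i < len(v); i += 2 {
			if v[i] < 0 { // group did not participate in this match
				groups = append(groups, "")
				continue
			}
			groups = append(groups, str[v[i]:v[i+1]])
		}
		b.WriteString(str[lastIndex:v[0]])
		b.WriteString(repl(groups))
		lastIndex = v[1]
	}
	b.WriteString(str[lastIndex:])
	return b.String()
}

// complexFunc stands in for the question's arbitrary transformation.
func complexFunc(s string) string {
	return "dbvalue(" + s + ")"
}

var attr = regexp.MustCompile(`(\bb:\w+=")([^"]+)`)

// rewrite applies complexFunc to the quoted value of every b:name="..." attribute.
func rewrite(input string) string {
	return ReplaceAllStringSubmatchFunc(attr, input, func(g []string) string {
		// g[1] is the `b:name="` prefix, g[2] is the value between the quotes.
		return g[1] + complexFunc(g[2])
	})
}

func main() {
	fmt.Println(rewrite(`bla bla b:foo="hop" blablabla b:bar="hu?"`))
	// bla bla b:foo="dbvalue(hop)" blablabla b:bar="dbvalue(hu?)"
}
```

The regex is applied only once per input this way, which is what the question was after.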
Brian Leishman answered Oct 09 '22 16:10