
How to match a regex with backreference in Go?

I need to match a regex that uses backreferences (e.g. \1) in my Go code.

That's not easy, because Go's official regexp package uses the RE2 engine, which deliberately does not support backreferences (and some other lesser-known features) in order to guarantee linear-time execution and thereby avoid regex denial-of-service attacks. Enabling backreference support is not an option with RE2.
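
For illustration, here is a minimal sketch of what this looks like in practice: the standard package rejects a pattern containing \1 at compile time, while the same shape with two independent groups is accepted. The helper name compiles is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the standard regexp package accepts the pattern.
// (Hypothetical helper, named here only for illustration.)
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// A backreference to group 1: not part of the syntax RE2 accepts.
	fmt.Println(compiles(`(\w+) \1`))
	// The same shape with two independent groups: accepted.
	fmt.Println(compiles(`(\w+) (\w+)`))
}
```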

In my code, there is no risk of malicious exploitation by attackers, and I need backreferences.

What should I do?

Eldritch Conundrum asked May 31 '14 10:05



2 Answers

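The regexp package functions FindSubmatchIndex (or FindAllSubmatchIndex) and Expand let you refer to captured groups from a template, for example $key or $1. This is not a backreference inside the pattern itself, which RE2 does not support, but it does give you access to the captured text once a match is found. It isn't very convenient, but it is still possible. Example: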

package main

import (
    "fmt"
    "regexp"
)

func main() {
    content := []byte(`
    # comment line
    option1: value1
    option2: value2

    # another comment line
    option3: value3
`)

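    // The named groups "key" and "value" can be referenced from the template.
    pattern := regexp.MustCompile(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`)

    // $key and $value are replaced by the text captured in each match.
    template := []byte("$key=$value\n")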
    result := []byte{}
    for _, submatches := range pattern.FindAllSubmatchIndex(content, -1) {
        result = pattern.Expand(result, template, content, submatches)
    }
    fmt.Println(string(result))
}

Output:

option1=value1
option2=value2
option3=value3
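
If what you need is a backreference inside the pattern itself (something like ^(\w+) \1$), one workaround in the same spirit is to capture both parts as separate groups and compare them in Go. This is a minimal sketch, assuming the whole input is tested at once so there is only one way to split it; the names pair and isDoubledWord are made up for this example. For unanchored searches it is not equivalent to a real backreference, because the engine will not go back and try a different match when the comparison fails.

```go
package main

import (
	"fmt"
	"regexp"
)

// pair captures two words separated by a single space. The equality check
// that a backreference would perform is done in Go code instead.
var pair = regexp.MustCompile(`^(\w+) (\w+)$`)

// isDoubledWord reports whether s is the same word written twice,
// which is what the PCRE-style pattern `^(\w+) \1$` would test.
// (Hypothetical helper, named here only for illustration.)
func isDoubledWord(s string) bool {
	m := pair.FindStringSubmatch(s)
	return m != nil && m[1] == m[2]
}

func main() {
	fmt.Println(isDoubledWord("hello hello")) // true
	fmt.Println(isDoubledWord("hello world")) // false
}
```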

Vladimir Filin answered Sep 28 '22 03:09



Answering my own question here: I solved this using golang-pkg-pcre. It uses libpcre++, which provides Perl-compatible regexes that do support backreferences. Note that its API is not the same as that of the standard regexp package.

Eldritch Conundrum answered Sep 28 '22 04:09
