Comparing strings in Go

I'm trying to find the beginning of each named capturing group in a string in order to build a simple parser (see related question). To do this, the extract function keeps the last four characters in the last4 variable. If those four characters are equal to "(?P<", a capturing group begins there:

package main

import "fmt"

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
    extract(sample)
}

func extract(regex string) {
    last4 := new([4]int32)
    for _, c := range regex {
        last4[0], last4[1], last4[2], last4[3] = last4[1], last4[2], last4[3], c
        last4String := fmt.Sprintf("%c%c%c%c\n", last4[0], last4[1], last4[2], last4[3])
        if last4String == "(?P<" {
            fmt.Print("start of capturing group")
        }
    }
}

http://play.golang.org/p/pqA-wCuvux

But this code prints nothing! last4String == "(?P<" is never true, although this substring appears in the output if I print last4String inside the loop. How do I compare strings in Go, then?

And is there a more elegant way to convert an int32 array to a string than fmt.Sprintf("%c%c%c%c\n", last4[0], last4[1], last4[2], last4[3])?

Anything else that could be better? My code looks somewhat inelegant to me.

deamon asked Nov 11 '12 21:11


1 Answer

Unless this is for self-education or similar, you probably want to use the existing RE parser in the standard library (package regexp/syntax) and then "walk" the AST to do whatever is required.

func Parse(s string, flags Flags) (*Regexp, error)

Parse parses a regular expression string s, controlled by the specified Flags, and returns a regular expression parse tree. The syntax is described in the top-level comment for package regexp.

There's even a helper for your task.
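As a minimal sketch of that approach: the snippet below parses the sample pattern with syntax.Parse and reads the group names from the parse tree with CapNames. The link to the helper did not survive, so treating CapNames as the helper meant here is an assumption; the function name groupNames is made up for this example.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// groupNames (hypothetical helper) parses pattern with Perl-style flags,
// which enable the (?P<name>...) syntax, and returns the capture names.
// Index 0 stands for the whole match; unnamed groups are empty strings.
func groupNames(pattern string) ([]string, error) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return nil, err
	}
	return re.CapNames(), nil
}

func main() {
	names, err := groupNames(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, name := range names {
		if name != "" {
			fmt.Printf("group %d is named %q\n", i, name)
		}
	}
}
```

For the sample pattern this should report country as group 1 and city as group 5, because the three unnamed groups in between are numbered too.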

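EDIT1: Your code repaired (the format string no longer ends in "\n"; with it, last4String was five characters long and could never equal the four-character "(?P<"):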

package main

import "fmt"

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
        extract(sample)
}

func extract(regex string) {
        var last4 [4]int32
        for _, c := range regex {
                last4[0], last4[1], last4[2], last4[3] = last4[1], last4[2], last4[3], c
                last4String := fmt.Sprintf("%c%c%c%c", last4[0], last4[1], last4[2], last4[3])
                if last4String == "(?P<" {
                        fmt.Println("start of capturing group")
                }
        }
}

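On the side question about converting the int32 array to a string: rune is an alias for int32, so a slice of the array can be converted directly with string(last4[:]) and fmt.Sprintf is not needed. The sketch below keeps the sliding window from the repaired code; the function name findGroupStarts and the returned offsets are additions for illustration, not part of the original code.

```go
package main

import "fmt"

// findGroupStarts (hypothetical helper) returns the byte offset of every
// "(?P<" in s, using a four-rune sliding window.
func findGroupStarts(s string) []int {
	var last4 [4]rune // rune is an alias for int32
	var starts []int
	for i, c := range s {
		// Shift the window left by one and append the current rune.
		copy(last4[:], last4[1:])
		last4[3] = c
		// A []rune converts directly to a string.
		if string(last4[:]) == "(?P<" {
			// i is the byte offset of '<'; the three runes before it
			// are single-byte ASCII, so the group starts at i-3.
			starts = append(starts, i-3)
		}
	}
	return starts
}

func main() {
	fmt.Println(findGroupStarts(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`))
	// prints [1 31]
}
```

Since arrays are comparable in Go, the test could also be written as last4 == [4]rune{'(', '?', 'P', '<'}, which skips the string conversion entirely.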

EDIT2: Your code rewritten:

package main

import (
        "fmt"
        "strings"
)

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
        extract(sample)
}

func extract(regex string) {
        start := 0
        for {
                i := strings.Index(regex[start:], "(?P<")
                if i < 0 {
                        break
                }

                fmt.Printf("start of capturing group @ %d\n", start+i)
                start += i + 1
        }
}

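If the group names are needed as well, the strings.Index loop above can be extended to read up to the closing ">". This is only a sketch under simple assumptions: the type group and the function namedGroups are invented for this example, and an escaped sequence such as `\(?P<` would be reported as a group, which a real parser would not do.

```go
package main

import (
	"fmt"
	"strings"
)

// group is a hypothetical type for this example: where a named
// capturing group starts and what it is called.
type group struct {
	Offset int
	Name   string
}

// namedGroups (hypothetical helper) scans regex for "(?P<" and reads
// each name up to the next '>'. It does not handle escaping.
func namedGroups(regex string) []group {
	const marker = "(?P<"
	var groups []group
	start := 0
	for {
		i := strings.Index(regex[start:], marker)
		if i < 0 {
			break
		}
		offset := start + i
		nameStart := offset + len(marker)
		end := strings.IndexByte(regex[nameStart:], '>')
		if end < 0 {
			break // unterminated group name, stop scanning
		}
		groups = append(groups, group{Offset: offset, Name: regex[nameStart : nameStart+end]})
		start = nameStart + end + 1
	}
	return groups
}

func main() {
	for _, g := range namedGroups(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`) {
		fmt.Printf("group %q starts at %d\n", g.Name, g.Offset)
	}
	// prints:
	// group "country" starts at 1
	// group "city" starts at 31
}
```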

zzzz answered Nov 05 '22 01:11