Negative lookbehind alternative

Tags: regex, go

I have a string

centenary

I'd like to match ten only when it is not preceded by cen.

So far I have this regex:

(([^c][^e][^n])|^)ten

That returns true for tenary and blahtenary, and false for ctenary, cetenary and centenary.

package main

import (
    "fmt"
    "regexp"
)

func main() {
    txt := "ctenary"
    rx := `(([^c][^e][^n])|^)ten`
    re := regexp.MustCompile(rx)
    m := re.MatchString(txt)
    fmt.Println(m)
}
Kennedy asked Jun 24 '16 13:06


1 Answer

Go's regexp package supports neither lookahead nor lookbehind, so we need to stick to negated character classes. But [^c][^e][^n] doesn't fully cover it: it rejects cxxten, where only the first of the three preceding characters coincides with cen, and it doesn't cover strings with fewer than 3 characters before ten.
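To make both points concrete, here is a small sketch. It first checks whether Go accepts a negative lookbehind at all, then runs the pattern from the question against a few inputs. The helper names `lookbehindCompiles` and `originalMatches` and the sample words are only illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// original is the pattern from the question.
var original = regexp.MustCompile(`(([^c][^e][^n])|^)ten`)

// lookbehindCompiles reports whether Go's regexp package accepts
// a negative lookbehind. Compile returns an error instead of panicking.
func lookbehindCompiles() bool {
	_, err := regexp.Compile(`(?<!cen)ten`)
	return err == nil
}

// originalMatches reports whether the question's pattern matches s.
func originalMatches(s string) bool {
	return original.MatchString(s)
}

func main() {
	fmt.Println("lookbehind compiles:", lookbehindCompiles())

	// ctenary and cxxten are rejected although "ten" is not preceded by "cen".
	for _, s := range []string{"tenary", "blahtenary", "ctenary", "cxxten", "centenary"} {
		fmt.Println(s, originalMatches(s))
	}
}
```

The pattern requires each of the three preceding characters to differ from its counterpart in cen, which is stricter than requiring the three characters together to differ from cen.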

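I came up with (?:^|[^n]|(?:[^e]|^)n|(?:[^c]|^)en)(ten), which stores ten in the first capturing group. It creates an alternative for each way the text before ten can fail to be exactly cen: nothing at all, a character other than n, an n not preceded by e, or en not preceded by c.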
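A minimal sketch of how this pattern could be used, with ten wrapped in a capturing group so its position can be read from the match (the overall match also includes the one to three context characters). The names `notAfterCen` and `tenIndex` are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// notAfterCen lists every left context of "ten" that is not "cen";
// "ten" itself is capture group 1.
var notAfterCen = regexp.MustCompile(`(?:^|[^n]|(?:[^e]|^)n|(?:[^c]|^)en)(ten)`)

// tenIndex returns the byte offset of the first "ten" that is not
// preceded by "cen", or -1 if there is none.
func tenIndex(s string) int {
	loc := notAfterCen.FindStringSubmatchIndex(s)
	if loc == nil {
		return -1
	}
	return loc[2] // start offset of capture group 1
}

func main() {
	for _, s := range []string{"tenary", "ctenary", "cetenary", "cxxten", "centenary"} {
		fmt.Println(s, tenIndex(s))
	}
}
```

For a plain yes/no answer, `notAfterCen.MatchString(s)` is enough; the capture group only matters when the position or text of ten is needed.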

An alternative might be to match (.{0,3})(ten) and discard the match programmatically if the first group holds cen.
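This second approach could look roughly like the sketch below. It assumes a yes/no answer is wanted and therefore walks over all matches instead of only the first, since an early centen should not hide a later valid ten. The names `tenWithContext` and `hasTenNotAfterCen` are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// tenWithContext captures up to three characters before "ten" in group 1.
var tenWithContext = regexp.MustCompile(`(.{0,3})(ten)`)

// hasTenNotAfterCen reports whether s contains a "ten" whose
// captured left context is anything other than "cen".
func hasTenNotAfterCen(s string) bool {
	for _, m := range tenWithContext.FindAllStringSubmatch(s, -1) {
		if m[1] != "cen" {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"tenary", "ctenary", "centenary", "centenary tent"} {
		fmt.Println(s, hasTenNotAfterCen(s))
	}
}
```

The pattern stays simple and the cen check lives in ordinary Go code, at the cost of an extra loop.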

Sebastian Proske answered Oct 23 '22 01:10