I'm trying to write a regex that can extract a command. Here's what I've got so far, using a negative lookbehind assertion:
\b(?<![@#\/])\w.*
So with the input:
/msg @nickname #channel foo bar baz
/foo #channel @nickname foo bar baz
foo bar baz
the text foo bar baz is extracted every time. See the working example:
https://regex101.com/r/lF9aG7/3
In Go, however, this doesn't compile: http://play.golang.org/p/gkkVZgScS_
It throws:
panic: regexp: Compile(`\b(?<![@#\/])\w.*`): error parsing regexp: invalid or unsupported Perl syntax: `(?<`
I did a bit of research and realized that Go's regexp package does not support negative lookbehinds, so that it can guarantee O(n) matching time.
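As a small illustration of the failure described above, the sketch below uses regexp.Compile, which returns an error instead of panicking the way regexp.MustCompile does. The helper name compiles is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles (hypothetical helper) reports whether Go's regexp package
// accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The lookbehind group (?<!...) is rejected when the pattern is parsed.
	fmt.Println(compiles(`\b(?<![@#\/])\w.*`)) // false

	// The same pattern without the lookbehind group is accepted.
	fmt.Println(compiles(`\b\w.*`)) // true
}
```

Using regexp.Compile this way lets a program report an unsupported pattern without crashing.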
How can I rewrite this regex so that it does the same without negative lookbehind?
Since your negative lookbehind only uses a simple character set, you can replace it with a negated character set:
[^@#/]\b\w.*
If the match is also allowed at the start of the string, add the ^ anchor as an alternative:
(?:^|[^@#\/])\b\w.*
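Unlike a lookbehind, the negated character set consumes the character in front of the word, so the overall match can begin with a space. The sketch below is one way to deal with that in Go, assuming a capture group is added around the part to keep. The group and the helper name extract are additions for this example, not part of the pattern above.

```go
package main

import (
	"fmt"
	"regexp"
)

// re is the anchored pattern with a capture group (added for this sketch)
// around the text that should be extracted.
var re = regexp.MustCompile(`(?:^|[^@#\/])\b(\w.*)`)

// extract (hypothetical helper) returns the captured text,
// or "" when the line does not match.
func extract(line string) string {
	m := re.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	lines := []string{
		"/msg @nickname #channel foo bar baz",
		"/foo #channel @nickname foo bar baz",
		"foo bar baz",
	}
	for _, line := range lines {
		fmt.Printf("%q\n", extract(line)) // prints "foo bar baz" each time
	}
}
```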
Based on the samples in the Go playground link in your question, I think you're looking to filter out all words beginning with a character from [#@/]. You can use a Filter function:
// Filter returns the elements of vs for which f returns true.
func Filter(vs []string, f func(string) bool) []string {
	vsf := make([]string, 0)
	for _, v := range vs {
		if f(v) {
			vsf = append(vsf, v)
		}
	}
	return vsf
}
and a Process function, which makes use of the filter above:
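// Process drops every space-separated word that starts with '#', '@' or '/'.
// It needs the "strings" package to be imported.
func Process(inp string) string {
	t := strings.Split(inp, " ")
	t = Filter(t, func(x string) bool {
		return !strings.HasPrefix(x, "#") &&
			!strings.HasPrefix(x, "@") &&
			!strings.HasPrefix(x, "/")
	})
	return strings.Join(t, " ")
}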
It can be seen in action on playground at http://play.golang.org/p/ntJRNxJTxo
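For a quick local check without the playground, here is a compact, self-contained variant of the same idea. It is only a sketch: the name stripPrefixed is made up for this example, and it splits with strings.Fields, so unlike Process it also collapses runs of whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPrefixed (hypothetical name) drops every whitespace-separated word
// that starts with '/', '@' or '#' and joins the rest with single spaces.
func stripPrefixed(inp string) string {
	var kept []string
	// strings.Fields never returns empty strings, so w[0] is always valid.
	for _, w := range strings.Fields(inp) {
		if strings.IndexByte("/@#", w[0]) == -1 {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(stripPrefixed("/msg @nickname #channel foo bar baz")) // foo bar baz
	fmt.Println(stripPrefixed("/foo #channel @nickname foo bar baz")) // foo bar baz
	fmt.Println(stripPrefixed("foo bar baz"))                         // foo bar baz
}
```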