I've searched for a while and haven't found a way to match the following pattern (I'm also very new to regex). The input looks like either
/abc/foo/bar(/*)
or
/abc/foo/bar/stop
In both cases I want to match and capture /abc/foo/bar. "/stop" is an optional string that may be appended to the end of the pattern. The goal is to get the desired capture while ignoring "/stop" if it is present (and if "stop" occurs multiple times, to stop at the first one), while allowing any number of slashes in the middle but none at the end of the line.
If I simply use:
^(/.*[^/])/*$
this greedily includes all slashes and then strips any trailing ones. But to accept the second case, with the optional "/stop", I need to match non-greedily until I find the first "/stop" and stop there.
How can I craft a single regex that matches both cases?
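To make the problem concrete, here is a minimal JavaScript sketch of the greedy attempt above. The pattern is the one from the question with its slashes escaped for a JS regex literal; the helper `capture` and the variable names are made up for illustration.

```javascript
// The greedy attempt from the question, with slashes escaped for a JS literal.
const greedy = /^(\/.*[^\/])\/*$/;

// Hypothetical helper: return the first capture group, or null if there is no match.
function capture(re, s) {
  const m = s.match(re);
  return m ? m[1] : null;
}

// Trailing slashes are stripped as intended...
const trailingSlashes = capture(greedy, "/abc/foo/bar///");
// ...but a trailing "/stop" is kept, which is the case that still needs handling.
const withStop = capture(greedy, "/abc/foo/bar/stop");

console.log(trailingSlashes);
console.log(withStop);
```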
EDIT: In case my previous example wasn't clear enough, here are some more. I want to match and capture "/abc/foo/bar" in all of the following strings:
/abc/foo/bar
/abc/foo/bar/
/abc/foo/bar///
/abc/foo/bar/stop
/abc/foo/bar/stop/foo/bar/stop/stop
/abc/foo/bar//stop
But it should not capture just "/abc/foo/bar" in either of the following:
/abc/foo/bar/sto (this should match the whole "/abc/foo/bar/sto" instead)
/abc/foo/bar/abc/foo/bar (this should match the whole "/abc/foo/bar/abc/foo/bar" instead)
Let me know if this is clear enough. Thanks!
Try this:
/^(?:\/+(?!$|(?:stop\/?))[^\/]+)*/
Regex101 Demo
Explanation:
This matches the start of the string (^), followed by zero or more instances of the following pattern: one or more slashes (\/+) that are not followed by the end of the string ($) or by stop, followed by one or more non-slash characters ([^\/]+).
Here's a Debuggex Demo with working unit tests.
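As a quick check of the first regex, the sketch below runs it against the strings from the question. The variable names are illustrative only; because the pattern is a starred group anchored at the start, `match` always returns a (possibly empty) result for these inputs.

```javascript
// First regex from this answer: repeated "slashes + segment" units,
// where the slashes may not be followed by end-of-string or by "stop".
const firstRe = /^(?:\/+(?!$|(?:stop\/?))[^\/]+)*/;

const inputs = [
  "/abc/foo/bar",
  "/abc/foo/bar/",
  "/abc/foo/bar///",
  "/abc/foo/bar/stop",
  "/abc/foo/bar/stop/foo/bar/stop/stop",
  "/abc/foo/bar//stop",
  "/abc/foo/bar/sto",
  "/abc/foo/bar/abc/foo/bar",
];

// Collect the matched prefix for each input.
const firstResults = inputs.map((s) => s.match(firstRe)[0]);

firstResults.forEach((r, i) => console.log(inputs[i], "->", r));
```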
EDIT: Here is an alternative, arguably simpler, regex:
/^.+?(?=\/*$|\/+stop\b)/
This matches one or more characters in a non-greedy manner, then stops when whatever comes after the match is one of the following: the end of the string ($), possibly preceded by any number of slashes (\/*); or one or more slashes followed by stop as a whole word (\/+stop\b).
Here's a Regex101 demo of this option.
EDIT 2: If you'd like to test this, here's a simple JavaScript test that tests the second regex above against various test strings and logs the results to the console:
// Second regex: lazily match up to optional trailing slashes or a "/stop" segment
var re = /^.+?(?=\/*$|\/+stop\b)/,
    test_strings = ["/abc/foo/bar",
        "/abc/foo/bar/",
        "/abc/foo/bar///",
        "/abc/foo/bar/stop",
        "/abc/foo/bar/stop/foo/bar/stop/stop",
        "/abc/foo/bar//stop",
        "/abc/foo/bar/sto",
        "/abc/foo/bar/abc/foo/bar"];
// Log the matched prefix for each test string
for (var s = 0; s < test_strings.length; s++) {
    console.log(test_strings[s].match(re)[0]);
}
/*
Results:
/abc/foo/bar
/abc/foo/bar
/abc/foo/bar
/abc/foo/bar
/abc/foo/bar
/abc/foo/bar
/abc/foo/bar/sto
/abc/foo/bar/abc/foo/bar
*/
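One difference between the two regexes in this answer is not exercised by the test strings: the first only checks that the next segment begins with stop, while the second requires a word boundary (\b) after it. The sketch below illustrates this with a made-up segment name, "stopwatch", which does not appear in the question.

```javascript
// Both regexes from this answer.
const prefixCheck = /^(?:\/+(?!$|(?:stop\/?))[^\/]+)*/; // first regex
const wordCheck = /^.+?(?=\/*$|\/+stop\b)/;             // second regex

// Hypothetical input: a segment that merely starts with "stop".
const input = "/abc/foo/bar/stopwatch";

// The first regex stops in front of any segment beginning with "stop".
const prefixResult = input.match(prefixCheck)[0];
// The second regex only stops in front of "stop" as a whole word.
const wordResult = input.match(wordCheck)[0];

console.log(prefixResult);
console.log(wordResult);
```

Which behavior is preferable depends on whether segments such as "stopwatch" should count as a stop marker.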
You can try something like this:
^((?:/[^/]+)+?)(?:/*|/+stop(?:/.*)?)$
demo
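A small JavaScript sketch of this capture-group approach, run against the strings from the question. Note that the first alternative after the capture group is written as /* here (zero or more slashes), which lets a path with no trailing slash match as well. The variable names are illustrative.

```javascript
// Lazy capture of "/segment" units, followed by either optional trailing
// slashes or a "/stop" segment and anything after it, up to end of string.
const captureRe = /^((?:\/[^\/]+)+?)(?:\/*|\/+stop(?:\/.*)?)$/;

const paths = [
  "/abc/foo/bar",
  "/abc/foo/bar/",
  "/abc/foo/bar///",
  "/abc/foo/bar/stop",
  "/abc/foo/bar/stop/foo/bar/stop/stop",
  "/abc/foo/bar//stop",
  "/abc/foo/bar/sto",
  "/abc/foo/bar/abc/foo/bar",
];

// Group 1 holds the wanted prefix; null means the pattern did not match at all.
const captures = paths.map((p) => {
  const m = captureRe.exec(p);
  return m ? m[1] : null;
});

captures.forEach((c, i) => console.log(paths[i], "->", c));
```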
and if atomic groups are available, you'd better write:
^((?:/[^/]+)+?)(?>/*$|/+stop(?:/.*)?)
demo
If lookaheads are available:
^/(?>[^/]+|/(?!/*(?:$|stop(?:/|$))))+
demo
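The lookahead pattern can be tried in JavaScript with one substitution: JavaScript has no atomic groups, so this sketch swaps (?>...) for a plain non-capturing group (?:...). Since nothing follows the group in the pattern, that swap should not change what is matched for the inputs below; it only gives up the backtracking guarantee. The variable names are illustrative.

```javascript
// Lookahead version, with the atomic group replaced by a non-capturing group.
// A slash is consumed only if it is NOT followed by (more slashes and) either
// the end of the string or a "stop" segment.
const lookaheadRe = /^\/(?:[^\/]+|\/(?!\/*(?:$|stop(?:\/|$))))+/;

const samples = [
  "/abc/foo/bar",
  "/abc/foo/bar/",
  "/abc/foo/bar///",
  "/abc/foo/bar/stop",
  "/abc/foo/bar/stop/foo/bar/stop/stop",
  "/abc/foo/bar//stop",
  "/abc/foo/bar/sto",
  "/abc/foo/bar/abc/foo/bar",
];

// The whole match (index 0) is the wanted prefix; no capture group is needed.
const matched = samples.map((s) => {
  const m = s.match(lookaheadRe);
  return m ? m[0] : null;
});

matched.forEach((r, i) => console.log(samples[i], "->", r));
```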
PS: don't forget to escape the slashes if your regex delimiters are slashes.
As Ed Cottrell points out, features like atomic grouping are not available in languages like JavaScript or in the re module of Python (before version 3.11). However, this feature can be efficiently emulated using the fact that a lookahead is naturally atomic: (?>a+) <=> (?=(a+))\1
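The emulation can be seen in a tiny JavaScript sketch. The test string "aaab" and the trailing "ab" in both patterns are chosen only to force backtracking; they are not part of the original answer.

```javascript
// Plain group: a+ can give back one "a" so that the trailing "ab" still matches.
const plainGroup = /^(?:a+)ab/;

// Emulated atomic group: the lookahead captures the longest run of "a",
// \1 then consumes exactly that run, and the engine cannot backtrack into it.
const emulatedAtomic = /^(?=(a+))\1ab/;

const plainMatches = plainGroup.test("aaab");      // backtracking allowed
const atomicMatches = emulatedAtomic.test("aaab"); // no "a" left over for "ab"

console.log(plainMatches);
console.log(atomicMatches);
```

The second pattern fails because all three "a" characters are consumed by \1 and cannot be released, which is the behavior an atomic group (?>a+) would have.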