Two types of URLs I want to match:
(1) www.test.de/type1/12345/this-is-a-title.html
(2) www.test.de/category/another-title-oh-yes.html
In the first type, I want to match "12345". In the second type I want to match "category/another-title-oh-yes".
Here is what I came up with:
(?:(?:\.de\/type1\/([\d]*)\/)|\.de\/([\S]+)\.html)
This returns the following:
For type (1):
Match group 1: 12345
Match group 2:
For type (2):
Match group 1:
Match group 2: category/another-title-oh-yes
As you can see, it is working pretty well already. For various reasons I need the regex to return only one match-group, though. Is there a way to achieve that?
You can get both results in the group at index 1 by combining a negative lookahead with positive lookbehinds.
((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^\.]+)
The pattern consists of two alternatives joined with | (OR).
The first alternative looks for 12345.
The second alternative looks for category/another-title-oh-yes.
Note:
Wrap the whole pattern in a single pair of parentheses (...|...) and remove the parentheses around [^\.]+ and \d+, where:
[^\.]+ matches one or more characters up to the next dot
\d+ matches one or more digits
Here is an online demo on regex101.
Input:
www.test.de/type1/12345/this-is-a-title.html
www.test.de/category/another-title-oh-yes.html
Output:
MATCH 1
1. [18-23] `12345`
MATCH 2
1. [57-86] `category/another-title-oh-yes`
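As a rough sketch of how this pattern might be used from JavaScript (an assumption, since the question does not name a language, and lookbehind needs an ES2018-capable engine), the hypothetical helper extractWithLookbehind below returns whatever lands in group 1:

```javascript
// Lookbehind version of the pattern. The outer parentheses make group 1
// hold the result regardless of which alternative matched.
const lookbehindPattern = /((?<=\.de\/type1\/)\d+|(?<=\.de\/)(?!type1)[^.]+)/;

// Hypothetical helper: returns the captured key, or null when nothing matches.
function extractWithLookbehind(url) {
  const m = url.match(lookbehindPattern);
  return m ? m[1] : null;
}

console.log(extractWithLookbehind("www.test.de/type1/12345/this-is-a-title.html"));
console.log(extractWithLookbehind("www.test.de/category/another-title-oh-yes.html"));
```

Because the lookbehinds consume no characters, the whole match and group 1 are identical here, so m[0] would work just as well.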
Alternatively, try this one, which uses non-capturing groups instead of lookbehinds. The digits are captured in group 2 and the category path in group 3.
((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^\.]+))
Here is an online demo on regex101.
Input:
www.test.de/type1/12345/this-is-a-title.html
www.test.de/category/another-title-oh-yes.html
Output:
MATCH 1
1. `.de/type1/12345`
2. `12345`
MATCH 2
1. `.de/category/another-title-oh-yes`
3. `category/another-title-oh-yes`
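In JavaScript, at least, capture groups are numbered by the position of their opening parenthesis, so the two inner captures of this pattern are groups 2 and 3 and only one of them is set per match. A minimal sketch that still hands back a single value (the helper name pickCapture is hypothetical) could look like this:

```javascript
// Non-capturing prefix version of the pattern.
// Group 1 = whole alternative, group 2 = digits, group 3 = category path.
const prefixPattern = /((?:\.de\/type1\/)(\d+)|(?:\.de\/)(?!type1)([^.]+))/;

// Hypothetical helper: returns whichever inner group participated in the match.
function pickCapture(url) {
  const m = prefixPattern.exec(url);
  if (!m) return null;
  // The group from the alternative that did not match is undefined.
  return m[2] ?? m[3];
}

console.log(pickCapture("www.test.de/type1/12345/this-is-a-title.html"));
console.log(pickCapture("www.test.de/category/another-title-oh-yes.html"));
```

The ?? operator is used instead of || so that the fallback only happens when group 2 is actually unset.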
Maybe this:
^www\.test\.de/(type1/(\d+)/.*\.html|(.*)\.html)$
Then for example:
var str = "www.test.de/type1/12345/this-is-a-title.html";
// Forward slashes must be escaped inside a JavaScript regex literal.
var regex = /^www\.test\.de\/(type1\/(\d+)\/.*\.html|(.*)\.html)$/;
console.log(str.match(regex));
This outputs an array: the first element is the whole matched string, the second is everything after the website address, the third is the number captured for a type1 URL, and the fourth is the path captured for any other URL. Whichever of the last two does not apply is undefined.
You can then do something like var matches = str.match(regex); return matches[2] || matches[3];
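If the numeric indices feel fragile, one possible variation is to name the groups. The sketch below is not the pattern from this answer: it assumes every URL starts with www.test.de/ and ends in .html, and the group names id and slug, like the function name keyFor, are made up for illustration.

```javascript
// Anchored variant with named groups ("id" and "slug" are illustrative names).
const namedPattern = /^www\.test\.de\/(?:type1\/(?<id>\d+)\/.*|(?<slug>.*))\.html$/;

// Hypothetical helper: returns the id for type1 URLs, the path otherwise,
// or null when the URL does not fit the assumed shape.
function keyFor(url) {
  const m = namedPattern.exec(url);
  if (!m) return null;
  const { id, slug } = m.groups;
  // Only one of the two named groups is set for any given match.
  return id ?? slug;
}

console.log(keyFor("www.test.de/type1/12345/this-is-a-title.html"));
console.log(keyFor("www.test.de/category/another-title-oh-yes.html"));
```

Either way the caller gets a single value back, which sidesteps the question of which group number the match ended up in.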