I have a URL, and I'm trying to match it to a regular expression to pull out some groups. The problem I'm having is that the URL can either end or continue with a "/" and more URL text. I'd like to match URLs like this:
But not match something like this:
So, I thought my best bet was something like this:
/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)[/$]
where the character class at the end contained either the "/" or the end-of-line. The character class doesn't seem to be happy with the "$" in there though. How can I best discriminate between these URLs while still pulling back the correct groups?
End of String or Line: $ The $ anchor specifies that the preceding pattern must occur at the end of the input string, or before \n at the end of the input string. If you use $ with the RegexOptions. Multiline option, the match can also occur at the end of a line.
$ means "Match the end of the string" (the position after the last character in the string).
\n. Matches a newline character. \r. Matches a carriage return character.
To match either / or end of content, use (/|\z)
This only applies if you are not using multi-line matching (i.e. you're matching a single URL, not a newline-delimited list of URLs).
To put that with an updated version of what you had:
/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\z)
Note that I've changed the start to be a non-greedy match for non-whitespace ( \S+?
) rather than matching anything and everything ( .*
)
You've got a couple regexes now which will do what you want, so that's adequately covered.
What hasn't been mentioned is why your attempt won't work: Inside a character class, $
(as well as ^
, .
, and /
) has no special meaning, so [/$]
matches either a literal /
or a literal $
rather than terminating the regex (/
) or matching end-of-line ($
).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With