
Regex to match URL end-of-line or "/" character

Tags:

regex

I have a URL, and I'm trying to match it to a regular expression to pull out some groups. The problem I'm having is that the URL can either end or continue with a "/" and more URL text. I'd like to match URLs like this:

  • http://server/xyz/2008-10-08-4
  • http://server/xyz/2008-10-08-4/
  • http://server/xyz/2008-10-08-4/123/more

But not match something like this:

  • http://server/xyz/2008-10-08-4-1

So, I thought my best bet was something like this:

/(.+)/(\d{4}-\d{2}-\d{2})-(\d+)[/$] 

where the character class at the end contained either the "/" or the end-of-line. The character class doesn't seem to be happy with the "$" in there though. How can I best discriminate between these URLs while still pulling back the correct groups?

Chris Farmer asked Oct 06 '08 16:10



2 Answers

To match either / or end of content, use (/|\z)

This only applies if you are not using multi-line matching (i.e. you're matching a single URL, not a newline-delimited list of URLs).


To put that with an updated version of what you had:

/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\z) 

Note that I've changed the start to be a non-greedy match for non-whitespace ( \S+? ) rather than matching anything and everything ( .+ ).
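As a rough check of the pattern above, here is a minimal Python sketch run against the four URLs from the question. Two things in it are assumptions rather than part of the answer: Python's re module spells the end-of-input anchor \Z instead of \z, so the pattern is adapted accordingly, and parse_url is a hypothetical helper name used only for illustration.

```python
import re

# The answer's pattern, with \z written as \Z because that is the
# spelling Python's re module uses for "end of input".
URL_PATTERN = re.compile(r"/(\S+?)/(\d{4}-\d{2}-\d{2})-(\d+)(/|\Z)")


def parse_url(url):
    """Return (date, number) if the URL matches, otherwise None.

    parse_url is a hypothetical helper, not something from the answer.
    """
    match = URL_PATTERN.search(url)
    if match is None:
        return None
    # Group 2 is the date, group 3 is the trailing number.
    return match.group(2), match.group(3)


if __name__ == "__main__":
    for url in (
        "http://server/xyz/2008-10-08-4",
        "http://server/xyz/2008-10-08-4/",
        "http://server/xyz/2008-10-08-4/123/more",
        "http://server/xyz/2008-10-08-4-1",
    ):
        print(url, "->", parse_url(url))
```

The first three URLs should yield the date and number groups, and the last one should yield None. For a newline-delimited list of URLs, (/|$) together with re.MULTILINE would be the closer fit, since \Z only matches at the very end of the input.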

Peter Boughton answered Oct 05 '22 16:10


You've got a couple regexes now which will do what you want, so that's adequately covered.

What hasn't been mentioned is why your attempt won't work: inside a character class, $ (as well as ., /, and ^ anywhere except the first position) has no special meaning, so [/$] matches either a literal / or a literal $ rather than terminating the regex (/) or matching end-of-line ($).
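The difference is easy to see with a small experiment. The sketch below uses Python's re module (an assumption, since the question does not name a regex flavour) and compares the character class [/$] with the alternation (?:/|$). The variable names and the second sample string, which ends in a literal dollar sign, are made up for illustration.

```python
import re

# Character class: "$" is just a literal dollar sign in here.
char_class = re.compile(r"(\d{4}-\d{2}-\d{2})-(\d+)[/$]")

# Alternation: "$" keeps its meaning as an end-of-string anchor.
alternation = re.compile(r"(\d{4}-\d{2}-\d{2})-(\d+)(?:/|$)")

at_end = "http://server/xyz/2008-10-08-4"
literal_dollar = "http://server/xyz/2008-10-08-4$"  # made-up sample

# The class needs an actual "/" or "$" character after the number,
# so a URL that simply ends there does not match.
class_matches_end = char_class.search(at_end) is not None
class_matches_dollar = char_class.search(literal_dollar) is not None

# The alternation accepts the end of the string, but not a literal "$".
alt_matches_end = alternation.search(at_end) is not None
alt_matches_dollar = alternation.search(literal_dollar) is not None

if __name__ == "__main__":
    print("class, URL ends after number:", class_matches_end)
    print("class, literal $ after number:", class_matches_dollar)
    print("alternation, URL ends after number:", alt_matches_end)
    print("alternation, literal $ after number:", alt_matches_dollar)
```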

Dave Sherohman answered Oct 05 '22 16:10