Regex: Substring the second last value between two slashes of a url string

Question

I have a string like this:

http://www.example.com/value/1234/different-value

How can I extract the 1234?

Note: There may be a slash at the end:

http://www.example.com/value/1234/different-value
http://www.example.com/value/1234/different-value/

Tim Pietzcker · Accepted Answer

/([^/]+)(?=/[^/]+/?$)

should work. You might need to format it differently according to the language you're using. For example, in Ruby, it's

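# Capture the segment that is followed by exactly one more segment
# (with an optional trailing slash) before the end of the string.
if subject =~ /\/([^\/]+)(?=\/[^\/]+\/?\Z)/
    match = $~[1]
else
    match = ""
end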
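
As a minimal sketch, the same pattern can be wrapped in a small method. The method name `second_to_last_segment` is hypothetical, and `%r{}` is used only so the slashes need no escaping. Unlike the snippet above, it returns nil rather than an empty string when nothing matches.

```ruby
# Hypothetical helper; the pattern is the one given in the answer above,
# anchored with \z (end of string).
def second_to_last_segment(url)
  # String#[] with a regexp and a capture index returns that capture,
  # or nil when the pattern does not match.
  url[%r{/([^/]+)(?=/[^/]+/?\z)}, 1]
end

second_to_last_segment('http://www.example.com/value/1234/different-value')
#=> "1234"

second_to_last_segment('http://www.example.com/value/1234/different-value/')
#=> "1234"
```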

Todd A. Jacobs · Answer

Use Slice for Positional Extraction

If you always want the element at index 4 of the split URI (the fifth element, counting the scheme and the empty string between the two slashes of "//"), and you are confident that your data is regular, you can use Array#slice as follows.

'http://www.example.com/value/1234/different-value'.split('/').slice 4
#=> "1234"

'http://www.example.com/value/1234/different-value/'.split('/').slice 4
#=> "1234"

This works reliably whether or not there is a trailing slash, whether or not the split yields more than five elements, and whether or not the element at index 4 is strictly numeric. It works because it is based on the element's position in the split string rather than on its contents. However, you will get nil if you parse a URI with fewer elements, such as http://www.example.com/1234/.
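
If the target is better described as "second from the end" than "index 4", one possible variation is to split only the path and index from the end. This is a different technique from the slice above, not part of the original answer. The method name `second_to_last_path_segment` is hypothetical, and the sketch assumes the input parses with the standard `uri` library.

```ruby
require 'uri'

# Hypothetical helper: returns the second-to-last non-empty path segment,
# or nil when the path has fewer than two segments.
def second_to_last_path_segment(url)
  # URI#path drops the scheme and host, e.g. "/value/1234/different-value".
  segments = URI.parse(url).path.split('/').reject(&:empty?)
  segments[-2]
end

second_to_last_path_segment('http://www.example.com/value/1234/different-value')
#=> "1234"

second_to_last_path_segment('http://www.example.com/value/1234/different-value/')
#=> "1234"

second_to_last_path_segment('http://www.example.com/1234/')
#=> nil
```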

Use Scan/Match for Pattern Extraction

Alternatively, if you know that the element you're looking for is always the only one composed entirely of digits, you can use String#match with look-arounds to extract just the numeric portion of the string.

'http://www.example.com/value/1234/different-value'.match %r{(?<=/)\d+(?=/)}
#=> #<MatchData "1234">

$&
#=> "1234"

The look-behind and look-ahead assertions are needed to anchor the expression to a path. Without them, you'll match things like w3.example.com too. This solution is a better approach if the position of the target element may change, and if you can guarantee that your element of interest will be the only one that matches the anchored regex.
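
To illustrate why the anchoring matters, here is a small comparison. The host name `w3.example.com` is taken from the paragraph above and the rest of the URL is an assumed example.

```ruby
url = 'http://w3.example.com/value/1234/different-value'

# Without look-arounds, the first run of digits is the "3" in the host name.
unanchored = url[/\d+/]
#=> "3"

# With look-arounds, only digits enclosed by slashes can match.
anchored = url[%r{(?<=/)\d+(?=/)}]
#=> "1234"
```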

If there will be more than one match (e.g. http://www.example.com/1234/5678/) then you might want to use String#scan instead to select the first or last match. This is one of those "know your data" things; if you have irregular data, then regular expressions aren't always the best choice.
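
A short sketch of the String#scan variant, using the example URL from the paragraph above. The variable names are illustrative only.

```ruby
url = 'http://www.example.com/1234/5678/'

# With no capture groups, scan returns every matched substring in order.
matches = url.scan(%r{(?<=/)\d+(?=/)})
#=> ["1234", "5678"]

first_match = matches.first
#=> "1234"

last_match = matches.last
#=> "5678"
```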

Tags: substring, regex, url, ruby

Daxon
