
Ruby regexp: capture the path of url

Tags: regex, ruby

From any URL I want to extract its path.

For example:

URL: https://stackoverflow.com/questions/ask
Path: questions/ask

It shouldn't be difficult:

url[/(?:\w{2,}\/).+/]

But I think I'm using the wrong construct for 'ignore this' (the non-capturing group '?:' doesn't work). What is the right way?

krn asked Feb 26 '11 20:02


2 Answers

I would suggest not doing this with a regular expression; use the built-in URI library instead:

require 'uri'

uri = URI.parse('http://stackoverflow.com/questions/ask')

puts uri.path # results in: /questions/ask

It has a leading slash, but that's easy to deal with =)
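
One possible way to deal with that slash, as a minimal sketch: strip a single leading "/" from the parsed path with String#sub. The helper name path_without_leading_slash is made up for illustration and is not part of the URI library.

```ruby
require 'uri'

# Hypothetical helper (name chosen for this example only): returns the
# URL's path with one leading slash removed, if there is one.
def path_without_leading_slash(url)
  URI.parse(url).path.sub(%r{\A/}, '')
end

puts path_without_leading_slash('https://stackoverflow.com/questions/ask')
# prints: questions/ask
```

For a URL with no path at all, URI#path is an empty string, so the helper returns an empty string as well.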

ctcherry answered Sep 22 '22 05:09


You can use a regex in this case, which is faster than URI.parse:

s = 'http://stackoverflow.com/questions/ask'

s[s[/.*?\/\/[^\/]*\//].size..-1]
# => "questions/ask"  (6.8 times faster)

s[/\/(?!.*\.).*/]
# => "/questions/ask" (9.9 times faster, but with a leading slash;
#    note that this pattern finds no match if the path itself contains a dot)

But if you don't care about speed, use URI as ctcherry showed; it is more readable.
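
As for why '?:' did not help in the question: a non-capturing group still takes part in the match, so its text stays in the result; it only means the group is not numbered. One alternative, sketched here under the assumption that the URL always starts with a simple scheme followed by "://", is to put the wanted part in a capturing group and ask String#[] for group 1. The constant name PATH_RE is made up for illustration.

```ruby
# Illustrative pattern (the constant name is an assumption for this sketch):
# scheme, "://", host, one slash, then capture everything that follows.
PATH_RE = %r{\A\w+://[^/]+/(.*)}

url = 'https://stackoverflow.com/questions/ask'

# The second argument selects capture group 1 instead of the whole match.
puts url[PATH_RE, 1]
# prints: questions/ask
```

If the URL has no slash after the host, the pattern does not match and the expression returns nil, so callers may want to handle that case.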

Guilherme Bernal answered Sep 18 '22 05:09