
How to write a regex to extract a number from these URLs?

Tags: regex, php

I'm trying to write a regex to match the numbers in these URLs (12345678 and 1234567890).

http://www.example.com/p/12345678
http://www.example.com/p/12345678?foo=bar
http://www.example.com/p/some-text-123/1234567890?foo=bar

Rules:

  • the numbers always come after a slash
  • the numbers can be varying lengths
  • the regex must check that the URLs have /p/ in them
  • the numbers may be at the end of the URL, or there could be variables after them

My attempt:

\/p\/([0-9]+)

That matches the first and second, but not the third. So I tried:

\/p\/[^\/?]*\/?([0-9]+)

No joy.


Nate asked Nov 10 '22 21:11


2 Answers

Regex might not be the right tool for this job. In every one of your examples, the number is the last segment of the URL's path, so parsing the URL and taking that segment is simpler and less fragile than matching the whole string. Many languages offer functions that split a URL into its constituent parts; in PHP, that is parse_url():

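// Take only the path, so the query string (?foo=bar) is ignored.
$path = parse_url($url, PHP_URL_PATH);
// Require the path to start with /p/ and its last segment to be all digits.
if (is_string($path) && strpos($path, "/p/") === 0 && ctype_digit(basename($path))) {
    $base = basename($path);
} else {
    // error: not a /p/ URL, or the last segment is not a number
}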

This handles all three example URLs, assuming $url holds the URL being parsed and the number is always the last path segment.
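As a rough sketch, the same idea can be wrapped in a function and run against the three URLs from the question. The function name numberFromPath is made up for illustration, and the digit check on the last segment is an added assumption about what should count as a valid result.

```php
<?php
// Hypothetical helper: returns the trailing number of a /p/ URL, or null.
function numberFromPath(string $url): ?string
{
    // Only the path component is inspected; the query string is dropped.
    $path = parse_url($url, PHP_URL_PATH);

    // parse_url() gives null or false when there is no usable path.
    if (!is_string($path) || strpos($path, '/p/') !== 0) {
        return null;
    }

    // The last path segment is the candidate number.
    $last = basename($path);

    // Assumption: accept it only if it consists solely of digits.
    return ctype_digit($last) ? $last : null;
}

$urls = [
    'http://www.example.com/p/12345678',
    'http://www.example.com/p/12345678?foo=bar',
    'http://www.example.com/p/some-text-123/1234567890?foo=bar',
];

foreach ($urls as $url) {
    echo $url, ' => ', var_export(numberFromPath($url), true), PHP_EOL;
}
```

Returning null instead of raising an error is just one way to signal a URL that does not fit the rules; adjust to taste.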

superultranova answered Nov 14 '22 22:11


I extended your version; it now works with all three examples:

\/p\/(.+\/)*(\d+)(\?.+=.+(&.+=.+)*)?$

If you don't need to validate the query string, you can shorten the regex to:

\/p\/(.+\/)*(\d+)($|\?)

https://regex101.com/r/pW5qB3/2
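For what it's worth, here is a minimal sketch of using the shorter pattern from PHP with preg_match(). The helper name numberFromRegex and the choice of "~" as the pattern delimiter are assumptions made for the example; the number ends up in the second capture group.

```php
<?php
// Hypothetical helper: applies the shorter pattern and returns the number, or null.
function numberFromRegex(string $url): ?string
{
    // "~" is used as the delimiter; group 2 captures the digits.
    $pattern = '~\/p\/(.+\/)*(\d+)($|\?)~';

    if (preg_match($pattern, $url, $matches) === 1) {
        return $matches[2];
    }

    // No match: the URL has no /p/ followed by a trailing number.
    return null;
}

$urls = [
    'http://www.example.com/p/12345678',
    'http://www.example.com/p/12345678?foo=bar',
    'http://www.example.com/p/some-text-123/1234567890?foo=bar',
];

foreach ($urls as $url) {
    echo $url, ' => ', var_export(numberFromRegex($url), true), PHP_EOL;
}
```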

msrd0 answered Nov 14 '22 21:11