Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

php relative urls to absolute urls conversion with eventually base href html tag [duplicate]

Tags:

html

url

php

I've a page loaded with DOM, then I want to convert all relative URLs of anchors to absolute URLs according to, eventually, the <base href> tag

I'm looking for something tested, not some random script that fails on some cases

I'm interested in parsing of every form of href="" usage:

href="relative.php"
href="/absolute1.php"
href="./relative.php"
href="../relative.php"
href="//absolutedomain.org"
href="." relative
href=".." relative
href="../" relative
href="./" relative

and more complex ones mixed

thank you in advance

like image 775
skyline26 Avatar asked Nov 18 '25 23:11

skyline26


1 Answers

This function will resolve relative URLs to a given current page url in $pgurl without regex. It successfully resolves:

/home.php?example types,

same-dir nextpage.php types,

../...../.../parentdir types,

full http://example.net urls,

and shorthand //example.net urls

//Current base URL (you can dynamically retrieve from $_SERVER)
$pgurl = 'http://example.com/scripts/php/absurl.php';

function absurl($url) {
 global $pgurl;
 if(strpos($url,'://')) return $url; //already absolute
 if(substr($url,0,2)=='//') return 'http:'.$url; //shorthand scheme
 if($url[0]=='/') return parse_url($pgurl,PHP_URL_SCHEME).'://'.parse_url($pgurl,PHP_URL_HOST).$url; //just add domain
 if(strpos($pgurl,'/',9)===false) $pgurl .= '/'; //add slash to domain if needed
 return substr($pgurl,0,strrpos($pgurl,'/')+1).$url; //for relative links, gets current directory and appends new filename
}

function nodots($path) { //Resolve dot dot slashes, no regex!
 $arr1 = explode('/',$path);
 $arr2 = array();
 foreach($arr1 as $seg) {
  switch($seg) {
   case '.':
    break;
   case '..':
    array_pop($arr2);
    break;
   case '...':
    array_pop($arr2); array_pop($arr2);
    break;
   case '....':
    array_pop($arr2); array_pop($arr2); array_pop($arr2);
    break;
   case '.....':
    array_pop($arr2); array_pop($arr2); array_pop($arr2); array_pop($arr2);
    break;
   default:
    $arr2[] = $seg;
  }
 }
 return implode('/',$arr2);
}

Usage Example:

echo nodots(absurl('../index.html'));

nodots() must be called after the URL is converted to absolute.

The dots function is kind of redundant, but is readable, fast, doesn't use regex's, and will resolve 99% of typical urls (if you want to be 100% sure, just extend the switch block to support 6+ dots, although I've never seen that many dots in a URL).

Hope this helps,

like image 186
Aaron Gillion Avatar answered Nov 21 '25 13:11

Aaron Gillion



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!