
Relative urls in a proxied website don't work

In PHP, I've written a proxy function that accepts a URL, user agent, and other settings. The function makes a cURL request for the website and prints the output, with proper HTML content-type headers, into an iframe (this is necessary only because I need to change some headers).

The proxied output often has lots of assets with relative URLs, and those URLs inherit the hostname of my site, not the proxied site:

Example: [http://MYSITE.com/proxy?url=http://somesite.com] returns the HTML of [http://somesite.com]

The response HTML contains markup like this:

<link rel="apple-touch-icon-precomposed" sizes="144x144" href="assets/ico/apple-touch-icon-144-precomposed.png">

The problem:

Instead of requesting that asset from http://somesite.com/assets/ico/apple-touch-icon-144-precomposed.png, the browser tries to find it at http://MYSITE.com/assets/ico/apple-touch-icon-144-precomposed.png, which is wrong.

The Question:

What do I need to do to get their relative-path assets to load properly via the proxy?

Kristian asked Oct 21 '12 03:10

1 Answer

How about the <base> tag? You can place it in the head and it will inform the browser what to use as the base path for all relative URLs on the page:

<head>
    <base href="http://somesite.com/">
</head>

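You could add it to each page that you serve with DOMDocument (this version uses item(0) instead of array dereferencing, so it also runs on PHP versions before 5.4):

// Match by prefix so headers like "text/html; charset=utf-8" also qualify.
if (stripos($contentType, 'text/html') === 0) {
    // Real-world markup is often malformed; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);

    // loadHTML() and createElement() are instance methods, so create the document first.
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $head = $doc->getElementsByTagName('head')->item(0);

    // Only add a <base> if the page has a <head> and does not already declare one.
    if ($head !== null && $head->getElementsByTagName('base')->length == 0) {
        $base = $doc->createElement('base');
        $base->setAttribute('href', $urlOfPageDir);
        // Insert it first so it precedes every element that uses a relative URL.
        $head->insertBefore($base, $head->firstChild);
    }

    echo $doc->saveHTML();
}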

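Note that $urlOfPageDir must be the absolute URL of the directory in which the page resides, ending in a slash (for example, http://somesite.com/blog/ for a page at http://somesite.com/blog/post.html). See this SO question for more on the base tag: Is it recommended to use the <base> html tag?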
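
One way to derive $urlOfPageDir from the URL passed to the proxy is sketched below. The helper name urlOfPageDir() is hypothetical and not part of the original answer, and the sketch assumes PHP 7 or later (for the type declarations and the ?? operator). It keeps the scheme, host, optional port, and the path up to its last slash, and drops the query string, fragment, and any user info. If your cURL request follows redirects, you would probably want to pass in the final URL rather than the requested one.

```php
<?php
// Hypothetical helper: derive the directory URL of a page from its absolute URL,
// for use as the <base href> value.
function urlOfPageDir(string $pageUrl): string
{
    $parts = parse_url($pageUrl);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $pageUrl");
    }

    $origin = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    // Keep everything up to and including the last slash of the path;
    // a missing path means the site root.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir;
}

echo urlOfPageDir('http://somesite.com'), "\n";                     // http://somesite.com/
echo urlOfPageDir('http://somesite.com/blog/post.html?id=3'), "\n"; // http://somesite.com/blog/
echo urlOfPageDir('http://somesite.com/blog/'), "\n";               // http://somesite.com/blog/
```

In the proxy, the result would be assigned to $urlOfPageDir before the DOMDocument step runs.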

Bailey Parker answered Sep 22 '22 22:09
