
Convert relative to absolute URLs in Go

I'm writing a small web crawler, and many of the links on the sites I crawl are relative (for example, /robots.txt). How do I convert these relative URLs to absolute URLs (so that /robots.txt becomes http://google.com/robots.txt)? Does Go have a built-in way to do this?

hiy asked Dec 09 '18 12:12




2 Answers

Yes, the standard library's net/url package can do this: parse both URLs and call ResolveReference on the base. Example (from the standard library):

package main

import (
    "fmt"
    "log"
    "net/url"
)

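func main() {
    // The relative reference to resolve.
    u, err := url.Parse("../../..//search?q=dotnet")
    if err != nil {
        log.Fatal(err)
    }
    // The absolute URL of the page the reference was found on.
    base, err := url.Parse("http://example.com/directory/")
    if err != nil {
        log.Fatal(err)
    }
    // Prints: http://example.com/search?q=dotnet
    fmt.Println(base.ResolveReference(u))
}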

Notice that you only need to parse the absolute (base) URL once; you can then reuse it to resolve any number of relative URLs.
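For a crawler, reusing the parsed base might look like the sketch below. The helper name resolveAll and the sample links are made up for illustration; only url.Parse and ResolveReference come from net/url. It returns on the first link that fails to parse, which a real crawler may prefer to skip instead.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// resolveAll is a hypothetical helper: it parses the base URL once and
// resolves every href against it, returning the absolute URLs as strings.
func resolveAll(base string, hrefs []string) ([]string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(hrefs))
	for _, h := range hrefs {
		ref, err := url.Parse(h)
		if err != nil {
			return nil, err
		}
		// ResolveReference returns a new URL; b itself is not modified.
		out = append(out, b.ResolveReference(ref).String())
	}
	return out, nil
}

func main() {
	// Sample page URL and links, chosen only for this example.
	links, err := resolveAll("http://example.com/directory/page.html", []string{
		"/robots.txt",
		"about.html",
		"../img/logo.png",
		"https://golang.org/pkg/",
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Links that are already absolute, such as the last one, come back unchanged, so the crawler does not need to check for them first.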

Not_a_Golfer answered Sep 29 '22 08:09


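To add to @Not_a_Golfer's solution:

You can also call the Parse method on the base URL. It parses its argument in the context of the base, and the argument may be either a relative or an absolute URL, so you don't need a separate url.Parse call for each link.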

package main

import (
    "fmt"
    "log"
    "net/url"
)

func main() {
    // parse only base url
    base, err := url.Parse("http://example.com/directory/")
    if err != nil {
        log.Fatal(err)
    }

    // and then use it to parse relative URLs
    u, err := base.Parse("../../..//search?q=dotnet")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(u.String())
}

Try it on Go Playground.

KenanBek answered Sep 29 '22 10:09