
Golang: Capture redirect URLs and status codes with timeouts

Tags: redirect, http, go

I am trying to make a request to a given URL and capture each redirect URL that was followed, along with its status code.

I've looked for an answer to my specific question; this came close.

However, I also need to add a proxy, a user agent, and a timeout on the entire connection, i.e. no matter how many redirects there are or how much latency the proxy adds, the total time should not exceed X seconds.

I've handled the user agent by setting a request header, and the proxy by adding it to the Transport struct. I tried exploring CheckRedirect for redirects, but that gives me only the URL. I needed the status code as well, so I had to implement the RoundTrip function.

Everything works well as of now, except for the timeout. Here's what I have so far (playground link). I've pasted the relevant code here as well. The playground has a full version with a mock redirect server in place. Unfortunately it panics with "connection refused", possibly because of playground restrictions. It works completely locally though.

type Redirect struct {
    StatusCode int
    URL string
}

type TransportWrapper struct {
    Transport http.RoundTripper
    Url string
    Proxy string
    UserAgent string
    TimeoutInSeconds int
    FinalUrl string
    RedirectUrls []Redirect
}
// Implementing Round Tripper to capture intermediate urls
func (t *TransportWrapper) RoundTrip(req *http.Request) (*http.Response, error) {
    transport := t.Transport
    if transport == nil {
        transport = http.DefaultTransport
    }

    resp, err := transport.RoundTrip(req)
    if err != nil {
        return resp, err
    }

    // Remember redirects
    if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
        t.RedirectUrls = append(
            t.RedirectUrls, Redirect{resp.StatusCode, req.URL.String()},
        )
    }
    return resp, err
}

func (t *TransportWrapper) Do() (*http.Response, error) {
    t.Transport = &http.Transport{}
    if t.Proxy != "" {
        proxyUrl, err := url.Parse(t.Proxy)
        if err != nil {
            return nil, err
        }

        t.Transport = &http.Transport{Proxy:http.ProxyURL(proxyUrl)}
        // HELP
        // Why does this fail
        // t.Transport.Proxy = http.ProxyUrl(proxyUrl)
    }

    client := &http.Client{
        Transport: t, // Since I've implemented RoundTrip I can pass this
        // Timeout: t.TimeoutInSeconds * time.Second, // This Fails 
    }

    req, err := http.NewRequest("GET", t.Url, nil)
    if err != nil {
        return nil, err
    }

    if t.UserAgent != "" {
        req.Header.Set("User-Agent", t.UserAgent)
    }

    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }

    t.FinalUrl = resp.Request.URL.String()
    return resp, nil
}

func startClient() {
    t := &TransportWrapper {
        Url: "http://127.0.0.1:8080/temporary/redirect?num=5",
        // Proxy
        // UserAgent
        // Timeout
    }

    _, err := t.Do()
    if err != nil {
        panic(err)
    }

    fmt.Printf("Intermediate Urls: \n")
    for i, v := range t.RedirectUrls {
        fmt.Printf("[%d] %d %s\n", i, v.StatusCode, v.URL)
    }

}

Question 1: How do I add the timeout?

Attempt #1 :

client := &http.Client{ Transport: t, Timeout: myTimeout }

But Go complains saying " *main.TransportWrapper doesn't support CancelRequest; Timeout not supported "

Attempt #2 :

// Adding a CancelRequest
func (t *TransportWrapper) CancelRequest(req *http.Request) {
    dt := http.DefaultTransport
    dt.CancelRequest(req)
}

But Go complains saying "dt.CancelRequest undefined (type http.RoundTripper has no field or method CancelRequest)"

How do I implement CancelRequest with minimal code and simply let the default CancelRequest take over?

Question 2: Have I gone down a bad path, and is there an alternative way to solve this problem?

Given a URL, proxy, user agent and timeout, return the response along with the redirect URLs that were followed to get there and their status codes.

I hope I've worded this appropriately.

Thanks

asked Sep 27 '22 23:09 by Abhishek Shivanna


1 Answer

There is already a hook for checking redirects, Client.CheckRedirect.

You can supply a callback to do what you want.

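If you really want to create your own transport to extend other functionality, you need to supply a CancelRequest method, as the error says, so that Client.Timeout can be handled.

// Assumes t.Transport is a concrete *http.Transport, which has CancelRequest;
// the http.RoundTripper interface does not.
func (t *TransportWrapper) CancelRequest(req *http.Request) {
    t.Transport.CancelRequest(req)
}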

More commonly, you would embed the Transport, so that all of its methods and fields are automatically promoted. However, you should avoid writable fields in the transport, since a transport is expected to be safe for concurrent use. If you do keep such fields, either protect every access with a mutex or make sure the transport is only used from one goroutine.

A minimal example would look like:

type TransportWrapper struct {
    *http.Transport
    RedirectUrls []Redirect
}

func (t *TransportWrapper) RoundTrip(req *http.Request) (*http.Response, error) {
    transport := t.Transport
    if transport == nil {
        transport = http.DefaultTransport.(*http.Transport)
    }

    resp, err := transport.RoundTrip(req)
    if err != nil {
        return resp, err
    }

    // Remember redirects
    if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
        fmt.Println("redirected")
        t.RedirectUrls = append(
            t.RedirectUrls, Redirect{resp.StatusCode, req.URL.String()},
        )
    }
    return resp, err
}

And you can then use the timeout in the client:

client := &http.Client{
    Transport: &TransportWrapper{
        Transport: http.DefaultTransport.(*http.Transport),
    },
    Timeout: 5 * time.Second,
}
answered Oct 03 '22 04:10 by JimB