How to get the HTTP response body using chromedp?

I'm using github.com/knq/chromedp, a Go package that drives web browsers through the Chrome Debugging Protocol. I can navigate to web pages, fill in forms and submit them, but I haven't figured out how to retrieve an HTTP response body. Specifically, I'd like to retrieve the body of a JSON response (not HTML).

From looking in the code, it seems the HTTP response body is in the CachedResponse.Body property:

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/cdp/cachestorage/types.go#L30

And that it should be accessible using:

func (p *RequestCachedResponseParams) Do(ctxt context.Context, h cdp.Handler) (response *CachedResponse, err error)

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/cdp/cachestorage/cachestorage.go#L168

The examples use cdp.Tasks such as the following from the simple example.

func googleSearch(q, text string, site, res *string) cdp.Tasks {
    var buf []byte
    sel := fmt.Sprintf(`//a[text()[contains(., '%s')]]`, text)
    return cdp.Tasks{
        cdp.Navigate(`https://www.google.com`),
        cdp.Sleep(2 * time.Second),
        cdp.WaitVisible(`#hplogo`, cdp.ByID),
        cdp.SendKeys(`#lst-ib`, q+"\n", cdp.ByID),
        cdp.WaitVisible(`#res`, cdp.ByID),
        cdp.Text(sel, res),
        cdp.Click(sel),
        cdp.Sleep(2 * time.Second),
        cdp.WaitVisible(`#footer`, cdp.ByQuery),
        cdp.WaitNotVisible(`div.v-middle > div.la-ball-clip-rotate`, cdp.ByQuery),
        cdp.Location(site),
        cdp.Screenshot(`#testimonials`, &buf, cdp.ByID),
        cdp.ActionFunc(func(context.Context, cdptypes.Handler) error {
            return ioutil.WriteFile("testimonials.png", buf, 0644)
        }),
    }
}

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/examples/simple/main.go

It seems that CachedResponse.Body can be read by calling RequestCachedResponseParams.Do() with RequestCachedResponseParams.CacheID set, but I still need to work out the following:

  1. how to call RequestCachedResponseParams.Do() in cdp.Tasks - seems possible using cdp.ActionFunc()
  2. how to get access to RequestCachedResponseParams.CacheID

Grokify asked Aug 22 '17 04:08


1 Answer

If you want to get the response to a request, this is how I managed to do it.

This sample loads http://www.google.com and listens for EventResponseReceived to capture the Response, which contains the headers, for example.

package main

import (
    "context"
    "io/ioutil"
    "log"
    "os"
    "time"

    "github.com/chromedp/cdproto/network"
    "github.com/chromedp/chromedp"
)

func main() {
    dir, err := ioutil.TempDir("", "chromedp-example")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(dir)

    opts := append(chromedp.DefaultExecAllocatorOptions[:],
        chromedp.DisableGPU,
        chromedp.NoDefaultBrowserCheck,
        chromedp.Flag("headless", false),
        chromedp.Flag("ignore-certificate-errors", true),
        chromedp.Flag("window-size", "50,400"),
        chromedp.UserDataDir(dir),
    )

    allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    defer cancel()

    // also set up a custom logger
    taskCtx, cancel := chromedp.NewContext(allocCtx, chromedp.WithLogf(log.Printf))
    defer cancel()

    // create a timeout
    taskCtx, cancel = context.WithTimeout(taskCtx, 10*time.Second)
    defer cancel()

    // ensure that the browser process is started
    if err := chromedp.Run(taskCtx); err != nil {
        panic(err)
    }

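    // register the network event listener before navigating
    listenForNetworkEvent(taskCtx)

    // enable network events, then load the page; check the error instead of discarding it
    if err := chromedp.Run(taskCtx,
        network.Enable(),
        chromedp.Navigate(`http://www.google.com`),
        chromedp.WaitVisible(`body`, chromedp.BySearch),
    ); err != nil {
        panic(err)
    }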

}

func listenForNetworkEvent(ctx context.Context) {
    chromedp.ListenTarget(ctx, func(ev interface{}) {
        switch ev := ev.(type) {
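        case *network.EventResponseReceived:
            // ev.Response carries the URL, status, MIME type and headers
            resp := ev.Response
            if len(resp.Headers) != 0 {
                // Headers is a map, so print it with %v rather than %s
                log.Printf("received headers: %v", resp.Headers)
            }
        // handle other network events here as needed
        }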
    })
}
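
The listener above only logs headers. To get at a JSON body, one possible approach (a sketch, not something the sample above does) is to remember which request IDs came back with a JSON MIME type and fetch the body only after loading has finished, since the body is generally not readable before then. The bookkeeping part can be sketched with the standard library alone. The names jsonTracker, OnResponse and OnFinished and the example.com URLs are hypothetical and are not part of chromedp; the comments say where the chromedp event fields would plug in.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// jsonTracker is a hypothetical helper (not part of chromedp). It remembers
// which in-flight requests returned a JSON MIME type so their bodies can be
// fetched once loading has finished.
type jsonTracker struct {
	mu      sync.Mutex
	pending map[string]string // request ID -> response URL
}

func newJSONTracker() *jsonTracker {
	return &jsonTracker{pending: make(map[string]string)}
}

// OnResponse records the response if its MIME type looks like JSON.
// Assumption: with chromedp this would be called from the
// *network.EventResponseReceived case, passing string(ev.RequestID),
// ev.Response.URL and ev.Response.MimeType.
func (t *jsonTracker) OnResponse(id, url, mimeType string) {
	if !strings.Contains(strings.ToLower(mimeType), "json") {
		return
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	t.pending[id] = url
}

// OnFinished reports whether the finished request was a tracked JSON
// response, and forgets it either way.
// Assumption: with chromedp this would be called from the
// *network.EventLoadingFinished case with string(ev.RequestID).
func (t *jsonTracker) OnFinished(id string) (url string, ok bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	url, ok = t.pending[id]
	delete(t.pending, id)
	return url, ok
}

func main() {
	t := newJSONTracker()
	// Simulated events: an HTML page load and a JSON API call.
	t.OnResponse("1", "https://example.com/", "text/html")
	t.OnResponse("2", "https://example.com/api/items", "application/json")

	for _, id := range []string{"1", "2"} {
		if url, ok := t.OnFinished(id); ok {
			// This is the point at which the body would be requested.
			fmt.Println("fetch body for request", id, url)
		}
	}
}
```

When OnFinished returns ok, the body itself could then be requested with network.GetResponseBody(requestID) from github.com/chromedp/cdproto/network, for example inside a chromedp.ActionFunc passed to a later chromedp.Run call. Treat that wiring as an assumption to check against the chromedp version you use. The mutex is there because event callbacks and the code that reads the collected IDs may run on different goroutines.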

morpheus0010 answered Oct 01 '22 03:10