Chrome officially supports running the browser in headless mode (including programmatic control via the Puppeteer API and/or the CRI library).
I've searched through the documentation, but I haven't found how to programmatically capture the AJAX traffic from the instances (ie. start an instance of Chrome from code, navigate to a page, and access the background response/request calls & raw data (all from code not using the developer tools or extensions).
Do you have any suggestions or examples detailing how this could be achieved? Thanks!
Update
As @Alejandro pointed out in the comment, resourceType is a function and the return value is lowercased
page.on('request', request => {
if (request.resourceType() === 'xhr')
// do something
});
Original answer
Puppeteer's API makes this really easy:
page.on('request', request => {
if (request.resourceType === 'XHR')
// do something
});
You can also intercept requests with setRequestInterception, but it's not needed in this example if you're not going to modify the requests.
There's an example of intercepting image requests that you can adapt.
resourceTypes are defined here.
Puppeteer's listeners could help you capture xhr response via response and request event.
You should check wether request.resourceType() is xhr or fetch first.
listener = page.on('response', response => {
const isXhr = ['xhr','fetch'].includes(response.request().resourceType())
if (isXhr){
console.log(response.url());
response.text().then(console.log)
}
})
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With