I want to know how to cache a file with Puppeteer so that it doesn't have to be loaded again every time the script starts. Assume I have this script:
const puppeteer = require('puppeteer');

async function run() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://www.amazon.com/");
  await browser.close(); // close() returns a promise, so wait for it
}

run();
If I wanted to save the HTML so that it doesn't have to be loaded again, how would I do it? I searched and found "How can I disable cache in puppeteer?", but neither the question nor the answer there gives much detail. Could someone explain how to save the HTML in a cache, for example?
Puppeteer uses Chrome (or Firefox) under the hood, so caching is enabled by default. Unless you have explicitly disabled it with calls like these:
await page.setCacheEnabled(false);
await pageSession.send('Network.setCacheDisabled', { cacheDisabled: true });
resources are already cached by the browser and you don't need to do anything manually.
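One caveat worth adding: as far as I know, `puppeteer.launch()` starts from a fresh temporary profile unless told otherwise, so the browser cache does not carry over to the next run of the script. Here is a minimal sketch of keeping it between runs. The `./.puppeteer-cache` directory and the helper name `persistentLaunchOptions` are made up for illustration; `userDataDir` is the launch option that sets the profile directory.

```javascript
const path = require('path');

// Hypothetical helper: build launch options that point the browser at a
// profile directory that survives between runs (the HTTP cache lives in it).
function persistentLaunchOptions(cacheDir = './.puppeteer-cache') {
  return { userDataDir: path.resolve(cacheDir) };
}

async function run(url) {
  // Required lazily so the helper above works without puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(persistentLaunchOptions());
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
  } finally {
    await browser.close();
  }
}
```

Calling `run("https://www.amazon.com/")` in two separate script runs should let the second one reuse what the first one cached. What actually gets reused still depends on the site's cache headers, so the main HTML document may well be fetched again anyway.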
However, if you want to test against a cached page, you need to warm the cache up by visiting the page once before the tests, as in this example:
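import type { Page } from 'puppeteer';

// Visit the page once so its resources land in the browser cache, then close the tab.
async function warmingBrowser(url: URL, pageInstance: Page) {
  await pageInstance.goto(url.href, { waitUntil: 'networkidle0' });
  await pageInstance.close();
}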
The code is taken from perfrunner.
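A rough JavaScript sketch of how such a warm-up might be combined with a second, cached visit in the same browser. The function name `visitWarm` and the returned `fromCache` count are made up for illustration and are not part of perfrunner; `response.fromCache()` is Puppeteer's way of telling whether a response was served from the browser cache.

```javascript
// Hypothetical helper: visit `url` once to warm the cache, then visit it again
// in a new tab of the same browser and count responses served from cache.
async function visitWarm(browser, url) {
  const warmup = await browser.newPage();
  await warmup.goto(url, { waitUntil: 'networkidle0' });
  await warmup.close();

  const page = await browser.newPage();
  let fromCache = 0;
  page.on('response', (response) => {
    if (response.fromCache()) fromCache += 1;
  });
  await page.goto(url, { waitUntil: 'networkidle0' });
  await page.close();
  return { fromCache };
}
```

Both tabs have to come from the same `browser` instance; a freshly launched browser would start with an empty cache again unless it reuses a profile directory.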
If you want the page to work completely offline, Puppeteer will not help with that; you need to implement your own caching strategy using a ServiceWorker.
Be aware that there are pitfalls at this step, specifically around caching and invalidating the cache.
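For the narrower case in the question (just keeping a copy of the HTML so the script doesn't fetch it on every start), a plain file cache on the Node side may be enough. This is only a sketch under assumptions: the names `cacheFileFor`, `fetchWithPuppeteer` and `getHtml` and the `./html-cache` directory are all made up, and nothing here ever invalidates the cache.

```javascript
const fs = require('fs').promises;
const path = require('path');
const crypto = require('crypto');

// Hypothetical helper: map a URL to a file name inside the cache directory.
function cacheFileFor(url, cacheDir) {
  const name = crypto.createHash('sha256').update(url).digest('hex') + '.html';
  return path.join(cacheDir, name);
}

// Default fetcher: render the page with Puppeteer and return its HTML.
async function fetchWithPuppeteer(url) {
  const puppeteer = require('puppeteer'); // required lazily, only on a cache miss
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });
    return await page.content();
  } finally {
    await browser.close();
  }
}

// Return the cached HTML if present; otherwise fetch it and write it to disk.
async function getHtml(url, cacheDir = './html-cache', fetchHtml = fetchWithPuppeteer) {
  const file = cacheFileFor(url, cacheDir);
  try {
    return await fs.readFile(file, 'utf8');
  } catch (err) {
    if (err.code !== 'ENOENT') throw err; // only a missing file counts as a miss
  }
  const html = await fetchHtml(url);
  await fs.mkdir(cacheDir, { recursive: true });
  await fs.writeFile(file, html, 'utf8');
  return html;
}
```

With this, `await getHtml("https://www.amazon.com/")` only launches the browser on the first call. The saved HTML can later be fed to `page.setContent(html)`, but images, scripts and other subresources are not stored by this sketch, so it does not give you a fully offline page.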