
How to intercept downloads of blobs generated on the client side of a website with Puppeteer?

I have a page at https://master.d3tei1upkyr9mb.amplifyapp.com/report with three export buttons. These buttons generate XLSX, CSV, and PDF files on the frontend, so there are no URLs for the files.

I need Puppeteer to download or intercept these files, as blobs or buffers, in my Node backend.

I have tried several approaches but haven't figured it out yet.

It works with the Playwright library, using the code below, but I need to do the same thing with Puppeteer.

const {chromium} = require('playwright');
const fs = require('fs');

(async () => {
    const browser = await chromium.launch();
    const context = await browser.newContext({acceptDownloads: true});
    const page = await context.newPage();

    await page.goto('http://localhost:3000/');

    const [ download ] = await Promise.all([
        page.waitForEvent('download'), // <-- start waiting for the download
        page.click('button#expoXLSX') // <-- perform the action that directly or indirectly initiates it.
    ]);

    const path = await download.path();

    console.log(path);

    // readFileSync is synchronous and returns a Buffer, so no await is needed
    const newFile = fs.readFileSync(path);

    console.log(newFile);

    fs.writeFile("test.xlsx", newFile,  "binary",function(err) {
        if(err) {
            console.log(err);
        } else {
            console.log("The file was saved!");
        }
    });

    await browser.close()
})();

Is there any way?

Pi-An- Up asked Nov 06 '22 03:11



1 Answer

Is there any reason not to simulate the click on the frontend and let Puppeteer download the file to a location of your choice? You can download the file that way with the following:

Edit: You can determine when the file download completes by listening to the Page.downloadProgress event and checking for the completed state. This method does not guarantee that you get the actual filename saved to disk, but the Page.downloadWillBegin event provides a suggestedFilename, which in my tests so far (at least on the example page in the question) matches the filename persisted to disk.

const puppeteer = require('puppeteer');
const path = require('path');
const downloadPath = path.resolve('./download');


(async ()=> {
  let fileName;
  const browser = await puppeteer.launch({
      headless: false
  });
  
  const page = await browser.newPage();
  await page.goto(
      'https://master.d3tei1upkyr9mb.amplifyapp.com/report', 
      { waitUntil: 'networkidle2' }
  );
  
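  // page._client is a private API whose shape has changed across Puppeteer
  // releases, so open a CDP session explicitly instead.
  const client = await page.target().createCDPSession();

  // Enable the Page domain so its events are delivered to this session.
  await client.send('Page.enable');

  await client.send('Page.setDownloadBehavior', {
      behavior: 'allow',
      downloadPath: downloadPath
  });

  // on() registers the listener synchronously, so there is nothing to await.
  client.on('Page.downloadWillBegin', ({ url, suggestedFilename }) => {
    console.log('download beginning,', url, suggestedFilename);
    fileName = suggestedFilename;
  });

  client.on('Page.downloadProgress', ({ state }) => {
    if (state === 'completed') {
      console.log('download completed. File location: ', path.join(downloadPath, fileName));
    }
  });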

  await page.click('button#expoPDF');
})();
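
Since the question asks for the file contents as a buffer in Node, the two event listeners above could be wrapped in a promise and the finished file read back from disk. The following is only a sketch under a few assumptions: `client` is the CDP session object that the script above calls `send` and `on` on, and it also exposes `off` for removing listeners. `waitForDownload` and `downloadToBuffer` are hypothetical helper names, not part of Puppeteer, and the file is assumed to be saved under the suggested filename, which, as noted above, is not guaranteed.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: resolves with the expected file path once the
// download reports 'completed'; rejects on cancel or timeout.
function waitForDownload(client, downloadPath, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    let fileName;

    const onBegin = ({ suggestedFilename }) => {
      fileName = suggestedFilename;
    };

    const onProgress = ({ state }) => {
      if (state === 'completed') {
        cleanup();
        if (fileName) resolve(path.join(downloadPath, fileName));
        else reject(new Error('download completed but no filename was reported'));
      } else if (state === 'canceled') {
        cleanup();
        reject(new Error('download was canceled'));
      }
    };

    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('timed out waiting for the download'));
    }, timeoutMs);

    // Remove both listeners and the timer so nothing lingers afterwards.
    function cleanup() {
      clearTimeout(timer);
      client.off('Page.downloadWillBegin', onBegin);
      client.off('Page.downloadProgress', onProgress);
    }

    client.on('Page.downloadWillBegin', onBegin);
    client.on('Page.downloadProgress', onProgress);
  });
}

// Hypothetical helper: start listening first, then run the action that
// triggers the download, and finally read the saved file into a Buffer.
async function downloadToBuffer(client, downloadPath, trigger) {
  const [filePath] = await Promise.all([
    waitForDownload(client, downloadPath),
    trigger(),
  ]);
  return fs.promises.readFile(filePath);
}
```

Inside the script above, this could be called as `const buffer = await downloadToBuffer(client, downloadPath, () => page.click('button#expoPDF'));`. Registering the listeners before the click mirrors the `Promise.all` pattern in the Playwright code from the question, so the download event cannot be missed.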
willascend answered Nov 15 '22 07:11
