
Can we download a webpage completely with chrome.downloads.download? (Google Chrome Extension)

I want to save a web page completely from my Google Chrome extension. I added the "downloads" and "<all_urls>" permissions and confirmed that the following code saves the Google page to google.html.

  chrome.downloads.download(
      { url: "http://www.google.com",
        filename: "google.html" },
      function (x) { console.log(x); });

However, this code saves only the HTML file. Stylesheets, scripts and images are not saved. I want to save the web page completely, as if I had saved it from the Save dialog with Format: Webpage, Complete selected.

I looked into the documentation but couldn't find a way.

So my question is: how can I download a web page completely from an extension using the Google Chrome API(s)?

itchyny asked Jul 10 '14 08:07





1 Answer

The downloads API downloads a single resource only. To save a complete web page, first open the page in a tab, export it as MHTML with chrome.pageCapture.saveAsMHTML, create a blob: URL for the exported Blob with URL.createObjectURL, and finally save that URL with the chrome.downloads.download API.

The pageCapture API requires a valid tabId. For instance:

// Create new tab, wait until it is loaded and save the page
chrome.tabs.create({
    url: 'http://example.com'
}, function(tab) {
    chrome.tabs.onUpdated.addListener(function func(tabId, changeInfo) {
        if (tabId == tab.id && changeInfo.status == 'complete') {
            chrome.tabs.onUpdated.removeListener(func);
            savePage(tabId);
        }
    });
});

function savePage(tabId) {
    chrome.pageCapture.saveAsMHTML({
        tabId: tabId
    }, function(blob) {
        var url = URL.createObjectURL(blob);
        // Optional: chrome.tabs.remove(tabId); // to close the tab
        chrome.downloads.download({
            url: url,
            filename: 'whatever.mhtml'
        });
    });
}
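
The callbacks above do not check for failures. As a minimal sketch (not part of the original answer), the same two steps can be wrapped in a Promise that inspects chrome.runtime.lastError after each call. The name savePageChecked and the `api` and `urlApi` parameters are hypothetical; they fall back to the real `chrome` and `URL` globals and exist only so the logic can be exercised outside an extension.

```javascript
// Hypothetical helper: export a tab as MHTML and download it,
// rejecting the returned Promise if either step reports an error.
// `api` stands in for the global `chrome`, `urlApi` for the global `URL`.
function savePageChecked(tabId, filename, api, urlApi) {
    api = api || chrome;
    urlApi = urlApi || URL;
    return new Promise(function(resolve, reject) {
        api.pageCapture.saveAsMHTML({ tabId: tabId }, function(blob) {
            var captureError = api.runtime.lastError;
            if (captureError || !blob) {
                reject(new Error(captureError ? captureError.message
                                              : 'No MHTML data returned'));
                return;
            }
            var url = urlApi.createObjectURL(blob);
            api.downloads.download({
                url: url,
                filename: filename
            }, function(downloadId) {
                var downloadError = api.runtime.lastError;
                if (downloadError || downloadId === undefined) {
                    reject(new Error(downloadError ? downloadError.message
                                                   : 'Download did not start'));
                    return;
                }
                // Resolve with the id of the started download.
                resolve(downloadId);
            });
        });
    });
}
```

Inside the extension it could be called as savePageChecked(tabId, 'whatever.mhtml').then(...).catch(...), in place of savePage(tabId).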

To try it out, put the code above in background.js, add the permissions to manifest.json (as shown below) and reload the extension. example.com will then be opened, and the web page will be saved as a self-contained MHTML file.

{
    "name": "Save full web page",
    "version": "1",
    "manifest_version": 2,
    "background": {
        "scripts": ["background.js"]
    },
    "permissions": [
        "pageCapture",
        "downloads"
    ]
}
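
The background script in this answer always opens example.com in a new tab. Since pageCapture only needs a valid tabId, a page that is already open could be saved instead by looking up its tab with chrome.tabs.query. The sketch below is an illustration under that assumption, not part of the original answer. The name saveActiveTab and the injected `api` parameter are hypothetical, and the sketch assumes the "pageCapture" permission is sufficient for the page in question.

```javascript
// Hypothetical helper: find the active tab in the current window and pass
// its id to a save function, such as savePage(tabId) from the answer.
// `api` stands in for the global `chrome` object.
function saveActiveTab(save, api) {
    api = api || chrome;
    api.tabs.query({ active: true, currentWindow: true }, function(tabs) {
        if (api.runtime.lastError || !tabs || tabs.length === 0) {
            console.warn('No active tab to save');
            return;
        }
        save(tabs[0].id);
    });
}
```

Calling saveActiveTab(savePage) from the background script would then capture whichever tab is active at that moment.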
Rob W answered Sep 28 '22 09:09
