Headless Chrome to print pdf

I am trying to use the headless feature of Chrome to convert an HTML page to PDF. However, I am not getting any output at all, and the console doesn't show any error either. I am running the command below on my Windows machine.

chrome --headless --disable-gpu --print-to-pdf

I tried all the various options. Nothing is being generated. I am using Chrome version 60.

asked Sep 06 '17 11:09

user2580925

1 Answer

Command Line --print-to-pdf

By default, --print-to-pdf attempts to create a PDF in the User Directory. By default, that user directory is where the actual chrome binary is stored, which is the specific version folder for the version you're running - for example, "C:\Program Files (x86)\Google\Chrome\Application\61.0.3163.100". And, by default... Chrome is not allowed to write to this folder. You can watch it try, and fail, by adding --enable-logging to your command.

So unfortunately, by default, this command fails.*

You can solve this either by providing a path in the argument that Chrome can write to, like:

--print-to-pdf="C:\Users\Jane\test.pdf"

Or, you can change the User Directory:

--user-data-dir="C:\Users\Jane"

One reason you might prefer to change the User Directory is if you want the PDF to automatically receive its name from the webpage. Chrome looks at the title tag and turns it into a file name, like <title>My Page</title> => My-Page.pdf

*I think this default behavior is super confusing, and should be filed as a bug against Chrome. However, apparently part of the Chrome team is outright opposed to the mere existence of this command line option, and instead believes it would be better to remove the flag outright and force everyone using it to get a node.js build going with Puppeteer.

Limitations of Command Line on Windows

Invoking Chrome this way works fine in, for example, a local dev environment on IIS Express with Visual Studio. It will fail, even in headless mode, on a server running IIS, because IIS users are not given interactive/desktop permissions, and the way Chrome grabs this PDF actually requires them. There are complicated ways to provide those permissions, but anyplace you read up on how begins with DON'T PROVIDE INTERACTIVE/DESKTOP PERMISSIONS. Further, the risk mentioned above of Chrome one day getting rid of the command-line option makes putting even more work into getting it working an iffy proposition.
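For calling the command line from .NET code, a minimal sketch could look like the following. It is only an illustration under stated assumptions: the class name HeadlessChromePdf, its method names, the chromePath parameter and the 30 second timeout are all made up for this example, and you would need to point chromePath at your own chrome.exe. It passes an explicit, writable path to --print-to-pdf as described above, and it is subject to the same IIS limitation.

```csharp
using System;
using System.Diagnostics;

public static class HeadlessChromePdf
{
    // Builds the command-line arguments. Kept separate from the launch
    // so the resulting string can be inspected without starting Chrome.
    public static string BuildArguments(string url, string pdfPath)
    {
        return "--headless --disable-gpu "
            + "--print-to-pdf=\"" + pdfPath + "\" "
            + "\"" + url + "\"";
    }

    // chromePath is an assumption: supply the location of your chrome.exe.
    // Returns Chrome's exit code; throws if Chrome does not exit in time.
    public static int PrintToPdf(
        string chromePath, string url, string pdfPath, int timeoutMs = 30000)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = chromePath,
            Arguments = BuildArguments(url, pdfPath),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            if (!process.WaitForExit(timeoutMs))
            {
                process.Kill();
                throw new TimeoutException(
                    "Chrome did not finish within the timeout.");
            }
            return process.ExitCode;
        }
    }
}
```

An exit code alone does not prove a PDF was written, so it is probably worth checking that the file at pdfPath exists afterwards.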

Alternatives to chrome command line

wkhtmltopdf

wkhtmltopdf is a separate command-line tool that renders HTML to PDF using the Qt WebKit engine; it is not what Chrome uses behind the scenes. I haven't tried it, but it's likely to get the job done. The one minor risk is this: when producing PDFs with Chrome, testing is obvious. View the page in Chrome, and open Print Preview if you're nervous. wkhtmltopdf uses a different rendering engine than Chrome, and that may produce rendering differences. Maybe.

Selenium

Another alternative is to get ahead of the group looking to get rid of --print-to-pdf and use the browser dev API (via Selenium) as they prefer.**

// Requires the Selenium.WebDriver NuGet package and a matching chromedriver.
using System;
using System.Collections.Generic;

private static void pdfSeleniumImpl(string url, string pdfPath)
{
    var options = new OpenQA.Selenium.Chrome.ChromeOptions();
    options.AddArgument("headless");

    using (var chrome = new OpenQA.Selenium.Chrome.ChromeDriver(options))
    {
        // Navigate to the page to print.
        chrome.Url = url;

        // An empty options dictionary makes Page.printToPDF use its defaults.
        var printToPdfOpts = new Dictionary<string, object>();
        var resultDict = (Dictionary<string, object>)
            chrome.ExecuteChromeCommandWithResult(
                "Page.printToPDF", printToPdfOpts);

        // The PDF comes back base64-encoded in the "data" field.
        dynamic result = new DDict(resultDict);
        string data = result.data;
        var pdfFile = Convert.FromBase64String(data);
        System.IO.File.WriteAllBytes(pdfPath, pdfFile);
    }
}

The DDict above is the GracefulDynamicDictionary from another of my answers.

https://www.nuget.org/packages/GracefulDynamicDictionary/

https://github.com/b9chris/GracefulDynamicDictionary

https://stackoverflow.com/a/24192518/176877

Ideally this would be async, since all the calls to Selenium are actually network commands, and writing that file could take a lot of disk I/O. The data returned from Chrome is actually a Stream as well. Unfortunately, Selenium's conventionally used library does not use async at all, so it would take upgrading that library or identifying a solid async Selenium library for .NET to really do this right.

https://github.com/puppeteer/puppeteer/blob/master/lib/Page.js#L1007

https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF
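Even without an async Selenium library, the file write on its own can be made asynchronous. The sketch below is a hypothetical helper (the PdfWriter class and SavePdfAsync name are made up for this example) and assumes a runtime where File.WriteAllBytesAsync is available, such as .NET Core 2.0 or later. It takes the base64 string from the "data" field of the Page.printToPDF result and writes the decoded bytes without blocking the calling thread. The Selenium calls themselves would still be synchronous.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class PdfWriter
{
    // base64Pdf is the "data" field returned by Page.printToPDF.
    public static async Task SavePdfAsync(string base64Pdf, string pdfPath)
    {
        // Decoding happens in memory; only the disk write is awaited.
        byte[] pdfBytes = Convert.FromBase64String(base64Pdf);
        await File.WriteAllBytesAsync(pdfPath, pdfBytes);
    }
}
```

In the method above, this would replace the Convert.FromBase64String and File.WriteAllBytes lines, with the calling method itself changed to async.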

**The Page.printToPDF Chrome Dev API command (exposed in Puppeteer as page.pdf) is also deprecated, so if that contingent gets their way, neither the command line nor the Dev API will work. That said, it looks like those lobbying to wreck it gave up 2 years ago.

answered Oct 17 '22 04:10

Chris Moschini
