Is there a way to trigger the PDF export feature in PhantomJS without specifying an output file with the .pdf extension? We'd like to use stdout to output the PDF.
Sorry for the extremely long answer; I have a feeling that I'll need to refer to this method several dozen times in my life, so I'll write "one answer to rule them all". I'll first babble a little about files, file descriptors, (named) pipes, and output redirection, and then answer your question.
Consider this simple C99 program:
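#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    /* Expect one argument: the path to write to. */
    if (argc < 2) {
        /* Diagnostics go to stderr so they never mix with the output. */
        fprintf(stderr, "Usage: %s file_name\n", argv[0]);
        return 1;
    }

    /* "w" creates the file if needed and truncates it otherwise. */
    FILE* file = fopen(argv[1], "w");
    if (!file) {
        fprintf(stderr, "Cannot open file: %s\n", argv[1]);
        return 2;
    }

    fprintf(file, "some text...");
    fclose(file);
    return 0;
}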
Very straightforward. It takes an argument (a file name) and prints some text into it. Couldn't be any simpler.
Compile it with clang write_to_file.c -o write_to_file.o or gcc write_to_file.c -o write_to_file.o.
Now, run ./write_to_file.o some_file (which prints into some_file). Then run cat some_file. The result, as expected, is some text...
Now let's get fancier. Type (./write_to_file.o /dev/stdout) > some_file in the terminal. We're asking the program to write to its standard output (instead of a regular file), and then we're redirecting that stdout to some_file (using > some_file). We could've used any of the following to achieve this:
(./write_to_file.o /dev/stdout) > some_file, which means "use stdout"
(./write_to_file.o /dev/stderr) 2> some_file, which means "use stderr, and redirect it using 2>"
(./write_to_file.o /dev/fd/2) 2> some_file, which is the same as above; stderr is the third file descriptor assigned to Unix processes by default (after stdin and stdout)
(./write_to_file.o /dev/fd/5) 5> some_file, which means "use your sixth file descriptor, and redirect it to some_file"
In case it's not clear, the program is writing through a file descriptor that the shell set up for it, instead of opening the target file by name (everything is a file in Unix, after all). We can do all sorts of fancy things with that descriptor: redirect it to a file, or point it at a named pipe and share it between different processes.
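The redirection trick can be tried without compiling anything. The sketch below is only an illustration, assuming a system that exposes /dev/fd (Linux and macOS do). A printf inside a child shell stands in for the C program, and the workdir directory is a made-up name for the demo.

```shell
# Work in a throwaway directory so nothing in the current one is touched.
workdir=$(mktemp -d)

# The outer shell opens some_file as descriptor 5 for the child process.
# The child never sees the name "some_file"; it only writes to /dev/fd/5.
sh -c 'printf "some text..." > /dev/fd/5' 5> "$workdir/some_file"

# Read the file back to confirm the text arrived through descriptor 5.
result=$(cat "$workdir/some_file")
echo "$result"

rm -rf "$workdir"
```

The child process only ever knows about "descriptor 5"; where that descriptor points is decided entirely by the shell that launched it.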
Now, let's create a named pipe:
mkfifo my_pipe
If you type ls -l now, you'll see:
total 32
prw-r--r-- 1 pooriaazimi staff 0 Jul 15 09:12 my_pipe
-rw-r--r-- 1 pooriaazimi staff 336 Jul 15 08:29 write_to_file.c
-rwxr-xr-x 1 pooriaazimi staff 8832 Jul 15 08:34 write_to_file.o
Note the p at the beginning of the second line. It means that my_pipe is a (named) pipe.
Now, let's specify what we want to do with our pipe:
gzip -c < my_pipe > out.gz &
It means: gzip what I put inside my_pipe and write the results to out.gz. The & at the end asks the shell to run this command in the background. You'll get something like [1] 10449 and control returns to the terminal.
Then, simply redirect the output of our C program to this pipe:
(./write_to_file.o /dev/fd/5) 5> my_pipe
Or
./write_to_file.o my_pipe
You'll get
[1]+ Done gzip -c < my_pipe > out.gz
which means the gzip command has finished.
Now, do another ls -l:
total 40
prw-r--r-- 1 pooriaazimi staff 0 Jul 15 09:14 my_pipe
-rw-r--r-- 1 pooriaazimi staff 32 Jul 15 09:14 out.gz
-rw-r--r-- 1 pooriaazimi staff 336 Jul 15 08:29 write_to_file.c
-rwxr-xr-x 1 pooriaazimi staff 8832 Jul 15 08:34 write_to_file.o
We've successfully gzipped our text!
Execute gzip -d out.gz to decompress this gzipped file. It will be deleted and a new file (out) will be created. cat out gets us:
some text...
which is what we expected.
Don't forget to remove the pipe with rm my_pipe!
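The whole named-pipe walkthrough can be condensed into one script. This is a minimal sketch under a few assumptions: printf replaces the compiled C program as the writer, everything happens in a temporary directory (the workdir name is invented here), and gzip is installed.

```shell
# Throwaway directory holding the pipe and the compressed output.
workdir=$(mktemp -d)
mkfifo "$workdir/my_pipe"

# Start the reader first, in the background. It blocks until a writer
# opens the pipe, then compresses whatever comes through.
gzip -c < "$workdir/my_pipe" > "$workdir/out.gz" &

# Write into the pipe exactly as if it were an ordinary file.
printf 'some text...' > "$workdir/my_pipe"

# Wait for gzip to see end-of-file and finish writing out.gz.
wait

# Decompress to stdout to check the round trip.
roundtrip=$(gzip -dc "$workdir/out.gz")
echo "$roundtrip"

rm -rf "$workdir"
```

The order matters: the reader has to be started in the background before the writer runs, otherwise the writer blocks forever waiting for someone to open the other end of the pipe.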
Now back to PhantomJS.
This is a simple PhantomJS script (render.coffee, written in CoffeeScript) that takes two arguments: a URL and a file name. It loads the URL, renders it and writes it to the given file name:
system = require 'system'

renderUrlToFile = (url, file, callback) ->
  page = require('webpage').create()
  page.viewportSize = { width: 1024, height : 800 }
  page.settings.userAgent = 'Phantom.js bot'

  page.open url, (status) ->
    if status isnt 'success'
      console.log "Unable to render '#{url}'"
    else
      page.render file
    delete page
    callback url, file

url = system.args[1]
file_name = system.args[2]

console.log "Will render to #{file_name}"
renderUrlToFile "http://#{url}", file_name, (url, file) ->
  console.log "Rendered '#{url}' to '#{file}'"
  phantom.exit()
Now type phantomjs render.coffee news.ycombinator.com hn.png in the terminal to render the Hacker News front page into the file hn.png. It works as expected. So does phantomjs render.coffee news.ycombinator.com hn.pdf.
Let's repeat what we did earlier with our C program:
(phantomjs render.coffee news.ycombinator.com /dev/fd/5) 5> hn.pdf
It doesn't work... :( Why? Because, as stated in PhantomJS's manual:
render(fileName)
Renders the web page to an image buffer and save it as the specified file.
Currently the output format is automatically set based on the file extension. Supported formats are PNG, JPEG, and PDF.
It fails simply because neither /dev/fd/5 nor /dev/stdout ends in .png, .pdf, etc.
But no fear, named pipes can help you!
Create another named pipe, but this time use the extension .pdf:
mkfifo my_pipe.pdf
Now, tell the shell to simply cat the pipe's input to hn.pdf:
cat < my_pipe.pdf > hn.pdf &
Then run:
phantomjs render.coffee news.ycombinator.com my_pipe.pdf
And behold the beautiful hn.pdf!
Obviously you want to do something more sophisticated than just cat-ing the output, but I'm sure it's clear now what you should do :)
In short:
1. Create a named pipe with the .pdf file extension (so it fools PhantomJS into thinking it's a PDF file):
mkfifo my_pipe.pdf
2. Do whatever you want with the contents of the pipe, running the reader in the background, like:
cat < my_pipe.pdf > hn.pdf &
which simply cats it to hn.pdf.
3. In PhantomJS, render to this file/pipe.
4. Later on, remove the pipe:
rm my_pipe.pdf
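To see why the pipe's name matters, here is a small sketch that runs without PhantomJS. It assumes a hypothetical stand-in function, fake_render, which imitates only the one behaviour quoted from the manual: it picks the format from the file extension and refuses other paths. The names fake_render, workdir, and size.txt are invented for this illustration and are not part of PhantomJS.

```shell
# Hypothetical stand-in for a renderer that, like page.render(fileName),
# decides the output format from the file extension alone.
fake_render() {
    case "$1" in
        *.pdf) printf '%%PDF-stand-in' > "$1" ;;
        *)     echo "unsupported extension: $1" >&2; return 1 ;;
    esac
}

workdir=$(mktemp -d)
mkfifo "$workdir/my_pipe.pdf"

# A path without a recognised extension is rejected.
if fake_render /dev/stdout 2>/dev/null; then rejected=no; else rejected=yes; fi

# Consumer: it only counts bytes here, but any command reading stdin works.
wc -c < "$workdir/my_pipe.pdf" > "$workdir/size.txt" &

# Producer: the .pdf suffix on the pipe satisfies the extension check.
fake_render "$workdir/my_pipe.pdf"
wait

size=$(tr -d ' ' < "$workdir/size.txt")
echo "rejected=$rejected size=$size"

rm -rf "$workdir"
```

With the real tool, the fake_render call would be replaced by the phantomjs render.coffee invocation shown earlier, and wc by whatever should consume the PDF bytes.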
You can output directly to stdout without a need for a temporary file.
page.render('/dev/stdout', { format: 'pdf' });
See here for history on when this was added.
If you want to get HTML from stdin and output the PDF to stdout, see here
As pointed out by Niko, you can use renderBase64() to render the web page to an image buffer and return the result as a base64-encoded string.
But for now this will only work for PNG, JPEG and GIF.
To write something from a PhantomJS script to stdout, just use the filesystem API.
I use something like this for images:
var base64image = page.renderBase64('PNG');
var fs = require("fs");
fs.write("/dev/stdout", base64image, "w");
I don't know if the PDF format for renderBase64() will be in a future version of PhantomJS, but as a workaround something along these lines may work for you:
page.render(output);
var fs = require("fs");
var pdf = fs.read(output);
fs.write("/dev/stdout", pdf, "w");
fs.remove(output);
Where output is the path to the PDF file.