I'm trying to create a PDF from an HTML page using the wicked_pdf (version 1.1) and wkhtmltopdf-binary gems.
My HTML page contains a calendar emoji that displays correctly in the browser, whatever font I use:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv='content-type' content='text/html; charset=utf-8' />
<style>
unicode {
font-family: 'OpenSansEmoji', sans-serif;
}
@font-face {
font-family: 'OpenSansEmoji';
src: url(data:font/truetype;charset=utf-8;base64,<-- encoded_font_base64_string-->) format('truetype');
}
</style>
</head>
<body>
<div><unicode>📅</unicode></div>
</body>
</html>
However, when I try to generate the PDF in the Rails console using the gem's WickedPdf.new.pdf_from_html_file method,
File.open(File.expand_path('~/<--pdf_filename-->.pdf'), 'wb+') {|f| f.write WickedPdf.new.pdf_from_html_file('<--absolute_path_of_html_file-->')}
I get the following result:
[Image: PDF result with unknown character]
As you can see, the first calendar icon is displayed properly, but it is followed by a second character, and we do not know where it comes from.
I have investigated UTF-8 and UTF-16 encoding and surrogate pairs, as suggested by this related post (stackoverflow_emoji_wkhtmltopdf), and looked at this issue (wkhtmltopdf_git_issue), but I still can't make this character disappear!
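For anyone repeating the encoding investigation, here is a minimal sketch for irb or the Rails console that prints how the calendar emoji is represented in UTF-8 and UTF-16. The helper name describe_char is made up for this illustration and is not part of wicked_pdf.

```ruby
# Hypothetical helper (not part of any gem): describe one character's encodings.
def describe_char(char)
  {
    # Unicode code point, e.g. "U+1F4C5"
    codepoint: format('U+%04X', char.ord),
    # The bytes as stored in a UTF-8 HTML file
    utf8_bytes: char.encode('UTF-8').bytes.map { |b| format('%02X', b) },
    # The 16-bit code units; two units means a surrogate pair
    utf16_units: char.encode('UTF-16BE').unpack('n*').map { |u| format('%04X', u) }
  }
end

info = describe_char("\u{1F4C5}") # the calendar emoji from the page above
puts info
```

The emoji lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 (F0 9F 93 85) and a surrogate pair in UTF-16 (D83D DCC5). That is why surrogate handling is a natural suspect when one extra glyph appears.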
If you have any clue, it's more than welcome.
Thanks in advance for your help!
EDIT
Following the comments from Eric Duminil and petkov.np, I can confirm that the code above works properly for me on Linux. This seems to be a Linux vs. macOS issue. Can anyone suggest what the core of the issue is in the macOS binding, and whether it can be fixed?
I've edited this answer several times, please see the notes at the end as well as the comments.
I'm using macOS 10.12.2 and have the same issue. I'm listing all the browser and tool versions, although I suspect the biggest factor is the OS/wkhtmltopdf build.
I'm using the following example snippet:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html" charset="utf-8">
<style type="text/css">
p {
font-family: 'EmojiSymbols', sans-serif;
}
@font-face {
font-family: 'EmojiSymbols';
src: local('EmojiSymbols-Regular.woff'), url('EmojiSymbols-Regular.woff') format('woff');
}
span:before {
content: '\01F60B';
}
</style>
</head>
<body>
<p>
😋
<span></span>
😋
😋
😋
</p>
</body>
</html>
I'm calling wkhtmltopdf with the --encoding 'UTF-8' option.
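As a rough sketch, a call like that can be assembled from Ruby as below. The helper name wkhtmltopdf_command and the file names are placeholders invented for this example; only the --encoding flag comes from the text above.

```ruby
# Hypothetical helper: build the argument list for a wkhtmltopdf call
# with an explicit input encoding.
def wkhtmltopdf_command(input_path, output_path, encoding: 'UTF-8')
  ['wkhtmltopdf', '--encoding', encoding, input_path, output_path]
end

cmd = wkhtmltopdf_command('emoji.html', 'emoji.pdf') # placeholder file names
puts cmd.join(' ')
# system(*cmd) would run it; passing an array avoids shell quoting problems.
```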
You can see the rendered result here (I'm sorry for the lame screenshot). Some brief conclusions:
wkhtmltopdf renders the raw bytes (sort of) OK, but doesn't render the CSS content property properly. Every 'proper' occurrence of the Unicode symbol is followed by this strange phantom symbol.
I've tried literally everything, but the results are the same. For me, the fact that even Safari doesn't render the raw bytes properly indicates some system-level problem that is macOS-specific. It's unclear to me whether this should be reported as a wkhtmltopdf issue or whether there is some misbehaving dependency in the macOS build.
EDIT: Safari seems to work fine, my markup was broken.
EDIT: A CSS workaround may do the trick, please check the comments below.
FINAL EDIT: As shown in the comments, the CSS 'hack' that solves the issue is text-rendering: optimizeLegibility;. This seems to be needed only on macOS/OS X.
From my comment below:
I just found this issue. It seems irrelevant at first glance, but adding text-rendering: optimizeLegibility; to my styles removed the duplicate characters (on macOS). Why this happens is beyond me. As the issue author also uses OS X, it's apparent there is some problem with wkhtmltopdf builds for this OS.
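If editing the stylesheet by hand is not convenient, the workaround can be injected into the HTML before it is handed to wicked_pdf. This is only a sketch: the helper inject_text_rendering_fix and the choice to apply the rule to body are assumptions made for this example (the rule could equally be added to the element that holds the emoji).

```ruby
# Assumption: applying the rule to <body> is enough; adjust the selector if needed.
TEXT_RENDERING_FIX = '<style>body { text-rendering: optimizeLegibility; }</style>'

# Hypothetical helper: insert the workaround rule just before the first </head>.
def inject_text_rendering_fix(html)
  if html.match?(%r{</head>}i)
    # sub touches only the first match; the block keeps the original tag text.
    html.sub(%r{</head>}i) { |closing_tag| TEXT_RENDERING_FIX + closing_tag }
  else
    # No <head> section found: fall back to prepending the rule.
    TEXT_RENDERING_FIX + html
  end
end

# Usage with wicked_pdf (not executed here; placeholders as in the question):
# html = File.read('<--absolute_path_of_html_file-->')
# pdf  = WickedPdf.new.pdf_from_string(inject_text_rendering_fix(html))
# File.binwrite(File.expand_path('~/<--pdf_filename-->.pdf'), pdf)
```

Because the rule is only reported to matter on macOS/OS X, it should be harmless to leave in place when the same HTML is rendered on Linux, though that has not been verified here.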