
Unicode Characters in ggplot2 PDF Output

How can I use Unicode characters for labels, titles and similar things in a PDF plot created with ggplot2?

Consider the following example:

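    library(ggplot2)
    # Scatter plot whose title contains non-ASCII (small caps) characters
    qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
    # Save the last plot; the device is chosen from the file extension
    ggsave("t.pdf")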

The title of the plot uses Unicode characters (small caps), which in the output appear as .... The problem occurs only with PDF plots; if I replace the last line with ggsave("t.png"), the output is as expected.

What am I doing wrong? The R script I have is in UTF-8 encoding. Some system information:

    R version 2.14.1 (2011-12-22)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

While searching for a solution, I found some evidence that R uses a single-byte encoding for PDF or PostScript output, even when the text is in a multi-byte encoding such as UTF-8. I also found suggestions for getting individual characters to work, for instance the Euro sign, but no general solution.
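One way to see why a single-byte output encoding is a problem for this title is to check whether the string survives conversion to such an encoding. The following is only a sketch and not part of the original question: the helper name fits_single_byte is hypothetical, and how iconv() treats unconvertible characters can vary with the platform's iconv implementation.

```r
# Hypothetical helper: TRUE if `x` can be represented in the given
# single-byte encoding. With the default sub = NA, iconv() returns NA
# for strings it cannot convert.
fits_single_byte <- function(x, encoding = "latin1") {
  !is.na(iconv(x, from = "UTF-8", to = encoding))
}

title <- "A\u0299\u1d04\u1d05"  # "Aʙᴄᴅ", the start of the plot title

print(fits_single_byte("Sepal length"))  # plain ASCII text
print(fits_single_byte(title))           # small caps are outside Latin-1
```

For a character that does exist in some single-byte encoding, such as the Euro sign in ISO Latin 9, the pdf() device takes an encoding argument (for example encoding = "ISOLatin9.enc"), which is presumably the kind of suggestion mentioned above. That approach cannot cover characters like the small caps used here.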

Asked by stefan, Oct 07 '12 10:10




1 Answer

As Ben suggested, cairo_pdf() is your friend. It also allows you to embed non-PostScript fonts (i.e. TTF/OTF) in the PDF via the family argument (crucial if you don't happen to have any PostScript fonts that contain the glyphs you want to use). For example:

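    library(ggplot2)
    # Open a cairo-based PDF device; `family` selects the font to embed
    cairo_pdf("example.pdf", family="DejaVu Sans")
    qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
    # Close the device so the file is written
    dev.off()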

...gives a PDF that looks like this:

[Image: ggplot2 graph with custom font family and non-ASCII characters in the title]

See also this question; though it doesn't look directly relevant from the title, there is a lot in there about getting fonts to do what you want in R.
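Since the question uses ggsave(), the same idea can also be sketched without opening the device by hand. This assumes a reasonably recent ggplot2 in which ggsave() accepts a device function, an R build with cairo support, and an installed DejaVu Sans font; none of this is confirmed for the R 2.14.1 setup in the question.

```r
library(ggplot2)

# cairo_pdf() is only usable if this build of R has cairo support.
stopifnot(capabilities("cairo"))

p <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
  geom_point() +
  ggtitle("Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")

# Extra arguments such as `family` are passed on to the device function.
# The 7 x 7 inch size is an arbitrary choice for this sketch.
ggsave("t.pdf", plot = p, device = cairo_pdf,
       width = 7, height = 7, family = "DejaVu Sans")
```

If the named font is not installed, cairo will typically substitute another one, so it is worth opening the resulting PDF to confirm the title renders as intended.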

EDIT: per request in the comments, here is the Windows-specific code:

    library(ggplot2)
    # Register the system font under a name that R can refer to
    windowsFonts(myCustomWindowsFontName=windowsFont("DejaVu Sans"))
    cairo_pdf("example.pdf", family="myCustomWindowsFontName")
    qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
    dev.off()

To use the base graphics command cairo_pdf() it should suffice to just define your font family with the windowsFonts() command first, as shown above. Of course, make sure you use a font that you actually have on your system, and that actually has all the glyphs that you need.
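To check which glyphs a font actually needs to provide, it can help to list the code points used in the title. The snippet below is a small sketch along those lines, not something from the original answer; comparing the resulting list against a font's coverage still has to be done with a font viewer or similar tool.

```r
title <- "Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ"

# Code points of every distinct non-ASCII character in the title.
code_points <- unique(utf8ToInt(title))
non_ascii <- code_points[code_points > 127]

# Print them in the usual U+XXXX notation.
print(sprintf("U+%04X", non_ascii))

# On Windows, also list the font families R already knows by name.
if (.Platform$OS.type == "windows") {
  print(names(windowsFonts()))
}
```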

The instructions about DLL files in the comments below are what I had to do to get the Cairo() and CairoPDF() commands in library(Cairo) to work on Windows. Then:

    library(ggplot2)
    library(Cairo)
    windowsFonts(myCustomWindowsFontName=windowsFont("DejaVu Sans"))
    CairoPDF("example.pdf")
    # Set the font family on the open device
    par(family="myCustomWindowsFontName")
    qplot(Sepal.Length, Petal.Length, data=iris, main="Aʙᴄᴅᴇғɢʜɪᴊᴋʟᴍɴᴏᴘǫʀsᴛᴜᴠᴡxʏᴢ")
    dev.off()

Answered by drammock, Sep 29 '22 14:09