
R Markdown v2 to PDF: conversion error when plots contain non-Latin characters

Non-English characters inside plots are not displayed correctly. Here is a reproducible example:

---
title: "Untitled"
output:
  pdf_document:
    latex_engine: xelatex
  html_document:
    highlight: tango
    theme: null
---

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

```{r}
summary(cars)
```

You can also embed plots, for example:

```{r, echo=FALSE}
plot(cars, main="Τίτλος στα ελληνικά")
```

Knitting to PDF produces several warnings like the following before the plot (they can be suppressed with warning = FALSE), and the plot itself does not display the non-English title.

## Warning: conversion failure on 'Τίτλος στα ελληνικά' in 'mbcsToSbcs': dot substituted for <ce>
## Warning: conversion failure on 'Τίτλος στα ελληνικά' in 'mbcsToSbcs': dot substituted for <a4>
## Warning: conversion failure on 'Τίτλος στα ελληνικά' in 'mbcsToSbcs': dot substituted for <ce>
## Warning: conversion failure on 'Τίτλος στα ελληνικά' in 'mbcsToSbcs': dot substituted for


I have found that specifying dev='cairo_pdf' in the chunk options does the trick, but then the plot is not visible in the HTML output.
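
For reference, the chunk-level workaround described above would look roughly like this, reusing the plot chunk from the example. It is only a sketch of where the option goes; as noted, it fixes the PDF but leaves the HTML output without a visible plot.

```{r, echo=FALSE, dev='cairo_pdf'}
# dev='cairo_pdf' applies to this chunk only; other chunks keep the default device
plot(cars, main="Τίτλος στα ελληνικά")
```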

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253    LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                  LC_TIME=Greek_Greece.1253    

attached base packages:
 [1] datasets  grDevices splines   graphics  stats     grid      tcltk     utils     methods   base     

other attached packages:
 [1] tis_1.23            GGally_0.4.7        mratios_1.3.17      fpp_0.5             lmtest_0.9-33       expsmooth_2.02      fma_2.01           
 [8] tseries_0.10-32     forecast_5.4        xts_0.9-7           stringr_0.6.2       beeswarm_0.1.6      colorspace_1.2-4    latticeExtra_0.6-26
[15] RColorBrewer_1.0-5  amap_0.8-12         gridExtra_0.9.1     corrplot_0.73       psych_1.4.5         pgirmess_1.5.9      pastecs_1.3-18     
[22] boot_1.3-11         xtable_1.7-3        plyr_1.8.1          zoo_1.7-11          googleVis_0.4.5     RJSONIO_1.3-0       ggthemes_1.7.0     
[29] knitr_1.6           fBasics_3010.86     timeSeries_3010.97  timeDate_3010.98    MASS_7.3-33         RODBC_1.3-10        car_2.0-20         
[36] sos_1.3-8           brew_1.0-6          reshape2_1.4        scales_0.2.4        ggplot2_1.0.0       svSocket_0.9-57     TinnR_1.0-5        
[43] R2HTML_2.2.1        Hmisc_3.14-4        Formula_1.1-2       survival_2.37-7     lattice_0.20-29    

loaded via a namespace (and not attached):
 [1] cluster_1.15.2   coda_0.16-1      deldir_0.1-6     digest_0.6.4     evaluate_0.5.5   formatR_0.10     fracdiff_1.4-2   gtable_0.1.2     htmltools_0.2.4 
[10] labeling_0.2     LearnBayes_2.15  Matrix_1.1-4     munsell_0.4.2    mvtnorm_1.0-0    nlme_3.1-117     nnet_7.3-8       parallel_3.1.1   proto_0.3-10    
[19] quadprog_1.5-5   Rcpp_0.11.2      reshape_0.8.5    rgdal_0.8-16     rmarkdown_0.2.54 sp_1.0-15        spdep_0.5-74     splancs_2.01-34  stabledist_0.6-6
[28] svMisc_0.9-70    tools_3.1.1      yaml_2.1.13 
Brani asked Aug 08 '14 07:08


2 Answers

You can check the value of the rmarkdown.pandoc.to option while knitting and set the graphics device dynamically:

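# Pandoc target of the current render; NULL when not run through rmarkdown
my_output <- knitr::opts_knit$get("rmarkdown.pandoc.to")

# identical() is safe when my_output is NULL, whereas == would raise an error
if (identical(my_output, "latex")) {
  knitr::opts_chunk$set(dev = 'cairo_pdf', dev.args = list(cairo_pdf = list(family = 'Times New Roman')))
}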
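
The same check can be wrapped in a small helper so that the non-PDF case is explicit. This is only a sketch: `pick_device` is a hypothetical name, not part of knitr, and `"png"` is assumed as the fallback because it is the device knitr normally uses for HTML output.

```r
# Hypothetical helper (not part of knitr): map the pandoc target to a device name.
pick_device <- function(pandoc_to) {
  # identical() also copes with NULL, which is what
  # knitr::opts_knit$get("rmarkdown.pandoc.to") returns outside rmarkdown.
  if (identical(pandoc_to, "latex")) "cairo_pdf" else "png"
}

# In a setup chunk one might then write (assumes knitr is installed):
# knitr::opts_chunk$set(
#   dev = pick_device(knitr::opts_knit$get("rmarkdown.pandoc.to"))
# )

pick_device("latex")  # "cairo_pdf"
pick_device("html")   # "png"
pick_device(NULL)     # "png"
```

Keeping the decision in one function makes it easy to add further targets later without touching individual chunks.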
George Dontas answered Oct 31 '22 14:10


If you run the task as two steps in the console, bypassing RStudio's button, it seems to work for HTML.

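library(knitr)
library(markdown)
# Step 1: knit the .Rmd to plain Markdown (writes myfile.md)
knit('myfile.Rmd', encoding = "UTF-8")
# Step 2: convert the Markdown file to HTML
markdownToHtml('myfile.md', 'myfile.html')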

My session:

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] markdown_0.7 knitr_1.6   

loaded via a namespace (and not attached):
[1] colorspace_1.2-4 digest_0.6.4     evaluate_0.5.5   formatR_0.10     ggplot2_1.0.0            grid_3.1.0      
[7] gtable_0.1.2     htmltools_0.2.4  lattice_0.20-29  MASS_7.3-33      mime_0.1.1       munsell_0.4.2   
[13] plyr_1.8.1       proto_0.3-10     Rcpp_0.11.2      reshape2_1.4     rmarkdown_0.2.49 scales_0.2.4    
[19] stringr_0.6.2    tools_3.1.0      yaml_2.1.13 


Edit: The above succeeds for me on Windows/MINGW if the locale is set as well:

Sys.setlocale("LC_ALL","greek") 
knit2html('myfile.Rmd',encoding="UTF-8") 
sessionInfo() 

My session:

R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253   
[3] LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                 
[5] LC_TIME=Greek_Greece.1253    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] knitr_1.6

loaded via a namespace (and not attached):
[1] evaluate_0.5.5  formatR_0.10    grid_3.1.0      lattice_0.20-29 markdown_0.7.2 
[6] mime_0.1.2      stringr_0.6.2   tools_3.1.0    
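
The `Sys.setlocale()` call in the edit above changes the locale for the rest of the session. A hedged sketch of doing it temporarily follows; `with_ctype` is a hypothetical helper, and it assumes that `LC_CTYPE` is the category that matters here, whereas the answer itself sets `LC_ALL`.

```r
# Hypothetical helper: evaluate `expr` under a temporary LC_CTYPE, then restore it.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) when the name is not known
  # on this platform; "greek" is a Windows locale name.
  applied <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (!nzchar(applied)) {
    message("locale '", locale, "' not available; keeping ", old)
  }
  # expr is a lazy argument, so it is only evaluated here, after the switch
  force(expr)
}

# Possible usage (assumes knitr is installed and myfile.Rmd exists):
# with_ctype("greek", knitr::knit2html("myfile.Rmd", encoding = "UTF-8"))
```

Because the restore is registered with `on.exit()`, the previous locale comes back even if knitting stops with an error.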
mrbcuda answered Oct 31 '22 15:10