
Sweave not printing localized characters

Tags: r, pdf, sweave

I'm trying to incorporate some plots from R in my LaTeX document through Sweave.

\SweaveOpts{eps = FALSE, pdf = TRUE, echo = FALSE, prefix = TRUE, prefix.string = data}

<<label = abundanca:barplot, fig = TRUE, include = FALSE, results = hide>>=
barplot(abund, xlab="Vzorčne postaje", ylab="Abundanca", main="", col="slategrey", names.arg=c("HM1", "HM2", "HM3", "HM4", "HM5", "HM6", "HM7", "HM8", "HM9", "HM10"))
@

The pdf device in Sweave uses the native encoding (as set in options("encoding")), which doesn't recognize local characters (ščćž) in xlab and replaces each of them with two dots.

I have tried setting the option to something that works in R:

options("encoding" = "CP1250.enc")

but I get an error:

Error in file() : unsupported conversion from 'CP1250.enc' to ''

Any solutions, workarounds...?

EDIT

Running aL3xa's Rnw through

R CMD Sweave report.Rnw

doesn't work.

Running the same file through Eclipse+StatET

Sweave("report.Rnw")

however, does.

My .Rnw file and a .pdf.

Roman Luštrik asked Aug 08 '10 12:08



1 Answer

This one is not as simple as it may seem. Technically, the problem depends on the OS, the locale, the PDF writer and Sweave (see "R Installation and Administration", chapter 7). Since I'm running GNU/Linux, this "solution" is not addressed to Mac and Windows users. And just to make things a bit more complicated, GNU/Linux distros differ, so there's a good chance that something that works on, say, Ubuntu won't work on Arch Linux.

I'll be using the mtcars dataset. Let's create a basic graph with localized characters:

pdf("foo.pdf")
boxplot(mpg ~ cyl, data = mtcars, ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja")
dev.off()

(a crash course in Serbian: "Potrošnja goriva" means fuel consumption, "Broj cilindara" means number of cylinders, and "Dijagram raspršenja" is the equivalent of scatterplot)

Now I get a bunch of warnings:

Warning messages:
1: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Dijagram raspršenja' in 'mbcsToSbcs': dot substituted for <c5>
2: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Dijagram raspršenja' in 'mbcsToSbcs': dot substituted for <a1>
3: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Dijagram raspršenja' in 'mbcsToSbcs': dot substituted for <c5>
4: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Dijagram raspršenja' in 'mbcsToSbcs': dot substituted for <a1>
5: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Potrošnja goriva' in 'mbcsToSbcs': dot substituted for <c5>
6: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Potrošnja goriva' in 'mbcsToSbcs': dot substituted for <a1>
7: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Potrošnja goriva' in 'mbcsToSbcs': dot substituted for <c5>
8: In title(ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja") :
  conversion failure on 'Potrošnja goriva' in 'mbcsToSbcs': dot substituted for <a1>

options(encoding = "CP1250") doesn't do the trick: I get the same warnings. pdf.options(encoding = "CP1250") fixes it, and so does pdf(file = "foo.pdf", encoding = "CP1250"). So I restore my old encoding with options(encoding = "native.enc"), set pdf.options as stated above, and the plot comes out right.
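The difference between the two settings can be tried out with a minimal sketch like the one below. It assumes a temporary output file and writes the labels with \u escapes (both are choices made here for illustration, not part of the original test). options(encoding) is the default for file connections, while the PDF device keeps its own encoding setting, which is the one that matters for plot labels.

```r
# Labels use \u escapes so the sketch does not depend on how this script
# itself is encoded; "\u0161" is the letter s with caron.
ylab <- "Potro\u0161nja goriva"
main <- "Dijagram raspr\u0161enja"

out <- tempfile(fileext = ".pdf")  # hypothetical output location

# Option 1: pass the encoding to this one device only.
pdf(file = out, encoding = "CP1250")
boxplot(mpg ~ cyl, data = mtcars,
        ylab = ylab, xlab = "Broj cilindara", main = main)
invisible(dev.off())

# Option 2: change the default for every later pdf() call in the session.
# pdf.options() returns the previous settings, so they can be put back.
old <- pdf.options(encoding = "CP1250")
print(pdf.options()$encoding)
pdf.options(encoding = old$encoding)
```

Whether the warnings disappear still depends on the locale and on the conversions available on the system, so treat this as a starting point rather than a guaranteed fix.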

Some users get away with just setting pdf.options and have no problems with Sweave. So, insert this chunk somewhere in the .Rnw file, before you start plotting:

<<setOptions, echo = FALSE, results = hide>>=
pdf.options(encoding = "CP1250")
@

and later, just do:

<<plotTheFigure, echo = TRUE, fig = TRUE>>=
# I've set echo to TRUE intentionally, to prove my point here
boxplot(mpg ~ cyl, data = mtcars, ylab = "Potrošnja goriva", xlab = "Broj cilindara", main = "Dijagram raspršenja")
@

And the same scenario stands for ggplot2 graphs.
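For anyone who wants to try this without the downloadable files, here is a self-contained sketch. It is a variation, not the exact setup above: the document is a hypothetical minimal report.Rnw written to a temporary directory, and the PDF encoding is set in the calling session before Sweave() starts, so it is in effect from the beginning of the run.

```r
# Hypothetical minimal document. The label uses a \u escape so that the
# .Rnw file itself stays pure ASCII and needs no encoding declaration.
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<plotTheFigure, echo = FALSE, fig = TRUE>>=",
  "boxplot(mpg ~ cyl, data = mtcars, ylab = \"Potro\\u0161nja goriva\")",
  "@",
  "\\end{document}"
)

workdir <- tempfile("sweave-demo-")
dir.create(workdir)
old_wd <- setwd(workdir)
writeLines(rnw, "report.Rnw")

# Set the PDF default before Sweave() runs, then restore it afterwards.
old <- pdf.options(encoding = "CP1250")
utils::Sweave("report.Rnw")
pdf.options(encoding = old$encoding)
setwd(old_wd)

# With the default prefix settings the figure is named <file>-<label>.pdf.
fig <- file.path(workdir, "report-plotTheFigure.pdf")
print(file.exists(fig))
```

Opening the generated figure shows whether the label survived on a given machine. Only the .tex file and the figure are produced here, so no LaTeX installation is needed for the check.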

Some of you will get correct output, but I don't! As I've said before, if you're running Ubuntu there's a good chance that this will work, but for now I can't get it working on Arch.

To save you some keystrokes, you can download the Sweave file and/or the PDF file (produced on the Arch machine). As you can see, localized characters display correctly within the plot function, but get garbled within Sweave. Yet if I save the graph to a PDF file directly (without Sweaving), I get the correct output.

So, I've solved some of the issues, but there's a lot of trial and error left to do.

Please run the .Rnw file on your machine and give me some feedback. To make this easier, I've created an Rscript that collects the system info (not personal info) that I find relevant in this case: here's the source, and here's my output.
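If the linked script is not available, a rough sketch of the kind of information worth reporting might look like this. The function name collect_encoding_info and the choice of fields are assumptions made here; this is not the script linked above.

```r
# Hypothetical helper: gathers the locale and encoding settings that are
# most likely to differ between machines and between ways of running Sweave.
collect_encoding_info <- function() {
  list(
    r_version    = R.version.string,
    platform     = R.version$platform,
    lc_ctype     = Sys.getlocale("LC_CTYPE"),       # character-type locale
    l10n         = l10n_info(),                     # MBCS / UTF-8 / Latin-1 flags
    r_encoding   = getOption("encoding"),           # default for file connections
    pdf_encoding = grDevices::pdf.options()$encoding
  )
}

info <- collect_encoding_info()
str(info)
```

Running it once from the shell with Rscript and once from the session that calls Sweave("report.Rnw") makes it easy to compare the two environments side by side.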

aL3xa answered Sep 22 '22 06:09
