I am using Windows 7, R 2.15.3 and RStudio 0.97.320 with knitr 1.1. I'm not sure what my pandoc
version is, but I downloaded it a couple of days ago.
sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Spanish_Argentina.1252 LC_CTYPE=Spanish_Argentina.1252 LC_MONETARY=Spanish_Argentina.1252
[4] LC_NUMERIC=C LC_TIME=Spanish_Argentina.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.15.3
I would like to get my reports both in HTML and Word, so I'm using markdown and pandoc.
I write in Spanish, with accents on vowels and tildes on the n: á-ú and ñ.
I have read many posts, and I see that problems similar to the one I'm having have been solved with new versions of knitr. But there is one issue I haven't found a solution for.
When I started, I used the 'system default' encoding that appears in the RStudio dialog, i.e. ISO 8859-1, and the RStudio previews worked great. However, when I tried to get Word documents, pandoc choked on the accented vowels. I found a post showing how to solve this using `iconv`:
iconv -t utf-8 "myfile.md" | pandoc -o "myfile.docx"| iconv -f utf-8
While this did solve pandoc's complaints about unrecognized UTF-8 characters, for some reason pandoc stops finding my plots, with an error like this one:
pandoc: Could not find image `figure/Parent.png', skipping...
If I use only non-accented characters, pandoc finds the images with no problems. I looked at the two `.md` files with a hex editor, and I can't see any difference when I compare the sections that handle the figures: `![plot of chunk Parent](figure/Parent.png)`, although obviously the accented characters are completely different. I have verified that the image files do exist in the figure folder.
Anyway, after reading many posts I decided to set RStudio to use UTF-8 encoding. With only one level of files things work great. For example, I can independently knit and then pandoc into Word the following two Rmd files:
Parent - SAVED WITH utf-8 encoding in RStudio
========================================================
u with an accent: "ú" SAVED WITH utf-8 encoding in RStudio
```{r fig.width=7, fig.height=6}
plot(cars, main='Parent ú')
```
and separately:
Child - SAVED WITH utf-8 encoding in RStudio
========================================================
u with an accent: "ú" Child file
```{r fig.width=7, fig.height=6}
plot(cars, main='One File Child ú')
```
and I get two perfect previews in RStudio and two perfect Word documents from pandoc.
The problem arises when I try to call the child part from the parent part. In other words, if I add to the first file the following lines:
```{r CallChild, child='TestUTFChild.Rmd'}
```
then all the accents in the child file become garbled, as if the UTF-8 were being interpreted as ISO 8859-1. Pandoc stops reading the file as well, complaining that it's not UTF-8.
If anybody could point me in the right direction, either:
1. With pandoc not finding the plots if I stay with ISO 8859-1. I have also tried Windows-1252 because it's what I saw in the `sessionInfo` output, but the result is the same.
or
2. With the call to the child file, if UTF-8 is the way to go. I have looked for a way of setting some option to force the encoding in the child call, but I haven't found it yet.
Many thanks!
I think this problem should be fixed in the latest development version. See instructions in the development repository on how to install the devel version. Then you should be able to choose UTF-8 in RStudio, and get a UTF-8 encoded output file.
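Before handing the knitted file to pandoc, it may help to confirm that it really is valid UTF-8. The following is only a minimal sketch, assuming a command-line `iconv` is available (as in the question). The file names and the `is_utf8` helper are hypothetical, and the two stand-in files are created only so the example runs on its own:

```shell
# Stand-in files (hypothetical names): one valid UTF-8, one ISO-8859-1.
printf 'u with an accent: \303\272\n' > good-utf8.md   # "ú" as bytes C3 BA
printf 'u with an accent: \372\n'     > latin1.md      # "ú" as byte FA

# Converting UTF-8 to UTF-8 changes nothing, but iconv exits with a
# non-zero status when the input contains bytes that are not valid UTF-8.
# (is_utf8 is a helper defined here for illustration, not a standard tool.)
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

for f in good-utf8.md latin1.md; do
  if is_utf8 "$f"; then
    echo "$f: valid UTF-8"
  else
    echo "$f: not valid UTF-8"
  fi
done
```

A file that fails this check is likely to trigger the same "not UTF-8" complaint from pandoc that the question describes.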
Just in case anyone is interested in the gory details: the reason for the failure before was that I wrote the child output with the encoding you provided, but did not read it with the same encoding. Now I just avoid writing output files for child documents.
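The write/read mismatch described above can be reproduced outside knitr. This is a rough illustration using command-line `iconv`, not knitr's actual code; the file name is a hypothetical stand-in for the child output:

```shell
# "ú" encoded as UTF-8 is the two bytes 0xC3 0xBA (octal 303 272).
printf '\303\272\n' > child-output.txt   # hypothetical file name

# Reading with the matching encoding: the bytes pass through unchanged.
matched=$(iconv -f UTF-8 -t UTF-8 child-output.txt)

# Reading the same bytes as ISO-8859-1: each byte becomes its own character.
garbled=$(iconv -f ISO-8859-1 -t UTF-8 child-output.txt)

printf 'matched: %s\ngarbled: %s\n' "$matched" "$garbled"
```

In a UTF-8 terminal the first line shows `ú` and the second shows `Ãº`, which is the kind of garbling reported for the child document.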