
Non-english special characters in knitr

I am using knitr 1.1 in R 3.0.0 from within WinEdt (RWinEdt 2.0). I am having problems with knitr recognizing Swedish characters (ä, ö, å). This is not an issue with R itself; those characters are recognized even in file names, directory names, objects, etc. It was not a problem in Sweave either.

I already have \usepackage[utf8]{inputenc} in my document, but knitr does not seem able to handle the special characters. After running knitr, I get the following message:

Warning in remind_sweave(if (in.file) input) :
It seems you are using the Sweave-specific syntax; you may need Sweave2knitr("deskriptiv 130409.Rnw") to convert it to knitr

processing file: deskriptiv 130409.Rnw

(*) NOTE: I saw chunk options "label=läser_in_data"
please go to http://yihui.name/knitr/options (it is likely that you forgot to 
quote "character" options)

Error in parse(text = str_c("alist(", quote_label(params), ")"), srcfile = NULL) : 
1:15: unexpected input
1: alist(label=lä
                 ^
Calls: knit ... parse_params -> withCallingHandlers -> eval -> parse
Execution halted

The particular label it complains about is label=läser. Changing the label is not enough, since knitr even complains if R objects use äåö.

I ran Sweave2knitr() since the file was originally created for Sweave, but the result was no better: now all äåö have been transformed to äpåö, both in the R chunks and in the LaTeX text, and knitr still gives an error message.

Session info:

R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Swedish_Sweden.1252  LC_CTYPE=Swedish_Sweden.1252    LC_MONETARY=Swedish_Sweden.1252
[4] LC_NUMERIC=C                    LC_TIME=Swedish_Sweden.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] knitr_1.1

loaded via a namespace (and not attached):
[1] digest_0.6.3   evaluate_0.4.3 formatR_0.7    stringr_0.6.2  tools_3.0.0

As I mentioned, there are file names and objects with Swedish characters (since that has not been a problem before), and the text also needs to be in Swedish.

Thank you for any help in getting knitr to work outside of English.

user2266041 asked Apr 10 '13 13:04



2 Answers

I think you have to contact the maintainer of the R-Sweave mode in WinEdt if you are using this mode to call knitr. The issue is that WinEdt has to pass the encoding of the file to knit() if you are not using the native encoding of your OS. You mentioned UTF-8, but that is not the native encoding for Windows, so you must not use \usepackage[utf8]{inputenc} unless you are sure your file is UTF-8 encoded.

There are several problems mixed up here, and a single answer is unlikely to solve them all.

The first problem is label=läser, which really should be label='läser', i.e. you must quote all the chunk labels (check the other labels in the document as well). knitr tries to quote your labels automatically when you write <<foo>>= (it is turned into <<'foo'>>=), but this does not work when you use <<label=foo>>= (you have to write <<label='foo'>>= explicitly). This problem is perhaps not the essential one here, though.

I think the real problem here is the file encoding (which is nasty under Windows). You seem to be using UTF-8 on a system that does not use UTF-8 by default. In this case you have to call knit('yourfile.Rnw', encoding = 'UTF-8'), i.e. pass the encoding to knit(). I do not use WinEdt, so I have no idea how to do that there. You can hard-code the encoding in the configurations, but that is not recommended.
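To make the encoding mismatch concrete, here is a small base-R sketch (it does not call knitr) of what can happen when UTF-8 bytes are interpreted as a single-byte Western encoding. It uses latin1 as a stand-in for Windows-1252, on the assumption that the two agree for the bytes involved in ä, å and ö; the variable names are only illustrative.

```r
# The three Swedish letters from the question, written with \u escapes so
# that this script is pure ASCII and does not depend on its own encoding.
original <- "\u00e4\u00e5\u00f6"  # the letters a-umlaut, a-ring, o-umlaut

# The bytes a UTF-8 encoded file would contain for these letters
# (two bytes per letter).
utf8_bytes <- charToRaw(enc2utf8(original))

# Correct reading: the bytes are declared to be UTF-8.
read_as_utf8 <- iconv(rawToChar(utf8_bytes), from = "UTF-8", to = "UTF-8")

# Wrong reading: the same bytes are treated as a single-byte encoding
# (latin1 here, assumed to stand in for Windows-1252), so every byte
# becomes its own character.
read_as_latin1 <- iconv(rawToChar(utf8_bytes), from = "latin1", to = "UTF-8")

print(utf8_bytes)
print(nchar(read_as_utf8))    # number of characters when read correctly
print(nchar(read_as_latin1))  # number of characters when read wrongly
```

The wrongly decoded string has twice as many characters as the original, which is the kind of garbling you see when the encoding is not passed along.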

Two suggestions:

  1. do not use UTF-8 under Windows; use your system native encoding (Windows-1252, I guess) instead;
  2. or use RStudio instead of WinEdt, which can pass the encoding to knitr;

BTW, since the Sweave2knitr() hint popped up, there must be other problems in your Rnw document. There are two ways to diagnose them:

  1. if you use UTF-8, run Sweave2knitr('deskriptiv 130409.Rnw', encoding = 'UTF-8')
  2. if you use the native encoding of your OS, just run Sweave2knitr('deskriptiv 130409.Rnw')
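If you are not sure which of the two cases applies to your file, a rough check is to test whether its bytes form valid UTF-8. The following is only a sketch in base R: guess_rnw_encoding() is a hypothetical helper, not part of knitr, and a pure-ASCII file will be reported as UTF-8 because ASCII is valid in both encodings.

```r
# Hypothetical helper (not a knitr function): report "UTF-8" if the file's
# bytes are valid UTF-8, otherwise assume the native encoding of the OS.
guess_rnw_encoding <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  if (validUTF8(rawToChar(bytes))) "UTF-8" else "native"
}

# Two small demo files containing the word from the chunk label,
# written byte by byte so the result does not depend on the locale.
utf8_file <- tempfile(fileext = ".Rnw")
native_file <- tempfile(fileext = ".Rnw")
writeBin(as.raw(c(0x6c, 0xc3, 0xa4, 0x73, 0x65, 0x72)), utf8_file)  # UTF-8 bytes
writeBin(as.raw(c(0x6c, 0xe4, 0x73, 0x65, 0x72)), native_file)      # single-byte encoding

print(guess_rnw_encoding(utf8_file))
print(guess_rnw_encoding(native_file))

# With knitr installed, the result could then decide which call to use:
# if (guess_rnw_encoding(f) == "UTF-8") Sweave2knitr(f, encoding = "UTF-8") else Sweave2knitr(f)
```

This needs validUTF8(), which is only available in newer versions of R than the 3.0.0 used in the question.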

Please read the documentation if you have questions about the diagnostic information printed out by Sweave2knitr().

Yihui Xie answered Oct 20 '22 22:10


R-Sweave invokes knitr through the knitr.edt macro, which itself uses the code in knitrSweave.R to launch knit. The knit command in this latter script is near the top and reads res <- knit(filename).

Following Yihui's suggestion, you can try to replace this command with

res <- knit(filename, encoding = 'UTF-8')

The knitr.edt and knitrSweave.R files should be in your %b\Contrib\R-Sweave folder, where %b is your WinEdt user folder (something like "C:\Users\userA\AppData\Roaming\WinEdt Team\WinEdt 7" under Win 7).

Currently, I do not know how the encoding could be passed as an argument to avoid this hard-coded solution.

I would suggest avoiding extended characters in file names, which can only be a source of problems. Personally, I never use such names.

Gilbert answered Oct 20 '22 21:10