
Relationship between R Markdown, Knitr, Pandoc, and Bookdown

What is the relationship between the functionality of R Markdown, Knitr, Pandoc, and Bookdown?

Specifically, what is the 'division of labour' between these packages in converting markup documents with embedded R code (e.g. .Rnw or .Rmd) into final outputs (e.g. .pdf or .html)? And if Knitr is used to process R Markdown, what does the rmarkdown package do, and how is it different from the markdown package?

RobinL asked Nov 12 '16 13:11



1 Answer

Pandoc

Pandoc is a document converter. It can convert from a number of different markup formats to many other formats, such as .docx, .pdf, etc.

Pandoc is a command line tool with no GUI. It is an independent piece of software, separate from R. However, it comes bundled with RStudio because rmarkdown relies on it for document conversion.

Pandoc not only converts documents, but it also adds functionality on top of the base markdown language to enable it to support more complex outputs.

R Markdown

R Markdown is based on markdown:

Markdown (markup language)

Markdown is a lightweight markup language with plain text formatting syntax designed so that it can be converted to HTML and many other formats. A markdown file is a plain text file that is typically given the extension .md.

Like other markup languages such as HTML and LaTeX, it is completely independent of R.

There is no clearly defined Markdown standard. This has led to fragmentation as different vendors write their own variants of the language to correct flaws or add missing features.

Markdown (R package)

markdown is an R package which converts Markdown (.md) files into HTML. It is the predecessor of rmarkdown, which offers much more functionality. It is no longer recommended for use.

R Markdown (markup language)

R Markdown is an extension of the markdown syntax that enables R code to be embedded in a document in a way which can later be executed. R Markdown files are plain text files that typically have the file extension .Rmd.

Because they are expected to be processed by the rmarkdown package, it is possible to use Pandoc markdown syntax as part of an R Markdown file. Pandoc markdown is an extension of the original markdown syntax that provides additional functionality like raw HTML/LaTeX and tables.

R Markdown (package)

The R package rmarkdown is a library which processes and converts .Rmd files into a number of different formats.

The core function is rmarkdown::render, which stands on the shoulders of pandoc. This function 'renders the input file to the specified output format using pandoc. If the input requires knitting then knitr::knit is called prior to pandoc.'

The rmarkdown package's aim is simply to provide reasonably good defaults and an R-friendly interface to customize Pandoc options.

The YAML metadata seen at the top of R Markdown files is there specifically to pass options to rmarkdown::render, to guide the build process.

Note that rmarkdown only deals with markdown syntax. If you want to convert a .Rhtml or a .Rnw file, you should use the convenience functions built into Knitr, such as knitr::knit2html and knitr::knit2pdf.
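The order of operations described in that quote (knit first if needed, then hand over to pandoc) can be modelled in a few lines. This is only a toy sketch in Python, not how rmarkdown is implemented: plan_render and KNIT_OUTPUT are hypothetical names invented for the illustration, and the file names are simplified (the real intermediate file may be named differently).

```python
from pathlib import Path

# Hypothetical lookup for this sketch: input types that need knitting,
# mapped to the type of file that knitting produces.
KNIT_OUTPUT = {".Rmd": ".md"}


def plan_render(input_file, output_ext):
    """Return the ordered (tool, input, output) steps for one render."""
    current = Path(input_file)
    steps = []

    # Step 1 (only if needed): execute the embedded code and write plain markdown.
    if current.suffix in KNIT_OUTPUT:
        knitted = current.with_suffix(KNIT_OUTPUT[current.suffix])
        steps.append(("knitr::knit", str(current), str(knitted)))
        current = knitted

    # Step 2 (always): convert the markdown to the requested output format.
    target = current.with_suffix("." + output_ext)
    steps.append(("pandoc", str(current), str(target)))
    return steps


if __name__ == "__main__":
    for step in plan_render("report.Rmd", "html"):
        print(step)
```

In this model a plain .md input skips the knitting step and goes straight to the pandoc step.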

Knitr

Knitr takes a plain text document with embedded code, executes the code and 'knits' the results back into the document.

For example, it converts:

  • An R Markdown (.Rmd) file into a standard markdown file (.md)
  • An .Rnw (Sweave) file into .tex format.
  • An .Rhtml file into html.

The core function is knitr::knit, and by default this will look at the input document and try to guess what type it is - Rnw, Rmd etc.

This core function performs three roles:

  • A source parser, which looks at the input document and detects which parts are code that the user wants to be evaluated.
  • A code evaluator, which evaluates this code.
  • An output renderer, which writes the results of evaluation back to the document in a format which is interpretable by the raw output type. For instance, if the input file is an .Rmd, the output renderer marks up the output of code evaluation in .md format.
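The three roles above can be illustrated with a toy 'knitter'. This is a minimal sketch in Python, not Knitr's actual implementation: toy_knit is a hypothetical name, chunks are assumed to be marked with a {python} header, and only printed output is captured.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # the three-backtick marker that delimits a code chunk

# Source parser: find chunks that open with a {python} header.
CHUNK = re.compile(FENCE + r"\{python\}\n(.*?)\n" + FENCE, re.DOTALL)


def toy_knit(source):
    """Run every chunk in `source` and weave its printed output back in."""
    env = {}  # shared by all chunks, so later chunks see earlier variables

    def run_chunk(match):
        code = match.group(1)

        # Code evaluator: execute the chunk and capture what it prints.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, env)
        output = buffer.getvalue().rstrip("\n")

        # Output renderer: emit plain markdown, i.e. the code as an ordinary
        # code block followed by a block holding its output.
        return (FENCE + "python\n" + code + "\n" + FENCE
                + "\n\n" + FENCE + "\n" + output + "\n" + FENCE)

    return CHUNK.sub(run_chunk, source)


if __name__ == "__main__":
    doc = "Total:\n\n" + FENCE + "{python}\nprint(1 + 2)\n" + FENCE + "\n"
    print(toy_knit(doc))
```

The result contains no executable chunks any more, only ordinary markdown, which is why a separate converter is still needed to turn it into HTML or PDF.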

Converting between document formats

Knitr does not convert between document formats - such as converting a .md into a .html. It does, however, provide some convenience functions to help you use other libraries to do this. If you are using the rmarkdown package, you should ignore this functionality because it has been superseded by rmarkdown::render.

An example is knitr::knit2pdf, which will: 'Knit the input Rnw or Rrst document, and compile to PDF using texi2pdf or rst2pdf'.

A potential source of confusion is knitr::knit2html, which "is a convenience function to knit the input markdown source and call markdown::markdownToHTML to convert the result to HTML." This is now legacy functionality because the markdown package has been superseded by the rmarkdown package. See this note.

Bookdown

The bookdown package is built on top of R Markdown, and inherits the simplicity of the Markdown syntax, as well as the possibility of multiple types of output formats (PDF/HTML/Word/…).

It offers features like multi-page HTML output, numbering and cross-referencing figures/tables/sections/equations, and inserting parts/appendices, and it imports the GitBook style (https://www.gitbook.com) to create elegant and appealing HTML book pages.

RobinL answered Sep 22 '22 19:09
