
How to format a complex table for rmarkdown PDF output

I have a table that I'd like to output in PDF format from an rmarkdown document. However, with my limited LaTeX skills, I can't figure out how to get spanning columns, cell borders, and font faces the way I want them using xtable with various LaTeX additions.

I was able to get pretty much what I wanted using the FlexTable function from the ReporteRs package, but it looks like FlexTable can only be used with rmarkdown to produce HTML output, not PDF output.

So, I'm looking for help with formatting my table using xtable, or any other R package or (possibly custom) R function that can be used for programmatically creating reasonably complex tables for PDF output. Also, if there's some way to coax FlexTable to work with PDF output, that would be great too.

Below I create a table using FlexTable so you can see what I'm aiming for. Following that, I provide a sample rmarkdown document showing where I've gotten so far in my (somewhat lame) efforts to create a similar table using xtable.

ReporteRs::FlexTable version

First, let's create the data that will go into the table:

library(ReporteRs)

x = structure(c(34L, 6L, 9L, 35L), .Dim = c(2L, 2L), .Dimnames = structure(list(
    Actual = c("Fail", "Pass"), Predicted = c("Fail", "Pass")), .Names = c("Actual", 
"Predicted")), class = "table")

x=cbind(x, prop.table(x), prop.table(x, 1), prop.table(x,2))
x[, -c(1,2)] = sapply(x[,-c(1,2)], function(i) paste0(sprintf("%1.1f", i*100),"%"))
x = cbind(Actual=rownames(x), x)

Now for the FlexTable creation and formatting:

# Set up general table properties and formatting
cell_p = cellProperties(padding.right=3, padding.left=3)
par_p = parProperties(text.align="right")

# Create table
ft = FlexTable(x, header.columns=FALSE, body.cell.props=cell_p, body.par.props=par_p)

# Add three header rows
ft = addHeaderRow(ft, text.properties=textBold(), c("","Predicted"),
                  colspan=c(1,8), par.properties=parCenter())

ft = addHeaderRow(ft, text.properties=textBold(), 
                  value=c("", "Count", "Overall\nPercent", "Row\nPercent", "Column\nPercent"),
                  colspan=c(1,rep(2,4)), par.properties=parCenter())

ft = addHeaderRow(ft, text.properties=textItalic(), par.properties=parCenter(),
                  value=colnames(x))

# Format specific cells
ft[1:2, 1, to="header", side="left"] = borderProperties(color="white")
ft[1:2, 1, to="header", side="top"] = borderProperties(color="white")

ft[3, 1, to="header"] = textProperties(font.style="normal", font.weight="bold")
ft[ , 1] = textProperties(font.style="italic")

ft[ , 2:3] = cellProperties(padding.right=7, padding.left=7)
ft[ , 1] = cellProperties(padding.right=10, padding.left=10)

# Display ft
ft

And here's what the final table looks like (this is a PNG screenshot of the table displayed in a browser window):

[Image: the FlexTable output rendered in a browser]

Now for my attempt at doing the same thing with xtable.

xtable version

Here's the rmarkdown document and the header.tex file:

---
title: "Untitled"
author: "eipi10"
date: "11/19/2016"
output: 
  pdf_document:
    fig_caption: yes
    includes:
      in_header: header.tex 
---

```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = FALSE, message=FALSE)
```

```{r}
# Fake confusion matrix to work with
x = structure(c(34L, 6L, 9L, 35L), .Dim = c(2L, 2L), .Dimnames = structure(list(
    Actual = c("Fail", "Pass"), Predicted = c("Fail", "Pass")), .Names = c("Actual", 
"Predicted")), class = "table")

x=cbind(x, prop.table(x), prop.table(x, 1), prop.table(x,2))
x[, -c(1,2)] = sapply(x[,-c(1,2)], function(i) paste0(sprintf("%1.1f", i*100),"%"))
x = cbind(Actual=rownames(x), x)
```  

```{r use_xtable, results="asis"}
# Output the confusion matrix created above as a latex table
library(xtable)
options(xtable.comment=FALSE)

# This is a modified version of a function created in the following SO answer:
# http://stackoverflow.com/a/38978541/496488
make_addtorow <- function(row.name, terms, colSpan, width) {
  # Custom row function
  paste0(row.name, 
  paste0('& \\multicolumn{', colSpan, '}{C{', width, 'cm}}{', 
         terms, 
         '}', 
        collapse=''), 
  '\\\\')
}

addtorow <- list()
addtorow$pos <- list(-1,-1,-1,-1) 
addtorow$command <- c(
  "\\hline",
  make_addtorow("", c("Predicted"), 8, 12),
  "\\hline",
  make_addtorow("", c("Count", "Percent", "Row Percent", "Column Percent"), 2, 3)
  )

xtbl = xtable(x, caption="Created with xtable")

align(xtbl) <- c("|L{0cm}|", "L{1.2cm}|", rep("R{1cm}|",8))

print(xtbl, 
      include.rownames=FALSE, 
      tabular.environment="tabularx", 
      width="0.92\\textwidth",
      add.to.row = addtorow)
```

File header.tex that is used for knitting the rmarkdown document above:

% xtable manual: https://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf
\usepackage{array}
\usepackage{tabularx}  
\newcolumntype{L}[1]{>{\raggedright\let\newline\\
\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{C}[1]{>{\centering\let\newline\\
\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{R}[1]{>{\raggedleft\let\newline\\
\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{P}[1]{>{\raggedright\tabularxbackslash}p{#1}}

% Caption on top
% http://tex.stackexchange.com/a/14862/4762
\usepackage{floatrow}
\floatsetup[figure]{capposition=top}

And here's what the table looks like in the PDF output:

[Image: the xtable output in the PDF]

asked Nov 20 '16 00:11 by eipi10


2 Answers

Quoting this comment:

I'm looking for a way to do this programmatically from within the rmarkdown document without having to hard-code the formatting, so that it's reproducible and flexible.

The following solution uses a hard-coded "template", but the template can be filled with any data (provided it has the same 2x8 structure).

The generated table looks like this:

[Image: the generated table]

Full code below.


Basically, the final table consists of 9 columns, so the basic LaTeX structure is

\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
% rest of table
\end{tabular}

However, it is convenient to fix the width of the cells. This is possible with the custom column type C (taken from https://tex.stackexchange.com/a/12712/37118 on TeX.SE), which allows for centered content with fixed width. This, together with the more compact *{n}{...} syntax for repeating column types, gives:

\begin{tabular}{|c *{8}{|C{1cm}}|}
% rest of table
\end{tabular}

(First column centered with flexible width, then 8 centered columns, each 1cm wide).

The cells spanning multiple columns are possible using \multicolumn. These cells should also have a fixed width in order to have the cell captions break into two lines. Note that it is a fallacy to assume that the cells spanning two 1cm-columns should have a width of 2cm because the two spanned cells have additional padding between them. Some measurement revealed that about 2.436cm delivers good results.
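The 2.436cm figure can also be checked with a little arithmetic. This is a minimal sketch, assuming the defaults of the standard LaTeX classes (\tabcolsep = 6pt, \arrayrulewidth = 0.4pt) and that the array package counts vertical rules toward the table width. Those values are assumptions, not something stated in the answer: a cell spanning two 1cm columns also absorbs the two paddings and the one rule that used to sit between them.

```r
# Assumed LaTeX defaults (standard classes); adjust if your preamble changes them
tabcolsep_pt      <- 6     # padding on each side of a cell
arrayrulewidth_pt <- 0.4   # thickness of a vertical rule
cm_per_pt         <- 2.54 / 72.27  # 72.27pt = 1in = 2.54cm

col_width_cm <- 1  # width of each C{1cm} column

# Two text widths, plus the two paddings and the one rule
# that separated the two columns before they were merged
span_width_cm <- 2 * col_width_cm +
  (2 * tabcolsep_pt + arrayrulewidth_pt) * cm_per_pt

round(span_width_cm, 3)
#> [1] 2.436
```

Under these assumptions the result agrees with the measured value, so the same formula should give a reasonable starting width for other column widths or spans.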

Remark on the first column: Although \multicolumn{1}{...}{...} looks useless at first sight, it is useful for changing the column type (including the left/right borders) for a single cell. I used it to drop the leftmost vertical line in the first two rows.

\cline{x-y} provides horizontal lines that span only the columns x to y.

Taking these pieces together gives:

\begin{tabular}{|c *{8}{|C{1cm}}|} \cline{2-9}
    \multicolumn{1}{c|}{} & \multicolumn{8}{c|}{\textbf{Predicted}} \\ \cline{2-9}
    \multicolumn{1}{c|}{} & \multicolumn{2}{c|}{\textbf{Count}} & \multicolumn{2}{C{2.436cm}|}{\textbf{Overall Percent}} & \multicolumn{2}{C{2.436cm}|}{\textbf{Row \newline Percent}} & \multicolumn{2}{C{2.436cm}|}{\textbf{Column Percent}} \\ \hline
% rest of table
\end{tabular}

Regarding the data, I dropped the last line of the code that generated the sample data to get:

> x <- structure(c(34L, 6L, 9L, 35L), .Dim = c(2L, 2L), .Dimnames = structure(list(Actual = c("Fail", "Pass"), Predicted = c("Fail", "Pass")), .Names = c("Actual", "Predicted")), class = "table")
> x <- cbind(x, prop.table(x), prop.table(x, 1), prop.table(x,2))
> x[, -c(1,2)] <- sapply(x[,-c(1,2)], function(i) paste0(sprintf("%1.1f", i*100),"%"))
> x
     Fail Pass Fail    Pass    Fail    Pass    Fail    Pass   
Fail "34" "9"  "40.5%" "10.7%" "79.1%" "20.9%" "85.0%" "20.5%"
Pass "6"  "35" "7.1%"  "41.7%" "14.6%" "85.4%" "15.0%" "79.5%"

To set the column and row names in italics, apply

colnames(x) <- sprintf("\\emph{%s}", colnames(x)) # highlight colnames
rownames(x) <- sprintf("\\emph{%s}", rownames(x)) # highlight rownames

Then, the following xtable code can be used:

print(xtable(x),
      only.contents = TRUE, 
      comment = FALSE,
      sanitize.colnames.function = identity, 
      sanitize.rownames.function = identity, 
      hline.after = 0:2)

The argument only.contents suppresses the enclosing tabular environment. Assigning the identity function to sanitize.colnames.function and sanitize.rownames.function means "don't sanitize". We need this because column and row names contain special LaTeX characters that should not be escaped (\emph).

The output should replace the %rest of table placeholder from above.
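To see what these arguments change, here is a small sketch that prints the same table twice and inspects the captured LaTeX. The matrix m is a made-up 2x2 stand-in for the real data, used only for illustration.

```r
library(xtable)

# Small stand-in table whose column names already contain LaTeX markup
m <- matrix(c("34", "6", "9", "35"), nrow = 2,
            dimnames = list(c("Fail", "Pass"),
                            c("\\emph{Fail}", "\\emph{Pass}")))

# Default settings: column names are sanitized and the table is wrapped
# in table/tabular environments
escaped <- capture.output(print(xtable(m), comment = FALSE))

# identity leaves the names untouched; only.contents drops the wrapper
raw <- capture.output(
  print(xtable(m),
        only.contents = TRUE,
        comment = FALSE,
        sanitize.colnames.function = identity)
)

# Is the \emph markup still intact in each version?
any(grepl("\\emph{Fail}", escaped, fixed = TRUE))
any(grepl("\\emph{Fail}", raw, fixed = TRUE))

# Does the only.contents version still open a tabular environment?
any(grepl("\\begin{tabular}", raw, fixed = TRUE))
```

With the default sanitizer the backslash and braces get escaped, so the first check should come back FALSE and the second TRUE; the last one should be FALSE because only the rows are emitted.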


Conceptually, the code uses xtable to generate only the table body but not the header because it is much easier to write the header manually.

Although the whole table header is "hard-coded", the data can be changed as required.
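If the hard-coded header ever becomes a nuisance, its rows could be assembled with sprintf instead. The helper below is hypothetical (multicolumn_row is not part of xtable or of the code in this answer) and is only a sketch of the idea: it leaves the first cell empty and wraps each label in a bold \multicolumn cell.

```r
# Hypothetical helper, not part of xtable: build one header row in which
# every label spans `span` columns with the column spec `spec`
multicolumn_row <- function(labels, span, spec) {
  cells <- sprintf("\\multicolumn{%d}{%s}{\\textbf{%s}}", span, spec, labels)
  paste0("\\multicolumn{1}{c|}{} & ",
         paste(cells, collapse = " & "),
         " \\\\")
}

# First header row of the table above
cat(multicolumn_row("Predicted", 8, "c|"), "\n")

# Second header row, with fixed-width cells so the labels can wrap
cat(multicolumn_row(c("Count", "Overall Percent", "Row Percent", "Column Percent"),
                    2, "C{2.436cm}|"), "\n")
```

The \cline and \hline commands between the rows would still have to be added by hand, as in the template.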

Don't forget to escape every \ with a second \ when the LaTeX code is written inside an R string! Also, the following must be added to the header (header.tex):

\usepackage{array}
\newcolumntype{C}[1]{>{\centering\let\newline\\\arraybackslash\hspace{0pt}}m{#1}} % https://tex.stackexchange.com/a/12712/37118
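On the escaping point above, a short sketch of the two ways to write a LaTeX fragment in R. The raw-string syntax r"(...)" assumes R 4.0.0 or later; the answer itself uses doubled backslashes.

```r
# In an ordinary R string, each LaTeX backslash has to be doubled
escaped <- "\\multicolumn{8}{c|}{\\textbf{Predicted}} \\\\"

# Raw string (R >= 4.0.0): backslashes are taken literally
raw <- r"(\multicolumn{8}{c|}{\textbf{Predicted}} \\)"

# Both spellings describe the same characters
identical(escaped, raw)

# cat() shows what actually ends up in the .tex file
cat(escaped, "\n")
```

Raw strings keep the R code visually identical to the LaTeX it produces, which makes long header templates easier to compare against the .tex output kept by keep_tex.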

I wrapped all the elements outlined above in a function PrintConfusionMatrix that can be reused with any 2x8 data frame providing the data and column / row names.


Full code:

---
output:
  pdf_document: 
    keep_tex: yes
    includes:
      in_header: header.tex
---


```{r, echo = FALSE}
library(xtable)

# Sample data from question
x <- structure(c(34L, 6L, 9L, 35L), .Dim = c(2L, 2L), .Dimnames = structure(list(Actual = c("Fail", "Pass"), Predicted = c("Fail", "Pass")), .Names = c("Actual", "Predicted")), class = "table")
x <- cbind(x, prop.table(x), prop.table(x, 1), prop.table(x,2))
x[, -c(1,2)] <- sapply(x[,-c(1,2)], function(i) paste0(sprintf("%1.1f", i*100),"%"))
#x <- cbind(Actual=rownames(x), x) # dropped; better not to add row names to data

PrintConfusionMatrix <- function(data, ...) {

  # Use the function argument, not the global x; the template assumes 2x8 data
  stopifnot(all(dim(data) == c(2, 8)))

  colnames(data) <- sprintf("\\emph{%s}", colnames(data)) # highlight colnames
  rownames(data) <- sprintf("\\emph{%s}", rownames(data)) # highlight rownames

  cat('\\begin{tabular}{|c *{8}{|C{1cm}}|} \\cline{2-9}
    \\multicolumn{1}{c|}{} & \\multicolumn{8}{c|}{\\textbf{Predicted}} \\\\ \\cline{2-9}
    \\multicolumn{1}{c|}{} & \\multicolumn{2}{c|}{\\textbf{Count}} & \\multicolumn{2}{C{2.436cm}|}{\\textbf{Overall Percent}} & \\multicolumn{2}{C{2.436cm}|}{\\textbf{Row \\newline Percent}} & \\multicolumn{2}{C{2.436cm}|}{\\textbf{Column Percent}} \\\\ \\hline
    \\textbf{Actual} ')

  print(xtable(data),
        only.contents = TRUE, 
        comment = FALSE,
        sanitize.colnames.function = identity, 
        sanitize.rownames.function = identity, 
        hline.after = 0:2,
        ...)
  cat("\\end{tabular}")
}
```

```{r, results='asis'}
PrintConfusionMatrix(x)
```
answered Oct 06 '22 01:10 by CL.


This is very simple to do using the add_header_above function from the kableExtra package. You can add as many column groupings as you want. Here is what I would do:

library(knitr)
library(kableExtra)

d <- mtcars[1:5, 1:5]
kable(d, longtable = TRUE, booktabs = TRUE) %>%
   add_header_above(c(" ", "Group 1" = 2, "Group 2" = 3)) %>%
   add_header_above(c(" ", "Groups" = 5))

[Image: the resulting table with two levels of grouped headers]
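Applied to the confusion matrix from the question, the same approach might look like the sketch below. It assumes the 2x8 matrix without the extra "Actual" column (so the row names become the first column) and requests LaTeX output explicitly; it will not reproduce the cell borders of the FlexTable version.

```r
library(knitr)
library(kableExtra)

# Confusion matrix from the question, without the extra "Actual" column
x <- structure(c(34L, 6L, 9L, 35L), .Dim = c(2L, 2L),
               .Dimnames = list(Actual = c("Fail", "Pass"),
                                Predicted = c("Fail", "Pass")),
               class = "table")
x <- cbind(x, prop.table(x), prop.table(x, 1), prop.table(x, 2))
x[, -c(1, 2)] <- sprintf("%1.1f%%", x[, -c(1, 2)] * 100)

# Row names take one column, so each header row must add up to 1 + 8 = 9
tbl <- kable(x, format = "latex", booktabs = TRUE) %>%
  add_header_above(c(" " = 1, "Count" = 2, "Overall Percent" = 2,
                     "Row Percent" = 2, "Column Percent" = 2)) %>%
  add_header_above(c(" " = 1, "Predicted" = 8))

tbl
```

In an rmarkdown chunk the table prints directly; kableExtra should register the LaTeX packages it needs (such as booktabs) when the document is knitted to PDF.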

answered Oct 06 '22 02:10 by Ruben Berge Mathisen