Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

xtable for conditional cell formatting significant p-values of table

I'm using xtable to generate tables to put in Latex, and was wondering if there's a way to have conditional formatting of cells so that all significant p-values are in grey? I'm using Knitr in TexShop.

Here's an example using the diamonds data in ggplot2, and running a TukeyHSD test to predict carat from cut.

library(ggplot2)
library(xtable)
summary(data.aov <- aov(carat~cut, data = diamonds))
data.hsd<-TukeyHSD(data.aov)
data.hsd.result<-data.frame(data.hsd$cut)
data.hsd.result

I can then get data.hsd.result into xtable format with:

xtable(data.hsd.result)

In Latex, the output looks like this:

                         diff         lwr         upr        p.adj
Good-Fair         -0.19695197 -0.23342631 -0.16047764 0.000000e+00
Very Good-Fair    -0.23975525 -0.27344709 -0.20606342 0.000000e+00
Premium-Fair      -0.15418175 -0.18762721 -0.12073628 0.000000e+00
Ideal-Fair        -0.34329965 -0.37610961 -0.31048970 0.000000e+00
Very Good-Good    -0.04280328 -0.06430194 -0.02130461 5.585171e-07
Premium-Good       0.04277023  0.02165976  0.06388070 3.256208e-07
Ideal-Good        -0.14634768 -0.16643613 -0.12625923 0.000000e+00
Premium-Very Good  0.08557350  0.06974902  0.10139799 0.000000e+00
Ideal-Very Good   -0.10354440 -0.11797729 -0.08911151 0.000000e+00
Ideal-Premium     -0.18911791 -0.20296592 -0.17526989 0.000000e+00

It it possible to have any p-values < 0.05 to have a grey coloured background automatically or highlighted in some way? Obviously, for this set it would be the whole column, but I'm hoping for something that works with all my data.

like image 913
dmt Avatar asked Jul 03 '14 12:07

dmt


2 Answers

Hello try this :

\documentclass{article}
\usepackage{color}
\begin{document}

<<echo=FALSE, results='asis'>>=
df = data.frame(V1 = LETTERS[1:6], V2 = runif(6, 0, 1))
df$V3 = ifelse(df$V2 < 0.5, paste0("\\colorbox{red}{", df$V2, "}"), df$V2)
library(xtable)
print(xtable(df), sanitize.text.function = function(x) x)
@

\end{document}

EDIT

If you have multiple conditions, one solution is to use package dplyr and function case_when :

set.seed(123)
df <- data.frame(V1 = LETTERS[1:6], V2 = runif(6, 0, 1))

library("dplyr")
df %>% 
  mutate(
    V3 = case_when(
      V2 < 0.5 ~ paste0("\\colorbox{red}{", round(V2, 3), "}"),
      V2 >= 0.5 & V2 < 0.8 ~ paste0("\\colorbox{blue}{", round(V2, 3), "}"),
      TRUE ~ formatC(V2, digits = 3)
    )
  )
#   V1        V2                      V3
# 1  A 0.2875775  \\colorbox{red}{0.288}
# 2  B 0.7883051 \\colorbox{blue}{0.788}
# 3  C 0.4089769  \\colorbox{red}{0.409}
# 4  D 0.8830174                   0.883
# 5  E 0.9404673                    0.94
# 6  F 0.0455565  \\colorbox{red}{0.046}
like image 175
Victorp Avatar answered Sep 29 '22 06:09

Victorp


Victorp provide an excellent solution and it gave me such a relief from a hours long struggle. Then later the day I need to impose more than one conditions on same data set, meaning I need two different colors on cells based on different conditions, to solve this, totally based on Victorp's answer, I figured a solution and hope this would help those need this in the future.

    <<echo=FALSE, results='asis'>>=
    df = data.frame(V1 = LETTERS[1:6], V2 = runif(6, 0, 1),V3 = runif(6, 0, 1))
    ## replicate the data frame of which you are going to highlight the cells
    ## the number of duplicates should be equal to number of conditions you want to impose
    temp.1<-df
    temp.2<-df
    ## impose conditions on those temporary data frame separately.
    ## change the columns you want to 
    for (i in colnames(temp.1)[2:3]) {
    temp.1[,i]= ifelse(temp.1[,i] <= 0.5,
                                paste0("\\colorbox{red}{", temp.1[,i], "}"), temp.1[,i])}
    rm(i)


    for (i in colnames(temp.2)[2]) {
    temp.2[,i]= ifelse(temp.2[,i] > 0.5 & temp.2[,i] <=0.8,
                                paste0("\\colorbox{blue}{", temp.2[,i], "}"),temp.2[,i])}
    rm(i)
    ## then record the position of cells under you conditions
    pos.1<-which(df[,] <=0.5,arr.ind = TRUE)
    pos.2<-which(df[,] >0.5 & df[,]<=0.8,arr.ind = TRUE)
    ## replace cells in original data frame that you want to highlight
    ## replace those values in temp which satisfy the condition imposed on temp.1
    if(length(pos.1)>0) {
      temp[pos.1]<-temp.1[pos.1]
    }


    ## replace those values in temp which satisfy the condition imposed on temp.2
    if(length(pos.2)>0) {
      temp[pos.2]<-temp.2[pos.2]
    }
    rm(temp.1,temp.2,pos.1,pos.2)
    @

then you print df in the way you like. This works, however,given the power of R, I do believe there should be much more easier ways for this.

like image 36
Jason Goal Avatar answered Sep 29 '22 04:09

Jason Goal