
Create summary table of categorical variables of different lengths

In SPSS it is fairly easy to create a summary table of categorical variables using "Custom Tables":

This example is from SPSS

How can I do this in R?

General, extensible solutions are preferred, especially ones using the plyr and/or reshape2 packages, because I am trying to learn those.

Example data (mtcars ships with R):

library(plyr)  # provides colwise()
df <- colwise(as.factor)(mtcars[, 8:11])

P.S.

Please note that my goal is to get everything into one table, as in the picture. I have been struggling for many hours, but my attempts have been so poor that posting the code probably would not make the question any clearer.

Rene Bern asked Feb 27 '13 12:02



2 Answers

One way to get the output, but not the formatting:

library(plyr)
# For each column: stack the level names, counts, and percentages as rows, then transpose
ldply(mtcars[, 8:11], function(x) t(rbind(names(table(x)), table(x), paste0(prop.table(table(x)) * 100, "%"))))
    .id 1  2       3
1    vs 0 18  56.25%
2    vs 1 14  43.75%
3    am 0 19 59.375%
4    am 1 13 40.625%
5  gear 3 15 46.875%
6  gear 4 12   37.5%
7  gear 5  5 15.625%
8  carb 1  7 21.875%
9  carb 2 10  31.25%
10 carb 3  3  9.375%
11 carb 4 10  31.25%
12 carb 6  1  3.125%
13 carb 8  1  3.125%
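
The matrix approach above coerces every cell to character and leaves the columns named 1, 2 and 3. A minimal sketch of a variant, assuming you want named columns and numeric counts, is to return a data frame from each ldply() call instead. The helper name summarise_levels and the column names are illustrative and not part of plyr:

```r
library(plyr)

# Hypothetical helper: summarise one categorical vector as a data frame
# of levels, counts, and percentages (rounded to one decimal place).
summarise_levels <- function(x) {
  tab <- table(x)
  data.frame(
    Value   = names(tab),
    Count   = as.vector(tab),
    Percent = round(100 * as.vector(prop.table(tab)), 1),
    stringsAsFactors = FALSE
  )
}

# ldply() applies the helper to each column and row-binds the pieces,
# recording the source column name in .id.
res <- ldply(mtcars[, 8:11], summarise_levels)
res
```

Because Count and Percent stay numeric, the result can be sorted, filtered, or reshaped further before any display formatting is applied.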
James answered Sep 28 '22 02:09


A base R solution using lapply() and do.call() with rbind() to stitch together the pieces:

x <- lapply(mtcars[, c("vs", "am", "gear", "carb")], table)

neat.table <- function(x, name){
  xx <- data.frame(x)
  names(xx) <- c("Value", "Count")
  xx$Fraction <- with(xx, Count/sum(Count))
  data.frame(Variable = name, xx)
}

do.call(rbind, lapply(seq_along(x), function(i) neat.table(x[[i]], names(x)[i])))

Results in:

   Variable Value Count Fraction
1        vs     0    18  0.56250
2        vs     1    14  0.43750
3        am     0    19  0.59375
4        am     1    13  0.40625
5      gear     3    15  0.46875
6      gear     4    12  0.37500
7      gear     5     5  0.15625
8      carb     1     7  0.21875
9      carb     2    10  0.31250
10     carb     3     3  0.09375
11     carb     4    10  0.31250
12     carb     6     1  0.03125
13     carb     8     1  0.03125

The rest is formatting.
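
As one possible take on that formatting step, here is a sketch in base R. It rebuilds the same summary, turns the fractions into percentage strings, and blanks out repeated variable names so each variable is printed only once, roughly in the spirit of the SPSS layout. The names summary_tab and out are illustrative, and the layout is an assumption about what the asker wants:

```r
# Rebuild the summary table (same idea as above, written compactly).
vars <- c("vs", "am", "gear", "carb")
pieces <- lapply(vars, function(v) {
  tab <- table(mtcars[[v]])
  data.frame(
    Variable = v,
    Value    = names(tab),
    Count    = as.vector(tab),
    Fraction = as.vector(tab) / sum(tab),
    stringsAsFactors = FALSE
  )
})
summary_tab <- do.call(rbind, pieces)

# Formatting: percentages as text, and each variable name shown once.
out <- summary_tab
out$Percent  <- sprintf("%.1f%%", 100 * out$Fraction)
out$Fraction <- NULL
out$Variable[duplicated(out$Variable)] <- ""

print(out, row.names = FALSE)
```

Keeping the numeric table (summary_tab) separate from the display copy (out) means the counts and fractions remain available for further calculation.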

Andrie answered Sep 28 '22 01:09