
R: Quickest way to summarize number of observations for multiple variables

Tags: r, dplyr, summary

I am sure this is a super simple thing, but I cannot find a really quick and easy solution.

I have patient data with a lot of columns in a format like this:

patID   disease   category ...
1       1          A
2       0          B
3       1          C
4       1          B

How can I quickly produce a summary table that shows, for every column/variable in the dataframe, the number of observations for each of its values? The result should be something like this:

VARIABLE     Number of rows
disease:1    3
disease:0    1
category:A   1
category:B   2
category:C   1
...

I know I can do this for a single variable by just using table(data$column). But how can I produce something similar for all columns in a dataframe?

tholor asked Feb 09 '23 13:02


1 Answer

Using tidyr and dplyr:

library(tidyr)
library(dplyr)
gather(data, variable, value, -patID) %>%  # stack all columns except patID
  count(variable, value)                   # count each variable/value pair

(Thanks @Frank for reminding me about tally and count.)
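
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. It is a variation, not the code above: it uses pivot_longer(), since gather() is now marked as superseded in tidyr. The sample data frame is an assumption that simply mirrors the table in the question, and the `counts` name is only for illustration. Converting the columns to character first is also an assumption on my part, made so that the numeric `disease` column and the character `category` column can share one `value` column.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample data mirroring the question's table
# (the real data has more columns)
data <- data.frame(
  patID    = 1:4,
  disease  = c(1, 0, 1, 1),
  category = c("A", "B", "C", "B")
)

counts <- data %>%
  # Make every column except patID character so mixed types fit in one column
  mutate(across(-patID, as.character)) %>%
  # Long form: one row per patient/variable pair
  pivot_longer(-patID, names_to = "variable", values_to = "value") %>%
  # Number of rows for each variable/value combination
  count(variable, value)

print(counts)
```

The result has one row per variable/value pair with the count in column `n`, which should match the numbers in the question (for example, 3 rows with disease = 1 and 2 rows with category = B).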

Nick Kennedy answered Feb 12 '23 03:02