Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Spread columns by count in R dplyr

I have a factor column. I would like to spread into one column for each factor and then fill the gaps by the count of that factor appears for each id. Suppose we have:

car <- c("a","b","b","b","c","c","a","b","b","b","c","c")
type <- c("good", "regular", "bad","good", "regular", "bad","good", "regular", "bad","good", "regular", "bad")
car_type <- data.frame(car,type)

and get:

   car    type
1    a    good
2    b regular
3    b     bad
4    b    good
5    c regular
6    c     bad
7    a    good
8    b regular
9    b     bad
10   b    good
11   c regular
12   c     bad

I want this:

> results
  car good regular bad
1   a    2       0   0
2   b    2       2   2
3   c    0       2   2

I try this using dplyr, but I'm not really use to it, so It doesn't work.

car_type %>%
  select(car, type) %>%
  group_by(car) %>%
  mutate(seq = unique(type)) %>%
  spread(seq, type)

I would thanks any help.

like image 958
Xbel Avatar asked Nov 06 '16 20:11

Xbel


People also ask

What is spread () in R?

The spread() function can be used to create four additional columns from the stat column's four distinct values. library(tidyr) library(tidyr)

Is count a Dplyr function?

count() function in dplyr can be used to count observations by multiple groups. Here is an example, where we count observations by two variables. We get number of observations for each combinations of the two variables.


2 Answers

Update for tidyr::pivot_wider:

library(tidyverse)

car_type %>% 
  count(car, type) %>% 
  pivot_wider(names_from=type, values_from=n, values_fill=0)

Original Answer

With reshape2:

library(reshape2)

dcast(car_type, car ~ type)

If you were going to use dplyr, the code would be:

dplyr and reshape2

car_type %>% count(car, type) %>%
  dcast(car ~ type, fill=0)

dplyr and tidyr

car_type %>% count(car, type) %>%
  spread(type, n, fill=0)

In each case, count(car, type) is equivalent to

group_by(car, type) %>% tally

or

group_by(car, type) %>% summarise(n=n())

With data.table

library(data.table)

dcast(setDT(car_type), car ~ type, fill=0)
like image 191
eipi10 Avatar answered Oct 05 '22 04:10

eipi10


Try this in base R:

xtabs(~car+type, car_type)

#   type
#car bad good regular
#  a   0    2       0
#  b   2    2       2
#  c   2    0       2

OR

table(car_type)
like image 45
989 Avatar answered Oct 05 '22 05:10

989