 

dcast 2 columns

Tags: r

I have the following data.frame:

group <- sample(c("egyptian", "american", "irish", "australian"), 50, TRUE)
E <- c(rnorm(50, 5, 6))
F <- c(rnorm(50, 7.8, 4.5))
G <- c(rnorm(50, 65, 16.7))
test <- data.frame(group=group, E=E, F=F, G=G)

My goal is to generate a data.frame that includes group as a header and lists its corresponding values in E below.

Something like this data.frame:

egyptian <- c(rnorm(50, 5, 6))
american <- c(rnorm(50, 5, 6))
irish <- c(rnorm(50, 5, 6))
australian <- c(rnorm(50, 5, 6))
test <- data.frame(egyptian=egyptian, american=american,
                   irish=irish, australian=australian)

I tried to subset the 2 columns and then use dcast, but it failed. Is it possible to dcast 2 columns from long to wide?

Samehmagd asked Oct 19 '22 17:10



1 Answer

As @jbaums mentioned in the comments, the groups are not all the same size, so the wide columns cannot all be filled to the same length.

  table(test$group)
  #   american australian   egyptian      irish
  #          7         18          9         16

It is also better to set a seed so that the example is reproducible, i.e.

  set.seed(1)
  group <- sample(c("egyptian", "american", ....)
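To illustrate the point about seeds, here is a small sketch using the group labels from the question. The names `groups`, `first` and `second` are made up for this illustration. Resetting the same seed before each call to sample gives the same draw, and the group counts still come out unequal in general.

```r
# Group labels taken from the question
groups <- c("egyptian", "american", "irish", "australian")

# Same seed before each draw, so both draws are identical
set.seed(1)
first <- sample(groups, 50, TRUE)
set.seed(1)
second <- sample(groups, 50, TRUE)

identical(first, second)  # TRUE

# Group sizes add up to 50 but are generally not equal
table(first)
```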

To transform the initial input into the expected output (based on the "E" column), we need to create a sequence within each level of the grouping variable ("group"), so that every value has a row index inside its group:

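library(reshape2)
# Number the rows within each group (1, 2, ... per group)
test$ind <- with(test, ave(seq_along(group), group, FUN=seq_along))
# One column per group, one row per within-group index, values taken from E
dcast(test, ind~group, value.var='E')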
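To see what the index column does, here is a minimal sketch on a tiny made-up data frame. `toy` and `wide` are hypothetical names and this is not the question's data. The dcast call is guarded, on the assumption that reshape2 may not be installed.

```r
# Hypothetical toy data: group "a" has 2 rows, group "b" has 3
toy <- data.frame(group = c("a", "b", "a", "b", "b"),
                  E = c(1.5, 2.5, 3.5, 4.5, 5.5))

# Running index within each group, in original row order
toy$ind <- with(toy, ave(seq_along(group), group, FUN = seq_along))
toy$ind  # 1 1 2 2 3

# Spread to wide: one column per group, one row per index
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide <- reshape2::dcast(toy, ind ~ group, value.var = "E")
  print(wide)  # the cell for ind 3 in column "a" is NA
}
```

Without the index, dcast has no way to tell the rows inside a group apart, so it would have to aggregate them.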

Another option, in base R, is xtabs:

xtabs(E~ind+group, test)

Note, however, that xtabs fills the missing combinations with 0. dcast gives NA for the missing combinations by default, which can be changed with the fill argument.
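The difference in padding can be checked on a small made-up data frame. `toy`, `xt`, `wide_na` and `wide_zero` are hypothetical names. This is only a sketch, and the dcast part runs only if reshape2 is available.

```r
# Hypothetical toy data: group "a" has no third value
toy <- data.frame(group = c("a", "b", "a", "b", "b"),
                  E = c(1.5, 2.5, 3.5, 4.5, 5.5))
toy$ind <- with(toy, ave(seq_along(group), group, FUN = seq_along))

# xtabs sums E in each ind/group cell; the empty cell (ind 3, group "a") becomes 0
xt <- xtabs(E ~ ind + group, toy)
xt

# dcast leaves the empty cell as NA unless fill is supplied
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide_na   <- reshape2::dcast(toy, ind ~ group, value.var = "E")
  wide_zero <- reshape2::dcast(toy, ind ~ group, value.var = "E", fill = 0)
  print(wide_na)
  print(wide_zero)
}
```

If a real 0 is a plausible value in E, keeping NA is probably the safer choice, since a padded 0 cannot be told apart from an observed one.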

akrun answered Oct 26 '22 23:10
