
dcast without ID variables

In "An Introduction to reshape2", Sean C. Anderson presents the following example.

He uses the airquality data and converts the column names to lower case:

names(airquality) <- tolower(names(airquality))

The data look like

#   ozone solar.r wind temp month day
# 1    41     190  7.4   67     5   1
# 2    36     118  8.0   72     5   2
# 3    12     149 12.6   74     5   3
# 4    18     313 11.5   62     5   4
# 5    NA      NA 14.3   56     5   5
# 6    28      NA 14.9   66     5   6

Then he melts them by

library(reshape2)
aql <- melt(airquality, id.vars = c("month", "day"))

to get

#   month day variable value
# 1     5   1    ozone    41
# 2     5   2    ozone    36
# 3     5   3    ozone    12
# 4     5   4    ozone    18
# 5     5   5    ozone    NA
# 6     5   6    ozone    28

Finally, he recovers the original data (with a different column order) by

aqw <- dcast(aql, month + day ~ variable)

My Question

Assume now that we do not have ID variables (i.e. month and day) and have melted the data as follows

aql <- melt(airquality)

which looks like

#   variable value
# 1    ozone    41
# 2    ozone    36
# 3    ozone    12
# 4    ozone    18
# 5    ozone    NA
# 6    ozone    28

How can I get the original data back from this? The result I am after would correspond to

#   ozone solar.r wind temp 
# 1    41     190  7.4   67 
# 2    36     118  8.0   72 
# 3    12     149 12.6   74
# 4    18     313 11.5   62 
# 5    NA      NA 14.3   56
# 6    28      NA 14.9   66
asked Jul 06 '15 04:07 by conighion



2 Answers

Another option is unstack

out <- unstack(aql,value~variable)
head(out)
#   ozone solar.r wind temp month day
#1    41     190  7.4   67     5   1
#2    36     118  8.0   72     5   2
#3    12     149 12.6   74     5   3
#4    18     313 11.5   62     5   4
#5    NA      NA 14.3   56     5   5
#6    28      NA 14.9   66     5   6
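
To check that unstack really recovers the wide table, here is a minimal self-contained sketch. It assumes the long table has the two columns variable and value that melt produces. To avoid depending on reshape2, it builds that table by hand in base R (the names aq and aql are only illustrative), which should have the same layout as melt(airquality).

```r
# Lower-case copy of the built-in airquality data, as in the question
aq <- airquality
names(aq) <- tolower(names(aq))

# Hand-built stand-in for melt(aq): one block of nrow(aq) rows per column
aql <- data.frame(
  variable = factor(rep(names(aq), each = nrow(aq)), levels = names(aq)),
  value    = unlist(aq, use.names = FALSE)
)

# unstack() splits `value` by `variable` and binds the pieces as columns;
# a data frame comes back only because every variable has the same number of rows
out <- unstack(aql, value ~ variable)

# Compare values with the original (integer columns come back as doubles)
all.equal(out, aq, check.attributes = FALSE)
```

Note that this relies on the rows of each variable still being in their original order; the long table itself carries no information about which value belonged to which row.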

As the question is about dcast, we can create a sequence column and then use dcast

aql$indx <- with(aql, ave(seq_along(variable), variable, FUN=seq_along))
out1 <- dcast(aql, indx~variable, value.var='value')[,-1]
head(out1)
#   ozone solar.r wind temp month day
#1    41     190  7.4   67     5   1
#2    36     118  8.0   72     5   2
#3    12     149 12.6   74     5   3
#4    18     313 11.5   62     5   4
#5    NA      NA 14.3   56     5   5
#6    28      NA 14.9   66     5   6
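
The sequence column matters because dcast needs the left-hand side of the formula to identify each output row. With only variable and value available, every cell would receive many values and reshape2 would fall back to an aggregation function (length, by default). The sketch below shows what the ave() call produces. It uses a hand-built stand-in for the melted data so that it runs in base R, and the dcast step is guarded because it assumes reshape2 is installed.

```r
aq <- airquality
names(aq) <- tolower(names(aq))

# Stand-in for melt(aq) built in base R (assumed to have the same layout)
aql <- data.frame(
  variable = factor(rep(names(aq), each = nrow(aq)), levels = names(aq)),
  value    = unlist(aq, use.names = FALSE)
)

# Running counter within each variable: 1, 2, ..., nrow(aq) for every block
aql$indx <- with(aql, ave(seq_along(variable), variable, FUN = seq_along))

# 0 here means every (indx, variable) pair occurs exactly once,
# so the cast has one value per cell and needs no aggregation
anyDuplicated(aql[c("indx", "variable")])

# The cast itself; skipped when reshape2 is not available
if (requireNamespace("reshape2", quietly = TRUE)) {
  out1 <- reshape2::dcast(aql, indx ~ variable, value.var = "value")[, -1]
  print(head(out1))
}
```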

If you are using data.table, the devel version (v1.9.5) also provides a dcast function. Instructions to install the devel version are here

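library(data.table) # v1.9.5+
# add a running index within each variable, by reference
setDT(aql)[, indx := 1:.N, by = variable]
# with = FALSE makes -1 drop the first column (indx) rather than be evaluated as an expression
dcast(aql, indx ~ variable, value.var = 'value')[, -1, with = FALSE]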
answered Sep 30 '22 15:09 by akrun



One option is to use split:

out <- data.frame(sapply(split(aql, aql$variable), `[[`, 2))

Here, the data is split by the variable column, and then the second column (value) of each group is combined back into a data frame (the [[ function with the argument 2 is passed to sapply, which applies it to each group).

head(out)
#   ozone solar.r wind temp month day
# 1    41     190  7.4   67     5   1
# 2    36     118  8.0   72     5   2
# 3    12     149 12.6   74     5   3
# 4    18     313 11.5   62     5   4
# 5    NA      NA 14.3   56     5   5
# 6    28      NA 14.9   66     5   6
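
To see what each step contributes, here is a small sketch on a hypothetical table with three rows per group (the long data frame and its values are made up for illustration).

```r
# Hypothetical long table with two variables of three values each
long <- data.frame(
  variable = factor(rep(c("a", "b"), each = 3)),
  value    = c(1, 2, 3, 10, 20, 30)
)

# split() returns a named list of data frames, one per level of `variable`
pieces <- split(long, long$variable)
names(pieces)

# `[[` with argument 2 extracts the second column (value) from each piece;
# sapply() simplifies the equal-length results into a matrix, one column per group
m <- sapply(pieces, `[[`, 2)

# data.frame() turns the matrix columns into data frame columns
wide <- data.frame(m)
wide
```

The simplification to a matrix only happens when all groups have the same length; with unequal group sizes sapply would return a list instead, and the data.frame() step would no longer give one column per variable.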
answered Sep 30 '22 15:09 by Rorschach
