In the "An Introduction to reshape2" package Sean C. Anderson presents the following example.
He uses the airquality data and renames the column names
names(airquality) <- tolower(names(airquality))
The data look like
# ozone solar.r wind temp month day
# 1 41 190 7.4 67 5 1
# 2 36 118 8.0 72 5 2
# 3 12 149 12.6 74 5 3
# 4 18 313 11.5 62 5 4
# 5 NA NA 14.3 56 5 5
# 6 28 NA 14.9 66 5 6
Then he melts them by
aql <- melt(airquality, id.vars = c("month", "day"))
to get
# month day variable value
# 1 5 1 ozone 41
# 2 5 2 ozone 36
# 3 5 3 ozone 12
# 4 5 4 ozone 18
# 5 5 5 ozone NA
# 6 5 6 ozone 28
Finally he gets the original one (different column order) by
aqw <- dcast(aql, month + day ~ variable)
Assume now that we do not have ID variables (i.e. month and day) and have melted the data as follows
aql <- melt(airquality)
which look like
# variable value
# 1 ozone 41
# 2 ozone 36
# 3 ozone 12
# 4 ozone 18
# 5 ozone NA
# 6 ozone 28
My question is how can I get the original ones? The original ones would correspond to
# ozone solar.r wind temp
# 1 41 190 7.4 67
# 2 36 118 8.0 72
# 3 12 149 12.6 74
# 4 18 313 11.5 62
# 5 NA NA 14.3 56
# 6 28 NA 14.9 66
Another option is unstack
out <- unstack(aql,value~variable)
head(out)
# ozone solar.r wind temp month day
#1 41 190 7.4 67 5 1
#2 36 118 8.0 72 5 2
#3 12 149 12.6 74 5 3
#4 18 313 11.5 62 5 4
#5 NA NA 14.3 56 5 5
#6 28 NA 14.9 66 5 6
As the question is about dcast
, we can create a sequence column and then use dcast
aql$indx <- with(aql, ave(seq_along(variable), variable, FUN=seq_along))
out1 <- dcast(aql, indx~variable, value.var='value')[,-1]
head(out1)
# ozone solar.r wind temp month day
#1 41 190 7.4 67 5 1
#2 36 118 8.0 72 5 2
#3 12 149 12.6 74 5 3
#4 18 313 11.5 62 5 4
#5 NA NA 14.3 56 5 5
#6 28 NA 14.9 66 5 6
If you are using data.table
, the devel version of data.table
ie. v1.9.5
also has dcast
function. Instructions to install the devel version are here
library(data.table)#v1.9.5+
setDT(aql)[, indx:=1:.N, variable]
dcast(aql, indx~variable, value.var='value')[,-1]
One option using split
,
out <- data.frame(sapply(split(aql, aql$variable), `[[`, 2))
Here, the data is split by the variable
column, then the second column of each group is combined back into a data frame (the [[
function with the argument 2
is passed to sapply
)
head(out)
# Ozone Solar.R Wind Temp Month Day
# 1 41 190 7.4 67 5 1
# 2 36 118 8.0 72 5 2
# 3 12 149 12.6 74 5 3
# 4 18 313 11.5 62 5 4
# 5 NA NA 14.3 56 5 5
# 6 28 NA 14.9 66 5 6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With