Suppose I've got the following data frame:
d <- data.frame(id=c(1,1,1,2,2,3,3,3), time=c(1,2,3,1,2,1,2,3), var=runif(8))
d
  id time       var
1  1    1 0.3733586
2  1    2 0.5743769
3  1    3 0.8253280
4  2    1 0.8136957
5  2    2 0.8726963
6  3    1 0.1105549
7  3    2 0.9527002
8  3    3 0.5690021
With the base reshape function, I can transform it to a "wide" format by specifying an idvar (which identifies rows belonging to the same unit) and a timevar (which identifies different observations of the same unit):
reshape(d, idvar="id", timevar="time", direction="wide")
  id     var.1     var.2     var.3
1  1 0.3733586 0.5743769 0.8253280
4  2 0.8136957 0.8726963        NA
6  3 0.1105549 0.9527002 0.5690021
I've tried to do it with the dcast function of reshape2, but didn't find a way. Do you know if it is possible?
EDIT: Ananda Mahto's comment and answer are perfectly right; the real question was how to cast the original data frame when it has several var columns. My example was not appropriate, sorry.
reshape2 is a package that makes it easy to transform data between layouts. Many of us are used to seeing data structured so that each row corresponds to a single participant and each column corresponds to a variable. This type of data structure is known as wide format.
The melt() function comes from the reshape2 package (it is not part of base R). It reshapes a wide data frame into long format: the columns named as identifiers are kept as they are, and the remaining columns are stacked into a variable column and a value column.
To convert long data back into wide format, we can use a cast function. reshape2 provides acast, which returns a vector, matrix, or array, and dcast, which returns a data frame; we use dcast here because we want a data frame.
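As a rough illustration of the melt and dcast round trip described above, here is a minimal sketch. The toy data frame and its column names (participant, height, weight, measure) are made up for the example and are not from the question.

```r
library(reshape2)

# Toy wide data: one row per participant, one column per measurement.
# (Illustrative values only.)
wide <- data.frame(participant = c("a", "b"),
                   height = c(170, 182),
                   weight = c(65, 80))

# Wide -> long: each participant/measurement pair becomes its own row.
long <- melt(wide, id.vars = "participant",
             variable.name = "measure", value.name = "value")

# Long -> wide again: one column per level of `measure`.
back <- dcast(long, participant ~ measure, value.var = "value")

print(long)
print(back)
```

Naming value.var explicitly avoids the "Using ... as value column" message that dcast prints when it has to guess.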
Doesn't the following work?
library(reshape2)
dcast(d, id ~ time)
# Using var as value column: use value.var to override.
#   id         1          2         3
# 1  1 0.2869739 0.59591690 0.8989719
# 2  2 0.4533770 0.14741778        NA
# 3  3 0.1286770 0.02465634 0.7363114
## OR, to get rid of the message:
## dcast(d, id ~ time, value.var = "var")
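One difference from reshape is that dcast names the new columns after the time values only ("1", "2", "3"). If the var.1 style names matter, they can be added afterwards. The following is only a sketch: the seed is arbitrary and the prefix is pasted on by hand.

```r
library(reshape2)

set.seed(42)  # arbitrary seed, only to make the example reproducible
d <- data.frame(id = c(1,1,1,2,2,3,3,3),
                time = c(1,2,3,1,2,1,2,3),
                var = runif(8))

# Cast to wide, naming the value column explicitly to avoid the message.
wide <- dcast(d, id ~ time, value.var = "var")

# Prefix every column except `id` so the names look like reshape()'s.
names(wide)[-1] <- paste0("var.", names(wide)[-1])

# Base R result, for comparison of column names and values.
base_wide <- reshape(d, idvar = "id", timevar = "time", direction = "wide")

names(wide)
names(base_wide)
```

The row names still differ (reshape keeps the row number of each id's first occurrence), but the columns hold the same values.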
I suspect, though, that you're asking a slightly different question (as mentioned in my comment). In particular, what if you were starting with:
set.seed(1)
d <- data.frame(id = c(1,1,1,2,2,3,3,3),
                time = c(1,2,3,1,2,1,2,3),
                var1 = runif(8),
                var2 = runif(8))
With base R's reshape, it's just one line:
reshape(d, direction = "wide", idvar = "id", timevar = "time")
#   id    var1.1    var2.1    var1.2     var2.2    var1.3    var2.3
# 1  1 0.2655087 0.6291140 0.3721239 0.06178627 0.5728534 0.2059746
# 4  2 0.9082078 0.1765568 0.2016819 0.68702285        NA        NA
# 6  3 0.8983897 0.3841037 0.9446753 0.76984142 0.6607978 0.4976992
Let's try the same with dcast from "reshape2". Here's the approach we might be tempted to take:
library(reshape2)
dcast(d, id ~ time)
# Using var2 as value column: use value.var to override.
#   id         1          2         3
# 1  1 0.6291140 0.06178627 0.2059746
# 2  2 0.1765568 0.68702285        NA
# 3  3 0.3841037 0.76984142 0.4976992
But that doesn't give us what we want, because dcast expects a single value.var: it guesses var2 and silently drops var1. So, we need to melt the data first.
d2 <- melt(d, id.vars = c("id", "time"))
head(d2)
#   id time variable     value
# 1  1    1     var1 0.2655087
# 2  1    2     var1 0.3721239
# 3  1    3     var1 0.5728534
# 4  2    1     var1 0.9082078
# 5  2    2     var1 0.2016819
# 6  3    1     var1 0.8983897
Now, we can use dcast quite easily.
dcast(d2, id ~ variable + time)
#   id    var1_1    var1_2    var1_3    var2_1     var2_2    var2_3
# 1  1 0.2655087 0.3721239 0.5728534 0.6291140 0.06178627 0.2059746
# 2  2 0.9082078 0.2016819        NA 0.1765568 0.68702285        NA
# 3  3 0.8983897 0.9446753 0.6607978 0.3841037 0.76984142 0.4976992
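If this comes up often, the two steps can be wrapped in a small function. This is just a sketch: cast_wide is a made-up helper name (it is not part of reshape2), and it assumes every column other than the id and time columns is a measurement to be spread.

```r
library(reshape2)

set.seed(1)
d <- data.frame(id = c(1,1,1,2,2,3,3,3),
                time = c(1,2,3,1,2,1,2,3),
                var1 = runif(8),
                var2 = runif(8))

# Hypothetical helper: melt everything except the id and time columns,
# then cast so that each variable/time combination becomes a column.
cast_wide <- function(data, id, time) {
  long <- melt(data, id.vars = c(id, time))
  dcast(long, as.formula(paste(id, "~ variable +", time)),
        value.var = "value")
}

wide <- cast_wide(d, "id", "time")

# Base R result for comparison; the columns are ordered and named
# differently (var1.1, var2.1, ...), but hold the same values.
base_wide <- reshape(d, direction = "wide", idvar = "id", timevar = "time")

names(wide)
all.equal(wide$var2_3, base_wide$var2.3)
```

Building the formula with as.formula keeps the helper usable with other column names, at the cost of passing them in as strings.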