How to "unmelt" data with reshape r

Tags: r, reshape, reshape2

I have a data frame that I melted using the reshape package and would now like to "unmelt".

Here is a toy example of the melted data (the real data frame is 500x100 or larger):

variable <- c(rep("X1", 3), rep("X2", 3), rep("X3", 3))
value <- c(rep(rnorm(1, .5, .2), 3), rep(rnorm(1, .5, .2), 3), rep(rnorm(1, .5, .2), 3))
dat <- data.frame(variable, value)
dat
  variable     value
1       X1 0.5285376
2       X1 0.5285376
3       X1 0.5285376
4       X2 0.1694908
5       X2 0.1694908
6       X2 0.1694908
7       X3 0.7446906
8       X3 0.7446906
9       X3 0.7446906

Each variable (X1, X2, X3) has values estimated at 3 different times (in this toy example the values happen to be the same, but that is never the case in the real data).

I would like to get it (back) into this form:

         X1        X2        X3
1 0.5285376 0.1694908 0.7446906
2 0.5285376 0.1694908 0.7446906
3 0.5285376 0.1694908 0.7446906

Basically, I would like the variable column to be sorted on ID (X1, X2, etc.) and become the column headings. I have tried various permutations of cast, dcast, recast, etc. and can't seem to get the data into the format that I want. It was easy enough to 'melt' the data from the wide form to the longer form (e.g. the dat dataset), but getting it back is proving difficult. Any ideas? I know this is relatively simple, but I am having a hard time conceptualizing how to do this in reshape or reshape2.

Thanks, LP


asked Sep 19 '14 13:09

LP_640

2 Answers

I typically do this by creating an id column and then using dcast:

> library(reshape2)
> dat
  variable     value
1       X1 0.4299397
2       X1 0.4299397
3       X1 0.4299397
4       X2 0.2531551
5       X2 0.2531551
6       X2 0.2531551
7       X3 0.3972119
8       X3 0.3972119
9       X3 0.3972119
> dat$id <- rep(1:3,times = 3)
> dcast(data = dat,formula = id~variable,fun.aggregate = sum,value.var = "value")
  id        X1        X2        X3
1  1 0.4299397 0.2531551 0.3972119
2  2 0.4299397 0.2531551 0.3972119
3  3 0.4299397 0.2531551 0.3972119
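The id column matters because it tells dcast which values belong on the same output row; without a row identifier, several values land in one cell and reshape2 falls back to aggregating them. The sketch below replays the same idea on reproducible toy data and then drops the helper column to get the layout asked for in the question. The seed, the use of 9 distinct random values, and the name `wide` are my own choices for illustration, and `rep(1:3, times = 3)` is only correct under the assumption that rows are sorted by variable and every variable has exactly 3 rows.

```r
library(reshape2)  # provides dcast()

set.seed(1)  # arbitrary seed so the toy data is reproducible

# Three variables measured at three time points, in long ("melted") form
dat <- data.frame(
  variable = rep(c("X1", "X2", "X3"), each = 3),
  value    = rnorm(9, mean = 0.5, sd = 0.2)
)

# Row identifier: the 1st, 2nd and 3rd measurement of each variable.
# Assumes rows are sorted by variable and all groups have 3 rows.
dat$id <- rep(1:3, times = 3)

# Each (id, variable) pair occurs exactly once, so no aggregation is needed
wide <- dcast(dat, id ~ variable, value.var = "value")

# Drop the helper column to keep only X1, X2, X3
wide$id <- NULL
wide
```

Because every (id, variable) cell holds a single value here, `fun.aggregate = sum` in the call above does not change the result; it only matters if a cell ends up with more than one value.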


answered Sep 20 '22 17:09

joran

Depending on how robust you need this to be, the following will cast correctly even when the variables occur a varying number of times, and in any row order.

> library(reshape)
> variable<-c(rep("X1",5),rep("X2",4),rep("X3",3))
> value<-c(rep(rnorm(1,.5,.2),5),rep(rnorm(1,.5,.2),4),rep(rnorm(1,.5,.2),3))
> dat <-data.frame(variable,value)
> dat <- dat[order(rnorm(nrow(dat))),]
> dat
   variable     value
11       X3 1.0294454
8        X2 0.6147509
2        X1 0.3537012
7        X2 0.6147509
9        X2 0.6147509
5        X1 0.3537012
4        X1 0.3537012
12       X3 1.0294454
3        X1 0.3537012
1        X1 0.3537012
10       X3 1.0294454
6        X2 0.6147509
> dat$id = numeric(nrow(dat))
> for (i in 1:nrow(dat)){
+   dat_temp <- dat[1:i,]
+   dat[i,]$id <- nrow(dat_temp[dat_temp$variable == dat[i,]$variable,])
+ }
> cast(dat, id~variable, value = 'value')
  id        X1        X2       X3
1  1 0.3537012 0.6147509 1.029445
2  2 0.3537012 0.6147509 1.029445
3  3 0.3537012 0.6147509 1.029445
4  4 0.3537012 0.6147509       NA
5  5 0.3537012        NA       NA
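The loop above gives each row a running count of how often its variable has appeared so far. As a possible alternative for larger data, the same id can be computed without a loop using base R's `ave()`. This is only a sketch on made-up data: the seed, the use of `sample()` for shuffling, and the name `id_loop` are my own assumptions, and the loop is kept purely to check that both approaches agree.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Unbalanced long data (5, 4 and 3 rows per variable), then shuffled
dat <- data.frame(
  variable = c(rep("X1", 5), rep("X2", 4), rep("X3", 3)),
  value    = rnorm(12, mean = 0.5, sd = 0.2)
)
dat <- dat[sample(nrow(dat)), ]

# Running count of each variable's occurrences, in current row order:
# ave() splits the row positions by variable and numbers each group 1, 2, ...
dat$id <- ave(seq_len(nrow(dat)), dat$variable, FUN = seq_along)

# Loop-based count in the spirit of the answer, used only as a cross-check
id_loop <- numeric(nrow(dat))
for (i in seq_len(nrow(dat))) {
  id_loop[i] <- sum(dat$variable[1:i] == dat$variable[i])
}
stopifnot(all(dat$id == id_loop))
```

The resulting id column can then be passed to `cast(dat, id ~ variable, value = 'value')` exactly as in the answer.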

answered Sep 19 '22 17:09

Leo
