I have a data frame that I melted using the reshape package that I would like to "un melt".
here is a toy example of the melted data (real data frame is 500x100 or larger) :
variable<-c(rep("X1",3),rep("X2",3),rep("X3",3))
value<-c(rep(rnorm(1,.5,.2),3),rep(rnorm(1,.5,.2),3),rep(rnorm(1,.5,.2),3))
dat <-data.frame(variable,value)
dat
variable value
1 X1 0.5285376
2 X1 0.5285376
3 X1 0.5285376
4 X2 0.1694908
5 X2 0.1694908
6 X2 0.1694908
7 X3 0.7446906
8 X3 0.7446906
9 X3 0.7446906
Each variable (X1, X2,X3) has values estimated at 3 different times (which in this toy example happen to be the same, but this is never the case).
I would like to get it (back) in the form of :
X1 X2 X3
1 0.5285376 0.1694908 0.7446906
2 0.5285376 0.1694908 0.7446906
3 0.5285376 0.1694908 0.7446906
Basically, I would like the variable column to be sorted on ID (X1, X2 etc) and become column headings. I have tried various permutations of cast, dcast, recast, etc.. and cant seem to get the data in the format that I want. It was easy enough to 'melt' data from the wide form to the longer form (e.g. the dat datset), but getting it back is proving difficult. Any ideas? I know this is relatively simple, but I am having a hard time conceptualizing how to do this in reshape or reshape2.
Thanks, LP
In this case, the remaining data variables are treated as data values, and there are only two columns: variable and value. Unmelting, on the other hand, is performed on the data variables to return the values to their original format. After we’ve covered Melting and Unmelting data, let’s look at the Pandas functions that allow us to do the same.
The Exit of the Program. The pivot () function can be used to unmelt a DataFrame object and retrieve the original dataframe. The ‘index’ argument value of the pivot () function should be the same as the ‘id vars’ value. The value of ‘columns’ should be used as the name of the ‘variable’ column.
Pivoting, Unmelting or Reverse Melting is used to convert a column with multiple values into several columns of their own. Create a dataframe that contains the data on ID, Name, Marks and Sports of 6 students. Unmelting around the column Sports:
Reshaping plays a crucial role in data analysis. Pandas provide function like melt and unmelt for reshaping. melt () is used to convert a wide dataframe into a longer form. This function can be used when there are requirements to consider a specific column as an identifier.
I typically do this by creating an id column and then using dcast
:
> dat
variable value
1 X1 0.4299397
2 X1 0.4299397
3 X1 0.4299397
4 X2 0.2531551
5 X2 0.2531551
6 X2 0.2531551
7 X3 0.3972119
8 X3 0.3972119
9 X3 0.3972119
> dat$id <- rep(1:3,times = 3)
> dcast(data = dat,formula = id~variable,fun.aggregate = sum,value.var = "value")
id X1 X2 X3
1 1 0.4299397 0.2531551 0.3972119
2 2 0.4299397 0.2531551 0.3972119
3 3 0.4299397 0.2531551 0.3972119
Depending on how robust you need this to be , the following will correctly cast for varying number of occurrences of variables (and in any order).
> variable<-c(rep("X1",5),rep("X2",4),rep("X3",3))
> value<-c(rep(rnorm(1,.5,.2),5),rep(rnorm(1,.5,.2),4),rep(rnorm(1,.5,.2),3))
> dat <-data.frame(variable,value)
> dat <- dat[order(rnorm(nrow(dat))),]
> dat
variable value
11 X3 1.0294454
8 X2 0.6147509
2 X1 0.3537012
7 X2 0.6147509
9 X2 0.6147509
5 X1 0.3537012
4 X1 0.3537012
12 X3 1.0294454
3 X1 0.3537012
1 X1 0.3537012
10 X3 1.0294454
6 X2 0.6147509
> dat$id = numeric(nrow(dat))
> for (i in 1:nrow(dat)){
+ dat_temp <- dat[1:i,]
+ dat[i,]$id <- nrow(dat_temp[dat_temp$variable == dat[i,]$variable,])
+ }
> cast(dat, id~variable, value = 'value')
id X1 X2 X3
1 1 0.3537012 0.6147509 1.029445
2 2 0.3537012 0.6147509 1.029445
3 3 0.3537012 0.6147509 1.029445
4 4 0.3537012 0.6147509 NA
5 5 0.3537012 NA NA
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With