Melt and cast data table using pattern

Tags: r, data.table

The data.table package added a new feature to melt data into multiple columns simultaneously. This is very useful, but I can't figure out how to preserve the "suffix" of the pre-melted variable names. For example:

library(data.table)

# create data table
dt <- data.table(id = seq(3), a_3 = seq(3), a_4 = seq(4, 6), b_3 = seq(7, 9), b_4 = seq(10, 12))

# melt and cast in one step using new feature
m1 <- melt(dt, id.vars='id', measure=patterns("a_", "b_"), value.name=c("a_", "b_"))

Results in the data table:

   id variable a_ b_
1:  1        1  1  7
2:  2        1  2  8
3:  3        1  3  9
4:  1        2  4 10
5:  2        2  5 11
6:  3        2  6 12

This is the "shape" I want, but the variables a_3, a_4, b_3 and b_4 have been indexed 1 and 2. What I want is the variable column to contain 3,3,3,4,4,4, according to the suffixes of the variable names.

I could obviously do this the "old-fashioned" way with melt, strsplit, dcast, but that's kind of cumbersome. I'm hoping for a one-line solution that's still very fast.
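For reference, the longer melt / split / dcast route could look roughly like the sketch below. It assumes every measure column is named stub_suffix with a single underscore and an integer suffix; the names `long`, `wide`, `stub` and `suffix` are only illustrative.

```r
library(data.table)

dt <- data.table(id = seq(3), a_3 = seq(3), a_4 = seq(4, 6),
                 b_3 = seq(7, 9), b_4 = seq(10, 12))

# 1. Melt everything except id into one long column; keep names as character
long <- melt(dt, id.vars = "id", variable.factor = FALSE)

# 2. Split "a_3" into stub "a" and suffix "3" (assumes exactly one "_")
long[, c("stub", "suffix") := tstrsplit(variable, "_", fixed = TRUE)]
long[, suffix := as.integer(suffix)]

# 3. Cast the stubs back out into their own columns
wide <- dcast(long, id + suffix ~ stub, value.var = "value")
print(wide)
```

This gives one row per id and suffix with columns a and b, but it takes three steps and builds a fully long intermediate table, which is what I would like to avoid.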

asked Jan 27 '16 21:01 by dmp



1 Answer

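We can do this with the splitstackshape package. Its merged.stack() function creates the '.time_1' column automatically, and that column holds the suffixes of the original variable names.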

library(splitstackshape)
merged.stack(dt, var.stubs=c("a", "b"), sep="_")
#   id .time_1 a  b
#1:  1       3 1  7
#2:  1       4 4 10
#3:  2       3 2  8
#4:  2       4 5 11
#5:  3       3 3  9
#6:  3       4 6 12
answered Sep 22 '22 08:09 by akrun
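If an extra package is not an option, a possible data.table-only workaround is to keep the patterns() melt from the question and then translate the 1, 2 index in `variable` back into the suffixes. The sketch below is an assumption-laden illustration rather than an established recipe: it assumes the a_ and b_ columns appear in the same suffix order in `dt`, and the helper names `a_cols` and `suffixes` are made up for this example.

```r
library(data.table)

dt <- data.table(id = seq(3), a_3 = seq(3), a_4 = seq(4, 6),
                 b_3 = seq(7, 9), b_4 = seq(10, 12))

# Suffixes in column order, taken from the first stub ("3", "4")
a_cols   <- grep("^a_", names(dt), value = TRUE)
suffixes <- sub("^a_", "", a_cols)

# Same melt as in the question; variable comes back as a factor with levels 1, 2
m1 <- melt(dt, id.vars = "id",
           measure.vars = patterns("^a_", "^b_"),
           value.name = c("a", "b"))

# Replace the positional index with the matching suffix
# (assumes a_* and b_* share the same suffix order)
m1[, variable := as.integer(suffixes[as.integer(variable)])]
print(m1)
```

The row order stays as melt() produced it, so `variable` reads 3, 3, 3, 4, 4, 4. If the suffix order differs between stubs, this mapping would silently mislabel rows, so it is worth checking the column names first.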