How to reshape this dataframe with the reshape package [duplicate]

Tags: r, reshape

I have quite a large data frame structured like this:

id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v 
 1     2     4     5    10    20    15   200   150   170   2.5
 2     3     7     6    25    35    40   300   350   400   4.2

I need to create a dataframe like this:

id   xsource   xvalue   yvalue   zvalue       v 
 1        x1        2       10      200     2.5
 1        x2        4       20      150     2.5
 1        x3        5       15      170     2.5
 2        x1        3       25      300     4.2
 2        x2        7       35      350     4.2
 2        x3        6       40      400     4.2

I'm quite sure I have to do it with the reshape package, but I'm not able to get what I want.

Could you help me?

Thanks

asked Jan 13 '12 15:01

corrado

3 Answers

Here's the reshape() solution.

The key bit is that the varying= argument can take a list of vectors of column names in the wide format, where each vector corresponds to a single variable in the long format. In this case, columns "x1", "x2", "x3" in the original data frame are sent to one column in the long data frame, columns "y1", "y2", "y3" go into a second column, and so on.

# Read in the original data, x, from Andrie's answer

res <- reshape(x, direction = "long", idvar = "id",
               varying = list(c("x1","x2", "x3"), 
                              c("y1", "y2", "y3"), 
                              c("z1", "z2", "z3")),
               v.names = c("xvalue", "yvalue", "zvalue"), 
               timevar = "xsource", times = c("x1", "x2", "x3"))
#      id   v xsource xvalue yvalue zvalue
# 1.x1  1 2.5      x1      2     10    200
# 2.x1  2 4.2      x1      3     25    300
# 1.x2  1 2.5      x2      4     20    150
# 2.x2  2 4.2      x2      7     35    350
# 1.x3  1 2.5      x3      5     15    170
# 2.x3  2 4.2      x3      6     40    400

Finally, a couple of purely cosmetic steps are needed to get the results looking exactly as shown in your question:

res <- res[order(res$id, res$xsource), c(1,3,4,5,6,2)]
row.names(res) <- NULL
res
#   id xsource xvalue yvalue zvalue   v
# 1  1      x1      2     10    200 2.5
# 2  1      x2      4     20    150 2.5
# 3  1      x3      5     15    170 2.5
# 4  2      x1      3     25    300 4.2
# 5  2      x2      7     35    350 4.2
# 6  2      x3      6     40    400 4.2
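As a shorter variant of the same idea, here is a minimal sketch that lets reshape() work out the groups from the column names instead of listing them by hand. It assumes the wide columns follow a strict letter-then-digit naming pattern like x1 or y2, and the column name "rep" is just an illustrative choice. With sep = "", reshape() splits each name between the letter and the digit, so the result has columns x, y and z rather than xvalue, yvalue and zvalue.

```r
# Same data as in the question
x <- read.table(text = "
id x1 x2 x3 y1 y2 y3 z1 z2 z3 v
1 2 4 5 10 20 15 200 150 170 2.5
2 3 7 6 25 35 40 300 350 400 4.2
", header = TRUE)

# varying = 2:10 selects columns x1..z3 by position;
# sep = "" asks reshape() to split names such as "x1" into "x" and 1
res2 <- reshape(x, direction = "long", idvar = "id",
                varying = 2:10, sep = "", timevar = "rep")

# Cosmetic: sort by id and replicate, and drop the generated row names
res2 <- res2[order(res2$id, res2$rep), ]
row.names(res2) <- NULL
res2
```

If the exact xsource labels from the question are needed, they can be rebuilt afterwards with something like paste0("x", res2$rep).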

answered Oct 22 '22 01:10

Josh O'Brien

Here's one approach that uses reshape2 and is described in depth in my paper on tidy data.

Step 1: identify the variables that are already in columns. In this case: id and v. These are the variables we melt by:

library(reshape2)
xm <- melt(x, c("id", "v"))

Step 2: split up variables that are currently combined in one column. In this case that's source (the character part) and rep (the integer part).

There are lots of ways to do this; I'm going to use string extraction with the stringr package:

library(stringr)
xm$source <- str_sub(xm$variable, 1, 1)
xm$rep <- str_sub(xm$variable, 2, 2)
xm$variable <- NULL
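Note that str_sub(..., 1, 1) and str_sub(..., 2, 2) rely on every column name being exactly one letter followed by one digit. If that does not hold for your data, one alternative is a regular-expression split in base R. The sketch below is only an illustration: the vector `variable` and the names `src` and `rep_id` are made up for the example and are not part of the original answer.

```r
# Hypothetical column names, including a multi-letter prefix and a two-digit suffix
variable <- c("x1", "x2", "x3", "y1", "y10", "zeta2")

# Drop the trailing digits to get the source part
src <- sub("[0-9]+$", "", variable)

# Drop the leading non-digits to get the replicate number
rep_id <- as.integer(sub("^[^0-9]+", "", variable))

data.frame(variable, src, rep_id)
```

Applied to the melted data, the same two sub() calls would be run on as.character(xm$variable).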

Step 3: rearrange the variables that are currently in the rows but that we want in columns:

dcast(xm, ... ~ source)

#   id   v rep x  y   z
# 1  1 2.5   1 2 10 200
# 2  1 2.5   2 4 20 150
# 3  1 2.5   3 5 15 170
# 4  2 4.2   1 3 25 300
# 5  2 4.2   2 7 35 350
# 6  2 4.2   3 6 40 400

answered Oct 22 '22 01:10

hadley

Somebody please prove me wrong, but I don't think it's easy to solve this problem using either the reshape package or the base reshape function.

However, it's easy enough using lapply and do.call:

Replicate the data:

x <- read.table(text="
id    x1    x2    x3    y1    y2    y3    z1    z2    z3     v 
1     2     4     5    10    20    15   200   150   170   2.5
2     3     7     6    25    35    40   300   350   400   4.2
", header=TRUE)

Do the analysis:

chunks <- lapply(1:nrow(x), 
    function(i)cbind(x[i, 1], 1:3, matrix(x[i, 2:10], ncol=3), x[i, 11]))
res <- do.call(rbind, chunks)
colnames(res) <- c("id", "source", "x", "y", "z", "v")
res

     id source x y  z   v  
[1,] 1  1      2 10 200 2.5
[2,] 1  2      4 20 150 2.5
[3,] 1  3      5 15 170 2.5
[4,] 2  1      3 25 300 4.2
[5,] 2  2      7 35 350 4.2
[6,] 2  3      6 40 400 4.2
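The result above is a matrix whose cells are list elements, because x[i, 2:10] is a one-row data frame rather than a plain vector. If an ordinary data frame with numeric columns is preferred, the same lapply and do.call idea can be sketched as follows. This is a variation on the code above, not a replacement for it, and the name `res_df` is just an illustrative choice.

```r
# Same data as above
x <- read.table(text = "
id x1 x2 x3 y1 y2 y3 z1 z2 z3 v
1 2 4 5 10 20 15 200 150 170 2.5
2 3 7 6 25 35 40 300 350 400 4.2
", header = TRUE)

chunks <- lapply(seq_len(nrow(x)), function(i) {
  # unlist() turns the one-row slice into a numeric vector, so matrix()
  # fills column-wise: x1..x3, then y1..y3, then z1..z3
  vals <- matrix(unlist(x[i, 2:10]), ncol = 3)
  data.frame(id = x$id[i], source = 1:3,
             x = vals[, 1], y = vals[, 2], z = vals[, 3],
             v = x$v[i])
})

# Stack the per-row data frames into one long data frame
res_df <- do.call(rbind, chunks)
res_df
```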

answered Oct 22 '22 02:10

Andrie
