Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

stacking columns in data.frame into one column in R

I'm having trouble stacking columns in a data.frame into one column. Now my data looks something like this:

id   time    black   white   red 
a     1       b1      w1     r1
a     2       b2      w2     r2
a     3       b3      w3     r3
b     1       b4      w4     r4
b     2       b5      w5     r5
b     3       b6      w6     r6

I'm trying to transform the data.frame so that it looks like this:

id   time  colour 
a     1     b1
a     2     b2
a     3     b3
b     1     b4
b     2     b5
b     3     b6
a     1     w1
a     2     w2
a     3     w3
b     1     w4
b     2     w5
b     3     w6
a     1     r1
a     2     r2
.     .     .
.     .     .
.     .     .

I'm guessing that this problem requires using the reshape package, but I'm not exactly sure how to use it to stack multiple columns under one column. Can anyone provide help on this?

like image 671
econlearner Avatar asked Nov 28 '12 03:11

econlearner


People also ask

How do I combine two columns in a DataFrame in R?

How do I concatenate two columns in R? To concatenate two columns you can use the <code>paste()</code> function. For example, if you want to combine the two columns A and B in the dataframe df you can use the following code: <code>df['AB'] <- paste(df$A, df$B)</code>.

How do you join columns in R?

2. Using dplyr to Join Multiple Columns in R. Using join functions from dplyr package is the best approach to join data frames on multiple columns in R, all dplyr join functions inner_join(), left_join(), right_join(), full_join(), anti_join(), semi_join() support joining on multiple columns.

Can you group columns in R?

Group By Multiple Columns in R using dplyrUse group_by() function in R to group the rows in DataFrame by multiple columns (two or more), to use this function, you have to install dplyr first using install. packages('dplyr') and load it using library(dplyr) .


2 Answers

Since you mention "stacking" in your title, you can also look at the stack function in base R:

cbind(mydf[1:2], stack(mydf[3:5]))
#    id time values   ind
# 1   a    1     b1 black
# 2   a    2     b2 black
# 3   a    3     b3 black
# 4   b    1     b4 black
# 5   b    2     b5 black
# 6   b    3     b6 black
# 7   a    1     w1 white
# 8   a    2     w2 white
# 9   a    3     w3 white
# 10  b    1     w4 white
# 11  b    2     w5 white
# 12  b    3     w6 white
# 13  a    1     r1   red
# 14  a    2     r2   red
# 15  a    3     r3   red
# 16  b    1     r4   red
# 17  b    2     r5   red
# 18  b    3     r6   red

If the values in the "black", "white", and "red" columns are factors, you'll need to convert them to character values first.

cbind(mydf[1:2], stack(lapply(mydf[3:5], as.character)))
like image 170
A5C1D2H2I1M1N2O1R2T1 Avatar answered Oct 23 '22 11:10

A5C1D2H2I1M1N2O1R2T1


Here's melt from reshape:

library(reshape)
melt(x, id.vars=c('id', 'time'),var='color')

And using reshape2 (an up-to-date, faster version of reshape) the syntax is almost identical.

The help files have useful examples (see ?melt and the link to melt.data.frame).

In your case, something like the following will work (assuming your data.frame is called DF)

library(reshape2)
melt(DF, id.var = c('id','time'), variable.name = 'colour')
like image 28
Matthew Lundberg Avatar answered Oct 23 '22 10:10

Matthew Lundberg