
Reshape data by ID so that all values belonging to one person are in one row in R

Tags: r, reshape

I have a data frame that looks like this:

ID  V1 V2  V3
1   1  2   3
1   2  3   4
1   3  4   5
2   3  4   5
3   4  5   6
3   2  3   4

I need to reshape the data frame so that all the records belonging to one person appear in the same row, like this:

ID V1 V2 V3 V1_2 V2_2 V3_2 V1_3 V2_3 V3_3
1  1  2  3  2     3    4    3    4    5
2  3  4  5
3  4  5  6  2     3    4  

Because each person has a different number of records, the rows of the new data frame will have different numbers of filled-in values. How can I achieve this?

Asked by Wendy on Feb 18 '14 at 16:02



1 Answer

One straightforward way is with the reshape2 package, but you must add a secondary ID.

### The next line creates a secondary ID variable
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)

library(reshape2)
dfL <- melt(mydf, id.vars=c("ID", "ID2"))
dcast(dfL, ID ~ variable + ID2)
#   ID V1_1 V1_2 V1_3 V2_1 V2_2 V2_3 V3_1 V3_2 V3_3
# 1  1    1    2    3    2    3    4    3    4    5
# 2  2    3   NA   NA    4   NA   NA    5   NA   NA
# 3  3    4    2   NA    5    3   NA    6    4   NA
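
To try the code above, the example data has to exist first. The following is a minimal sketch that rebuilds `mydf` from the values shown in the question (the column types are an assumption, since the question only shows printed output) and displays what the `ave` step produces: a running record number within each ID.

```r
# Rebuild the example data; values are copied from the question,
# numeric column types are assumed
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  V1 = c(1, 2, 3, 3, 4, 2),
  V2 = c(2, 3, 4, 4, 5, 3),
  V3 = c(3, 4, 5, 5, 6, 4)
)

# Number the records within each ID group:
# 1, 2, 3 for ID 1; 1 for ID 2; 1, 2 for ID 3
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)
mydf$ID2
# [1] 1 2 3 1 1 2
```

This secondary ID is what lets each record of a person be mapped to its own set of columns in the wide format.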

Alternatively, after adding "ID2" as shown above, you can do the reshaping directly with base R's reshape. The column order is different, but the data is the same.

reshape(mydf, direction = "wide", idvar="ID", timevar="ID2")
#   ID V1.1 V2.1 V3.1 V1.2 V2.2 V3.2 V1.3 V2.3 V3.3
# 1  1    1    2    3    2    3    4    3    4    5
# 4  2    3    4    5   NA   NA   NA   NA   NA   NA
# 5  3    4    5    6    2    3    4   NA   NA   NA
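
If underscore-separated names closer to the ones in the question are wanted, one option is the `sep` argument of `reshape`. The sketch below is self-contained and uses only base R; the object name `wide` is just an illustration, and the example data is rebuilt from the values in the question.

```r
# Example data rebuilt from the question (numeric types assumed)
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  V1 = c(1, 2, 3, 3, 4, 2),
  V2 = c(2, 3, 4, 4, 5, 3),
  V3 = c(3, 4, 5, 5, 6, 4)
)

# Secondary ID: record number within each person
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)

# Reshape to wide, joining variable name and record number with "_"
wide <- reshape(mydf, direction = "wide", idvar = "ID",
                timevar = "ID2", sep = "_")

# Drop the leftover row names (1, 4, 5) from the long data
rownames(wide) <- NULL

names(wide)
# [1] "ID"   "V1_1" "V2_1" "V3_1" "V1_2" "V2_2" "V3_2" "V1_3" "V2_3" "V3_3"
```

People with fewer records still get `NA` in the unused columns, because a data frame cannot have rows of different lengths.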

Answered by A5C1D2H2I1M1N2O1R2T1 on Nov 03 '22 at 19:11