Reshape data by ID so that all values belonging to one person are in one row in R

Tags:

r

reshape

I have a data frame that looks like this:

ID  V1 V2  V3
1   1  2   3
1   2  3   4
1   3  4   5
2   3  4   5
3   4  5   6
3   2  3   4

I need to reshape the data frame so that all the records belonging to one person appear in the same row, like this:

ID V1 V2 V3 V1_2 V2_2 V3_2 V1_3 V2_3 V3_3
1  1  2  3  2     3    4    3    4    5
2  3  4  5
3  4  5  6  2     3    4

Because each person has a different number of records, the rows of the new data frame will hold different numbers of values. How can I achieve this?

asked Feb 18 '14 16:02

Wendy

1 Answer

One straightforward way is with the reshape2 package, but you must first add a secondary ID that numbers the records within each ID.

### The next line creates a secondary ID variable
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)

library(reshape2)
dfL <- melt(mydf, id.vars=c("ID", "ID2"))
dcast(dfL, ID ~ variable + ID2)
#   ID V1_1 V1_2 V1_3 V2_1 V2_2 V2_3 V3_1 V3_2 V3_3
# 1  1    1    2    3    2    3    4    3    4    5
# 2  2    3   NA   NA    4   NA   NA    5   NA   NA
# 3  3    4    2   NA    5    3   NA    6    4   NA
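To try the secondary ID step on its own, here is a minimal sketch in base R. It assumes the data frame is called mydf (as in the code above) and rebuilds it by hand from the values in the question; the exact column types in the asker's data are not known, so numeric columns are an assumption.

```r
# Rebuild the example data from the question.
# Assumption: all columns are numeric; the name `mydf` follows the answer.
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  V1 = c(1, 2, 3, 3, 4, 2),
  V2 = c(2, 3, 4, 4, 5, 3),
  V3 = c(3, 4, 5, 5, 6, 4)
)

# ave() applies FUN separately within each group of ID values;
# seq_along() then numbers the records 1, 2, 3, ... inside each group.
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)

mydf$ID2
# [1] 1 2 3 1 1 2
```

The counter restarts at 1 for every new ID, which is what lets both reshaping functions tell the first, second, and third record of a person apart.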

Alternatively, after adding "ID2" as shown above, you can do the reshaping directly with base R's reshape. The column order differs, but the data are the same.

reshape(mydf, direction = "wide", idvar="ID", timevar="ID2")
#   ID V1.1 V2.1 V3.1 V1.2 V2.2 V3.2 V1.3 V2.3 V3.3
# 1  1    1    2    3    2    3    4    3    4    5
# 4  2    3    4    5   NA   NA   NA   NA   NA   NA
# 5  3    4    5    6    2    3    4   NA   NA   NA
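If the column names from the question (V1, V2, V3, V1_2, ...) are wanted, the base R call can be adjusted slightly. The following is a sketch rather than part of the original answer: it uses reshape's sep argument to get underscores and then strips the "_1" suffix with sub(); the object name wide is made up for illustration. Note that a data frame is rectangular, so people with fewer records still get every column, filled with NA.

```r
# Same example data and secondary ID as above (assumed numeric columns).
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 3, 3),
  V1 = c(1, 2, 3, 3, 4, 2),
  V2 = c(2, 3, 4, 4, 5, 3),
  V3 = c(3, 4, 5, 5, 6, 4)
)
mydf$ID2 <- ave(mydf$ID, mydf$ID, FUN = seq_along)

# sep = "_" makes reshape() build names like V1_1, V1_2 instead of V1.1, V1.2.
wide <- reshape(mydf, direction = "wide", idvar = "ID", timevar = "ID2",
                sep = "_")

# Drop the "_1" suffix so the first record's columns are plain V1, V2, V3.
names(wide) <- sub("_1$", "", names(wide))

# Reset the row names (reshape keeps 1, 4, 5 from the original rows).
rownames(wide) <- NULL

wide
# Columns are now: ID V1 V2 V3 V1_2 V2_2 V3_2 V1_3 V2_3 V3_3
```

The pattern "_1$" only matches a suffix that is exactly "_1", so names such as V1_2 or a hypothetical V1_11 are left alone.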

answered Nov 03 '22 19:11

A5C1D2H2I1M1N2O1R2T1
