Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Restructure Data in R

I am just starting to get beyond the basics in R and have come to a point where I need some help. I want to restructure some data. Here is what a sample dataframe may look like:

ID  Sex Res Contact
1   M   MA  ABR
1   M   MA  CON
1   M   MA  WWF
2   F   FL  WIT
2   F   FL  CON
3   X   GA  XYZ

I want the data to look like:

ID  SEX Res ABR CON WWF WIT XYZ
1   M   MA  1   1   1   0   0
2   F   FL  0   1   0   1   0
3   X   GA  0   0   0   0   1

What are my options? How would I do this in R?

In short, I am looking to keep the values of the CONT column and use them as column names in the restructred data frame. I want to hold a variable set of columns constant (in th example above, I held ID, Sex, and Res constant).

Also, is it possible to control the values in the restructured data? I may want to keep the data as binary. I may want some data to have the value be the count of times each contact value exists for each ID.

like image 982
Btibert3 Avatar asked Aug 12 '10 17:08

Btibert3


People also ask

What does the reshape function do in R?

Description. This function reshapes a data frame between 'wide' format with repeated measurements in separate columns of the same record and 'long' format with the repeated measurements in separate records.

How do I convert long data to wide data in R?

To convert long data back into a wide format, we can use the cast function. There are many cast functions, but we will use the dcast function because it is used for data frames.


2 Answers

The reshape package is what you want. Documentation here: http://had.co.nz/reshape/. Not to toot my own horn, but I've also written up some notes on reshape's use here: http://www.ling.upenn.edu/~joseff/rstudy/summer2010_reshape.html

For your purpose, this code should work

library(reshape)
data$value <- 1
cast(data, ID + Sex + Res ~ Contact, fun = "length")
like image 73
JoFrhwld Avatar answered Oct 14 '22 16:10

JoFrhwld


model.matrix works great (this was asked recently, and gappy had this good answer):

> model.matrix(~ factor(d$Contact) -1)
  factor(d$Contact)ABR factor(d$Contact)CON factor(d$Contact)WIT factor(d$Contact)WWF factor(d$Contact)XYZ
1                    1                    0                    0                    0                    0
2                    0                    1                    0                    0                    0
3                    0                    0                    0                    1                    0
4                    0                    0                    1                    0                    0
5                    0                    1                    0                    0                    0
6                    0                    0                    0                    0                    1
attr(,"assign")
[1] 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$`factor(d$Contact)`
[1] "contr.treatment"
like image 36
Vince Avatar answered Oct 14 '22 18:10

Vince