Creating longitudinal datasets with reshape

I have the dataset:

top100_repository_name  month   monthly_increase    monthly_begin_at    monthly_end_with
Bukkit                  2012-03 9                   431                 440
Bukkit                  2012-04 19                  438                 457
Bukkit                  2012-05 19                  455                 474
CodeIgniter             2012-03 15                  492                 507
CodeIgniter             2012-04 50                  506                 556
CodeIgniter             2012-05 19                  555                 574

I use the following R code:

library(reshape)
latent.growth.data <- read.csv(file = "LGC_data.csv", header = TRUE)
melt(latent.growth.data, id = c("top100_repository_name", "month"), measured = c("monthly_end_with"))
cast(latent.growth.data, top100_repository_name + month ~ monthly_end_with)

I want to use it to create a dataset with the following structure:

top100_repository_name    2012-03    2012-04    2012-05
Bukkit                    440        457        474
CodeIgniter               507        556        574

However, when I run my code I get the following output:

Using monthly_end_with as value column.  Use the value argument to cast to override this choice
Error in `[.data.frame`(data, , variables, drop = FALSE) : 
  undefined columns selected

How can I modify my code so that I generate the desired output?

histelheim asked Dec 04 '25 17:12



1 Answer

Someone will be along soon with a plyr solution, I'm sure, but here is a base R solution using the reshape function (the one in the stats package, not the reshape package).

test <- read.table(textConnection("top100_repository_name  month   monthly_increase    monthly_begin_at    monthly_end_with
Bukkit                  2012-03 9                   431                 440
Bukkit                  2012-04 19                  438                 457
Bukkit                  2012-05 19                  455                 474
CodeIgniter             2012-03 15                  492                 507
CodeIgniter             2012-04 50                  506                 556
CodeIgniter             2012-05 19                  555                 574"),header=TRUE)

Reshape the data to wide format, keeping only the id, time, and value columns:

test2 <- reshape(
    test[c("top100_repository_name","month","monthly_end_with")],
    idvar="top100_repository_name",
    timevar="month",
    direction="wide"
)

Fix the names by stripping the monthly_end_with. prefix that reshape adds to each new column:

names(test2) <- sub("^monthly_end_with\\.", "", names(test2))
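
In a gsub or sub pattern an unescaped dot matches any character, so escaping it and anchoring the pattern at the start keeps the replacement limited to the literal prefix. A minimal sketch of the difference, using hypothetical column names that are not in the original data:

```r
# Hypothetical column names, chosen only to show where the two patterns differ.
cols <- c("monthly_end_with.2012-03", "monthly_end_with_total")

# Unescaped dot: "." matches any character, so the "_" in the second name
# is treated as part of the prefix and removed as well.
loose <- gsub("monthly_end_with.", "", cols)

# Escaped dot, anchored at the start: only the literal prefix
# "monthly_end_with." is removed, and the second name is left alone.
strict <- sub("^monthly_end_with\\.", "", cols)

print(loose)
print(strict)
```

For the data in this question both patterns give the same result; the stricter form only matters if other columns happen to share the prefix.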

The result looks like:

> test2
  top100_repository_name 2012-03 2012-04 2012-05
1                 Bukkit     440     457     474
4            CodeIgniter     507     556     574
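
As an alternative sketch (not part of the approach above), the same wide layout can be built in base R by cross-tabulating with tapply. This assumes each repository/month pair occurs exactly once, as it does in the sample data; the names wide_matrix and wide are made up for illustration.

```r
# Same six rows as the sample data, typed in directly so the block is self-contained.
test <- data.frame(
  top100_repository_name = rep(c("Bukkit", "CodeIgniter"), each = 3),
  month = rep(c("2012-03", "2012-04", "2012-05"), times = 2),
  monthly_end_with = c(440, 457, 474, 507, 556, 574),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are repositories, columns are months.
# Assumption: each (repository, month) cell holds exactly one value,
# so sum() simply returns that value.
wide_matrix <- with(test, tapply(monthly_end_with,
                                 list(top100_repository_name, month),
                                 sum))

# Turn the matrix into a data frame with the repository as an ordinary column.
# check.names = FALSE keeps column names such as "2012-03" unchanged.
wide <- data.frame(top100_repository_name = rownames(wide_matrix),
                   wide_matrix,
                   check.names = FALSE,
                   row.names = NULL)
print(wide)
```

If a repository had more than one row for the same month, sum() would add the values together, so a different aggregation function might be needed in that case.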
thelatemail answered Dec 06 '25 08:12
