
Turning a data.frame into a single row

Tags: r, reshape, plyr

I have these data:

structure(list(type = c("journal", "all", "similar_age_1m", "similar_age_3m", 
"similar_age_journal_1m", "similar_age_journal_3m"), count = c("13972", 
"754555", "22408", "56213", "508", "1035"), rank = c("13759", 
"754043", "22339", "56074", "459", "947"), pct = c("98.48", "99.93", 
"99.69", "99.75", "90.35", "91.50")), .Names = c("type", "count", 
"rank", "pct"), row.names = c(NA, -6L), class = "data.frame")

I'd like to turn it into a single row, with the names of columns 2:4 prefixed by the corresponding type, e.g. journal.count, journal.rank, ... What is the fastest way to do this? For some reason dcast and reshape are not doing it for me, and my own solution is too cumbersome.

Maiasaura asked Sep 28 '12 20:09


4 Answers

You mentioned reshape2, so here is a way with that:

library("reshape2")
dcast(melt(dat, id.vars="type"), 1~variable+type)

That gives:

  1 count_all count_journal count_similar_age_1m count_similar_age_3m
1 1    754555         13972                22408                56213
  count_similar_age_journal_1m count_similar_age_journal_3m rank_all
1                          508                         1035   754043
  rank_journal rank_similar_age_1m rank_similar_age_3m
1        13759               22339               56074
  rank_similar_age_journal_1m rank_similar_age_journal_3m pct_all pct_journal
1                         459                         947   99.93       98.48
  pct_similar_age_1m pct_similar_age_3m pct_similar_age_journal_1m
1              99.69              99.75                      90.35
  pct_similar_age_journal_3m
1                      91.50

Note that the names come out with the variable first and are separated with _ instead of . (count_journal rather than journal.count), though.
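If the names need to match the journal.count form asked for in the question, one option is to rewrite them afterwards with a regular expression. This is only a sketch in base R: the vector nm below is a hand-copied sample of the column names shown above, not the full result, and the pattern assumes the variable names (count, rank, pct) never contain an underscore.

```r
# A few of the column names produced by the dcast() call above
# (hand-copied sample, used here only for illustration)
nm <- c("1", "count_all", "count_journal", "rank_similar_age_1m",
        "pct_similar_age_journal_3m")

# count/rank/pct contain no underscore, so the first "_" separates the
# variable from the type; swap the two parts and join them with "."
fixed <- sub("^([^_]+)_(.*)$", "\\2.\\1", nm)
print(fixed)
```

On the real result the same sub() call could be applied to names() of the data.frame returned by dcast. Names without an underscore, such as the placeholder column 1, are left unchanged.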

Brian Diggs answered Oct 18 '22 11:10


Here's another way:

y <- as.numeric(as.matrix(x[-1])) # flatten the data.frame
names(y) <- as.vector(outer(x[['type']], names(x)[-1], paste, sep='.'))
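The two lines above give a named numeric vector rather than a data.frame. The sketch below repeats them on the question's data and then wraps the result in a one-row data.frame. The name one_row is made up for this example, and as.data.frame(as.list(y)) is just one of several ways to do the conversion.

```r
# The question's data (all columns are character vectors)
x <- data.frame(
  type  = c("journal", "all", "similar_age_1m", "similar_age_3m",
            "similar_age_journal_1m", "similar_age_journal_3m"),
  count = c("13972", "754555", "22408", "56213", "508", "1035"),
  rank  = c("13759", "754043", "22339", "56074", "459", "947"),
  pct   = c("98.48", "99.93", "99.69", "99.75", "90.35", "91.50"),
  stringsAsFactors = FALSE
)

# Flatten column by column (all counts, then all ranks, then all pcts)
y <- as.numeric(as.matrix(x[-1]))

# outer() builds a type-by-variable matrix of names; as.vector() flattens
# it in the same column-major order, so names and values line up
names(y) <- as.vector(outer(x[["type"]], names(x)[-1], paste, sep = "."))

# Turn the named vector into a list to get one column per element
one_row <- as.data.frame(as.list(y))
dim(one_row)
one_row[["journal.count"]]
```

Because the flattening is column-major, the columns come out grouped by variable (journal.count, all.count, ...) rather than by type.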

Matthew Plourde answered Oct 18 '22 11:10


If you are OK with adding a dummy "id" variable for the reshaping, you can also do this easily with base R. Assuming your data.frame is called mydf:

mydf$id <- 1
(mydfw <- reshape(mydf, direction = "wide", idvar="id", timevar="type"))
#   id count.journal rank.journal pct.journal count.all rank.all pct.all
# 1  1         13972        13759       98.48    754555   754043   99.93
#   count.similar_age_1m rank.similar_age_1m pct.similar_age_1m
# 1                22408               22339              99.69
#   count.similar_age_3m rank.similar_age_3m pct.similar_age_3m
# 1                56213               56074              99.75
#   count.similar_age_journal_1m rank.similar_age_journal_1m
# 1                          508                         459
#   pct.similar_age_journal_1m count.similar_age_journal_3m
# 1                      90.35                         1035
#   rank.similar_age_journal_3m pct.similar_age_journal_3m
# 1                         947                      91.50

Cleanup is not too bad either, if you want to reorder your columns.

mydfw <- mydfw[, unlist(sapply(names(mydf), grep, names(mydfw)))]
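reshape() names the wide columns variable.type (count.journal), while the question asked for type.variable (journal.count). A possible follow-up, sketched here on only the first two rows of the question's data to keep it short, is to swap the two parts of each name with sub(). The pattern assumes the variable names contain no dot.

```r
# First two rows of the question's data, plus the dummy id column
mydf <- data.frame(
  type  = c("journal", "all"),
  count = c("13972", "754555"),
  rank  = c("13759", "754043"),
  pct   = c("98.48", "99.93"),
  stringsAsFactors = FALSE
)
mydf$id <- 1

mydfw <- reshape(mydf, direction = "wide", idvar = "id", timevar = "type")

# Swap the parts around the first "." so the type comes first;
# "id" has no dot and is left as it is
names(mydfw) <- sub("^([^.]+)\\.(.*)$", "\\2.\\1", names(mydfw))
names(mydfw)
```

Without the reordering step, the columns are already grouped by type, so after the rename they read journal.count, journal.rank, journal.pct, all.count, and so on.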

A5C1D2H2I1M1N2O1R2T1 answered Oct 18 '22 10:10


Here's a solution using expand.grid to build the names.

For the data, first drop the first column, which contains the types. Then transpose and convert to numeric.

> eg <- expand.grid(colnames(x[, -1]), x[, 1])
> setNames(as.numeric(t(x[, -1])), paste(eg[[2]], eg[[1]], sep="."))
               journal.count                 journal.rank 
                    13972.00                     13759.00 
                 journal.pct                    all.count 
                       98.48                    754555.00 
                    all.rank                      all.pct 
                   754043.00                        99.93 
        similar_age_1m.count          similar_age_1m.rank 
                    22408.00                     22339.00 
          similar_age_1m.pct         similar_age_3m.count 
                       99.69                     56213.00 
         similar_age_3m.rank           similar_age_3m.pct 
                    56074.00                        99.75 
similar_age_journal_1m.count  similar_age_journal_1m.rank 
                      508.00                       459.00 
  similar_age_journal_1m.pct similar_age_journal_3m.count 
                       90.35                      1035.00 
 similar_age_journal_3m.rank   similar_age_journal_3m.pct 
                      947.00                        91.50 
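The names and values line up because both are generated in the same order: expand.grid varies its first argument fastest, and flattening a transposed matrix walks through the original row by row. A small check of that reasoning, using made-up object names (vars, types, m) and only the first two types from the question:

```r
vars  <- c("count", "rank", "pct")
types <- c("journal", "all")

# First argument varies fastest: all variables for "journal", then for "all"
eg  <- expand.grid(vars, types, stringsAsFactors = FALSE)
nms <- paste(eg[[2]], eg[[1]], sep = ".")

# Rows are types, columns are variables (values taken from the question)
m <- rbind(c(13972, 13759, 98.48),
           c(754555, 754043, 99.93))

# t() puts each type in a column, so column-major flattening
# returns the values one type at a time
vals <- as.numeric(t(m))

setNames(vals, nms)
```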

GSee answered Oct 18 '22 10:10