I have these data:
structure(list(type = c("journal", "all", "similar_age_1m", "similar_age_3m",
"similar_age_journal_1m", "similar_age_journal_3m"), count = c("13972",
"754555", "22408", "56213", "508", "1035"), rank = c("13759",
"754043", "22339", "56074", "459", "947"), pct = c("98.48", "99.93",
"99.69", "99.75", "90.35", "91.50")), .Names = c("type", "count",
"rank", "pct"), row.names = c(NA, -6L), class = "data.frame")
I'd like to turn it into a single row, with the names of columns 2:4 prefixed by the corresponding type, e.g. journal.count, journal.rank, ... What is the fastest way to do this? For some reason dcast and reshape are not doing it for me, and my own solution is a little too cumbersome.
You mentioned dcast, so here is a way with reshape2 (assuming your data.frame is called dat):
library("reshape2")
dcast(melt(dat, id.vars="type"), 1~variable+type)
That gives:
1 count_all count_journal count_similar_age_1m count_similar_age_3m
1 1 754555 13972 22408 56213
count_similar_age_journal_1m count_similar_age_journal_3m rank_all
1 508 1035 754043
rank_journal rank_similar_age_1m rank_similar_age_3m
1 13759 22339 56074
rank_similar_age_journal_1m rank_similar_age_journal_3m pct_all pct_journal
1 459 947 99.93 98.48
pct_similar_age_1m pct_similar_age_3m pct_similar_age_journal_1m
1 99.69 99.75 90.35
pct_similar_age_journal_3m
1 91.50
Note that the names come out as variable_type, joined with _ instead of ., rather than the type.variable form you asked for.
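If you want the type.variable form from the question, one option is to rewrite the names afterwards. This is only a sketch in base R: `old` is a hypothetical stand-in for a few of the names shown above, and it assumes the variable part (count, rank, pct) never contains an underscore, so splitting at the first _ is safe.

```r
# Hypothetical sample of names in the variable_type form shown above
old <- c("count_all", "count_journal",
         "rank_similar_age_1m", "pct_similar_age_journal_3m")

# Split at the first "_" only: group 1 is the variable, group 2 is the type.
# Then emit them in the order type.variable.
new <- sub("^([^_]+)_(.*)$", "\\2.\\1", old)
new
```

On a real result you would apply the same `sub()` call to the names of the dcast output, leaving out the first column (the one named 1).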
Here's another way, which gives a named numeric vector (x is your data.frame):
y <- as.numeric(as.matrix(x[-1])) # flatten the data.frame
names(y) <- as.vector(outer(x[['type']], names(x)[-1], paste, sep='.'))
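Since the question asks for a single row rather than a vector, the named vector can be transposed into a one-row data.frame. A minimal sketch, assuming a two-row subset of the data from the question so the block runs on its own; `row1` is just an illustrative name.

```r
# Two-row subset of the question's data (values stored as strings, as in the original)
x <- data.frame(
  type  = c("journal", "all"),
  count = c("13972", "754555"),
  rank  = c("13759", "754043"),
  pct   = c("98.48", "99.93"),
  stringsAsFactors = FALSE
)

# Flatten column by column, then build matching type.variable names
y <- as.numeric(as.matrix(x[-1]))
names(y) <- as.vector(outer(x[["type"]], names(x)[-1], paste, sep = "."))

# t() turns the named vector into a 1 x n matrix whose column names are names(y)
row1 <- as.data.frame(t(y))
row1
```

Both `as.matrix()` and `outer()` are read in column-major order here, which is why the values and the names line up.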
Assuming you are OK with adding a dummy "id" variable for the reshaping, you can do this easily with base R also. Assuming your data.frame is called mydf:
mydf$id <- 1
(mydfw <- reshape(mydf, direction = "wide", idvar="id", timevar="type"))
# id count.journal rank.journal pct.journal count.all rank.all pct.all
# 1 1 13972 13759 98.48 754555 754043 99.93
# count.similar_age_1m rank.similar_age_1m pct.similar_age_1m
# 1 22408 22339 99.69
# count.similar_age_3m rank.similar_age_3m pct.similar_age_3m
# 1 56213 56074 99.75
# count.similar_age_journal_1m rank.similar_age_journal_1m
# 1 508 459
# pct.similar_age_journal_1m count.similar_age_journal_3m
# 1 90.35 1035
# rank.similar_age_journal_3m pct.similar_age_journal_3m
# 1 947 91.50
Cleanup is not too bad either, if you want to reorder your columns.
mydfw <- mydfw[, unlist(sapply(names(mydf), grep, names(mydfw)))]
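Two more cleanup steps may be worth considering: reshape names the columns variable.type rather than type.variable, and the values stay character because they are strings in the original data. The following is a rough sketch on a two-row subset of the question's data; it assumes the variable names contain no dot, and `vals` is just a helper name introduced here.

```r
# Two-row subset of the question's data, with the dummy id column
mydf <- data.frame(
  type  = c("journal", "all"),
  count = c("13972", "754555"),
  rank  = c("13759", "754043"),
  pct   = c("98.48", "99.93"),
  stringsAsFactors = FALSE
)
mydf$id <- 1
mydfw <- reshape(mydf, direction = "wide", idvar = "id", timevar = "type")

# Everything except the dummy id column
vals <- names(mydfw) != "id"

# Swap "variable.type" to "type.variable" by splitting at the first dot
names(mydfw)[vals] <- sub("^([^.]+)\\.(.*)$", "\\2.\\1", names(mydfw)[vals])

# The source values are strings, so convert them to numeric
mydfw[vals] <- lapply(mydfw[vals], as.numeric)
mydfw
```

If you do not need the id column afterwards, you can drop it with `mydfw$id <- NULL`.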
Here's a solution using expand.grid to get the names.
To get the data, first, subset to remove the first column which contains names. Then, transpose and convert to numeric.
> eg <- expand.grid(colnames(x[, -1]), x[, 1])
> setNames(as.numeric(t(x[, -1])), paste(eg[[2]], eg[[1]], sep="."))
journal.count journal.rank
13972.00 13759.00
journal.pct all.count
98.48 754555.00
all.rank all.pct
754043.00 99.93
similar_age_1m.count similar_age_1m.rank
22408.00 22339.00
similar_age_1m.pct similar_age_3m.count
99.69 56213.00
similar_age_3m.rank similar_age_3m.pct
56074.00 99.75
similar_age_journal_1m.count similar_age_journal_1m.rank
508.00 459.00
similar_age_journal_1m.pct similar_age_journal_3m.count
90.35 1035.00
similar_age_journal_3m.rank similar_age_journal_3m.pct
947.00 91.50