I have these data:
structure(list(type = c("journal", "all", "similar_age_1m", "similar_age_3m",
"similar_age_journal_1m", "similar_age_journal_3m"), count = c("13972",
"754555", "22408", "56213", "508", "1035"), rank = c("13759",
"754043", "22339", "56074", "459", "947"), pct = c("98.48", "99.93",
"99.69", "99.75", "90.35", "91.50")), .Names = c("type", "count",
"rank", "pct"), row.names = c(NA, -6L), class = "data.frame")
I'd like to turn it into a single row, with the names of columns 2:4 prefixed by the corresponding type, e.g. journal.count, journal.rank, ... What is the fastest way to do this? For some reason dcast and reshape are not doing it for me, and my solution is a little too cumbersome.
You mentioned reshape2, so here is a way with that:
library("reshape2")
# dat is the data.frame from the question; melt to long form, then cast to one wide row
dcast(melt(dat, id.vars="type"), 1~variable+type)
That gives:
1 count_all count_journal count_similar_age_1m count_similar_age_3m
1 1 754555 13972 22408 56213
count_similar_age_journal_1m count_similar_age_journal_3m rank_all
1 508 1035 754043
rank_journal rank_similar_age_1m rank_similar_age_3m
1 13759 22339 56074
rank_similar_age_journal_1m rank_similar_age_journal_3m pct_all pct_journal
1 459 947 99.93 98.48
pct_similar_age_1m pct_similar_age_3m pct_similar_age_journal_1m
1 99.69 99.75 90.35
pct_similar_age_journal_3m
1 91.50
The type and variable are separated with _ instead of ., though.
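If you want the . separator, one option is to rename the columns afterwards. This is only a minimal sketch, assuming the column names look like the ones printed above; nms is a hand-typed sample of those names, not the dcast result itself.

```r
# A hand-typed sample of the names produced by dcast above (illustrative only)
nms <- c("count_all", "count_journal", "count_similar_age_1m",
         "pct_similar_age_journal_3m")

# sub() replaces only the first match, so the underscores inside the
# type (e.g. similar_age_1m) are left alone
dotted <- sub("_", ".", nms, fixed = TRUE)
print(dotted)
```

On the real result this would be something like names(res) <- sub("_", ".", names(res), fixed = TRUE), where res is a hypothetical name for the dcast output.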
Here's another way:
y <- as.numeric(as.matrix(x[-1])) # flatten the data.frame
names(y) <- as.vector(outer(x[['type']], names(x)[-1], paste, sep='.'))
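Note that y is a named numeric vector rather than a data.frame. If you need an actual one-row data.frame, the following sketch may help. It assumes the question's data are stored as x (rebuilt here so the example runs on its own); one_row is just an illustrative name.

```r
# The question's data, assumed to be stored as `x`
x <- data.frame(
  type  = c("journal", "all", "similar_age_1m", "similar_age_3m",
            "similar_age_journal_1m", "similar_age_journal_3m"),
  count = c("13972", "754555", "22408", "56213", "508", "1035"),
  rank  = c("13759", "754043", "22339", "56074", "459", "947"),
  pct   = c("98.48", "99.93", "99.69", "99.75", "90.35", "91.50"),
  stringsAsFactors = FALSE
)

# Flatten column by column and build matching type.variable names
y <- as.numeric(as.matrix(x[-1]))
names(y) <- as.vector(outer(x[["type"]], names(x)[-1], paste, sep = "."))

# t() turns the named vector into a 1 x 18 matrix whose column names
# are the vector's names; as.data.frame() keeps them
one_row <- as.data.frame(t(y))
print(dim(one_row))
print(one_row[["journal.count"]])
```

The columns come out grouped by variable (all the counts, then the ranks, then the pcts), because the matrix is flattened column by column.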
Assuming you are OK with adding a dummy "id" variable for the reshaping (with type playing the role of the "time" variable), you can do this easily with base R also. Assuming your data.frame is called mydf:
mydf$id <- 1
(mydfw <- reshape(mydf, direction = "wide", idvar="id", timevar="type"))
# id count.journal rank.journal pct.journal count.all rank.all pct.all
# 1 1 13972 13759 98.48 754555 754043 99.93
# count.similar_age_1m rank.similar_age_1m pct.similar_age_1m
# 1 22408 22339 99.69
# count.similar_age_3m rank.similar_age_3m pct.similar_age_3m
# 1 56213 56074 99.75
# count.similar_age_journal_1m rank.similar_age_journal_1m
# 1 508 459
# pct.similar_age_journal_1m count.similar_age_journal_3m
# 1 90.35 1035
# rank.similar_age_journal_3m pct.similar_age_journal_3m
# 1 947 91.50
Cleanup is not too bad either, if you want to reorder your columns.
mydfw <- mydfw[, unlist(sapply(names(mydf), grep, names(mydfw)))]
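reshape names the columns variable.type (count.journal), while the question asks for type.variable (journal.count). One possible way to flip them is a regular expression on the names. This is a sketch only, assuming neither the variable names nor the types contain a dot; wide_names is a hand-typed sample of the names shown above.

```r
# A hand-typed sample of the column names produced by reshape above
wide_names <- c("id", "count.journal", "rank.journal",
                "pct.similar_age_journal_3m")

# Capture the part before the first dot and the part after it, then
# swap them; names without a dot (such as id) are left unchanged
swapped <- sub("^([^.]+)\\.(.+)$", "\\2.\\1", wide_names)
print(swapped)
```

Applied to the real result, that would be names(mydfw) <- sub("^([^.]+)\\.(.+)$", "\\2.\\1", names(mydfw)).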
Here's a solution using expand.grid to get the names.
To get the data, first drop the first column, which contains the type names. Then transpose and convert to numeric.
> eg <- expand.grid(colnames(x[, -1]), x[, 1])
> setNames(as.numeric(t(x[, -1])), paste(eg[[2]], eg[[1]], sep="."))
journal.count journal.rank
13972.00 13759.00
journal.pct all.count
98.48 754555.00
all.rank all.pct
754043.00 99.93
similar_age_1m.count similar_age_1m.rank
22408.00 22339.00
similar_age_1m.pct similar_age_3m.count
99.69 56213.00
similar_age_3m.rank similar_age_3m.pct
56074.00 99.75
similar_age_journal_1m.count similar_age_journal_1m.rank
508.00 459.00
similar_age_journal_1m.pct similar_age_journal_3m.count
90.35 1035.00
similar_age_journal_3m.rank similar_age_journal_3m.pct
947.00 91.50