I'm not even sure how to title the question properly!
Suppose I have a dataframe d:
Current dataframe:
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = c(1:10))
sample cat count
1 A k 1
2 B l 2
3 A m 3
4 B n 4
5 A o 5
6 B p 6
7 A q 7
8 B r 8
9 A s 9
10 B t 10
and I'm trying to re-arrange things such that each cat value becomes a column of its own, sample remains a column (or becomes the row name), and count will be the values in the new cat columns, with 0 where a sample doesn't have a count for a cat. Like so:
Desired dataframe layout:
sample k l m n o p q r s t
1 A 1 0 3 0 5 0 7 0 9 0
2 B 0 2 0 4 0 6 0 8 0 10
What's the best way to go about this?
This is as far as I've gotten:
for (i in unique(d$sample)) {
  s <- d[d$sample == i, ]
  st <- as.data.frame(t(s[, 3]))
  colnames(st) <- s$cat
  rownames(st) <- i
}
i.e. looping through the samples in the original data frame, and transposing for each sample subset. So in this case I get
k m o q s
A 1 3 5 7 9
and
l n p r t
B 2 4 6 8 10
And this is where I get stuck. I've tried a bunch of things with merge, bind, apply, ... but I can't seem to hit on the right thing. Plus, I can't help but wonder if that loop above is a necessary step at all - something with unstack perhaps?
Needless to say, I'm new to R... If someone can help me out, it would be greatly appreciated!
PS Reason I'm trying to re-arrange my dataframe is in the hopes of making plotting of the values easier (i.e. I want to show the actual df in a plot in table format).
Thank you!
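Use dcast from the reshape2 package. Naming value.var explicitly avoids relying on dcast guessing the value column:
> library(reshape2)
> dcast(d, sample ~ cat, value.var = "count", fill = 0)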
sample k l m n o p q r s t
1 A 1 0 3 0 5 0 7 0 9 0
2 B 0 2 0 4 0 6 0 8 0 10
xtabs from base R is another alternative:
> xtabs(count~sample+cat, d)
cat
sample k l m n o p q r s t
A 1 0 3 0 5 0 7 0 9 0
B 0 2 0 4 0 6 0 8 0 10
If you prefer the output to be a data.frame, then try:
> as.data.frame.matrix(xtabs(count~sample+cat, d))
k l m n o p q r s t
A 1 0 3 0 5 0 7 0 9 0
B 0 2 0 4 0 6 0 8 0 10
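The xtabs result above keeps sample as row names. If you would rather have sample as an ordinary column, as in the desired layout in the question, here is a minimal sketch; the name wide is just a placeholder chosen for this illustration:

```r
# Rebuild the example data from the question
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = 1:10)

# Cross-tabulate, then convert the table to a plain data.frame
wide <- as.data.frame.matrix(xtabs(count ~ sample + cat, d))

# Move the row names into a leading "sample" column and reset the row names
wide <- cbind(sample = rownames(wide), wide)
rownames(wide) <- NULL

wide
```

This only moves the labels around; the counts are the same as in the xtabs output.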
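Using reshape from base R:
nn <- reshape(d, timevar = "cat", idvar = "sample", direction = "wide")
names(nn)[-1] <- sub("^count\\.", "", names(nn)[-1])  # strip the "count." prefix reshape adds
nn[is.na(nn)] <- 0  # missing sample/cat combinations become 0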
> nn
sample k l m n o p q r s t
1 A 1 0 3 0 5 0 7 0 9 0
2 B 0 2 0 4 0 6 0 8 0 10