R re-arrange dataframe: some rows to columns

Tags: dataframe, r, reshape, tidyr

I'm not even sure how to title the question properly!

Suppose I have a dataframe d:

Current dataframe:

d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = c(1:10))

   sample cat count
1       A   k     1
2       B   l     2
3       A   m     3
4       B   n     4
5       A   o     5
6       B   p     6
7       A   q     7
8       B   r     8
9       A   s     9
10      B   t    10

and I'm trying to re-arrange things such that each cat value becomes a column of its own, sample remains a column (or becomes the row name), and count will be the values in the new cat columns, with 0 where a sample doesn't have a count for a cat. Like so:

Desired dataframe layout:

   sample   k   l   m   n   o   p   q   r   s   t
1       A   1   0   3   0   5   0   7   0   9   0
2       B   0   2   0   4   0   6   0   8   0  10

What's the best way to go about this?

This is as far as I've gotten:

for (i in unique(d$sample)) {
    s <- d[d$sample==i,]
    st <- as.data.frame(t(s[,3]))
    colnames(st) <- s$cat
    rownames(st) <- i
}

i.e. looping through the samples in the original data frame, and transposing for each sample subset. So in this case I get

   k m o q s
 A 1 3 5 7 9

and

   l n p r  t
 B 2 4 6 8 10

And this is where I get stuck. I've tried a bunch of things with merge, bind, apply,... but I can't seem to hit on the right thing. Plus, I can't help but wonder if that loop above is a necessary step at all - something with unstack perhaps?

Needless to say, I'm new to R... If someone can help me out, it would be greatly appreciated!

PS: The reason I'm trying to re-arrange my dataframe is to make plotting the values easier (i.e. I want to show the actual df as a table in a plot).

Thank you!

asked Oct 13 '13 14:10

crs

2 Answers

Use dcast from the reshape2 package:

> library(reshape2)
> dcast(d, sample ~ cat, value.var = "count", fill = 0)
  sample k l m n o p q r s  t
1      A 1 0 3 0 5 0 7 0 9  0
2      B 0 2 0 4 0 6 0 8 0 10

xtabs from base R is another option:

> xtabs(count~sample+cat, d)
      cat
sample  k  l  m  n  o  p  q  r  s  t
     A  1  0  3  0  5  0  7  0  9  0
     B  0  2  0  4  0  6  0  8  0 10

If you prefer the output to be a data.frame, then try:

> as.data.frame.matrix(xtabs(count~sample+cat, d))
  k l m n o p q r s  t
A 1 0 3 0 5 0 7 0 9  0
B 0 2 0 4 0 6 0 8 0 10
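
If sample should stay an ordinary column, as in the desired layout, rather than becoming the row names, the row names can be copied into a column. A minimal sketch in base R; the name wide is only illustrative and not part of the original answer:

```r
# Rebuild the example data from the question
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = 1:10)

# Cross-tabulate, then convert the table to a plain data.frame
wide <- as.data.frame.matrix(xtabs(count ~ sample + cat, d))

# Copy the row names (A, B) into a "sample" column and reset the row names
wide <- cbind(sample = rownames(wide), wide)
rownames(wide) <- NULL

wide
```

Combinations that never occur in d sum to 0 in the cross-tabulation, so no separate fill step is needed here.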

answered Oct 20 '22 00:10

Jilber Urbina

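Using reshape from base R:

# Spread each cat into its own column; the new columns are named count.k, count.l, ...
nn <- reshape(d, timevar = "cat", idvar = "sample", direction = "wide")
# Strip the "count." prefix so each column is named after its cat value
names(nn)[-1] <- sub("^count\\.", "", names(nn)[-1])
# sample/cat combinations missing from d come out as NA; replace them with 0
nn[is.na(nn)] <- 0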
> nn
  sample k l m n o p q r s  t
1      A 1 0 3 0 5 0 7 0 9  0
2      B 0 2 0 4 0 6 0 8 0 10
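
The question is also tagged tidyr, which none of the approaches above use. A possible equivalent is sketched here, assuming tidyr 1.1.0 or later is installed; pivot_wider and the scalar values_fill argument come from that package, not from the answers, and the name wide is only illustrative:

```r
library(tidyr)

# Rebuild the example data from the question
d <- data.frame(sample = LETTERS[1:2], cat = letters[11:20], count = 1:10)

# One row per sample, one column per cat value;
# sample/cat pairs that do not occur in d are filled with 0
wide <- pivot_wider(d, names_from = cat, values_from = count, values_fill = 0)

wide
```

The result is a tibble; as.data.frame(wide) converts it to a plain data.frame if that is preferred.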

answered Oct 20 '22 01:10
