Rearranging data frame columns in R (mutate, dplyr)

Tags: dataframe, r, dplyr

I have a data frame like so:

Type  Number  Species
A     1       G
A     2       R
A     7       Q
A     4       L
B     4       S
B     5       T
B     3       H
B     9       P
C     12      K
C     11      T
C     6       U
C     5       Q

I have grouped it with group_by(Type). My goal is to collapse each group so that Number holds the two largest values in that group, and a new column (Number_2) holds the two remaining values. The Species values that belong to the two smaller numbers should be dropped, so that Species in each row corresponds to the larger number in that row. I would like to use dplyr, and the final result would look like this:

Type  Number  Number_2  Species
A     7       1         Q
A     4       2         L
B     5       3         T
B     9       4         P
C     12      6         K
C     11      5         T

For now, the order of the values in Number_2 doesn't matter, as long as they stay within the same Type. I don't know whether this is possible, but if it is, does anyone know how to do it?
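For anyone who wants to run the answers, the input data frame shown at the top of the question can be recreated roughly as follows. The object name df is an assumption (the answers refer to the same data as df1, df or dat), and Number is stored as a plain numeric vector.

```r
# Recreate the example data from the question.
# The name `df` is an assumption; rename it to `df1` or `dat` to match a given answer.
df <- data.frame(
  Type    = rep(c("A", "B", "C"), each = 4),
  Number  = c(1, 2, 7, 4, 4, 5, 3, 9, 12, 11, 6, 5),
  Species = c("G", "R", "Q", "L", "S", "T", "H", "P", "K", "T", "U", "Q"),
  stringsAsFactors = FALSE
)
```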

thanks!


asked Jul 07 '15 14:07

user4999605

3 Answers

You can try

library(data.table)
setDT(df1)[order(-Number), list(Number1=Number[1:2], 
                                Number2=Number[3:4],
                                Species=Species[1:2]), keyby = Type]
 #   Type Number1 Number2 Species
 #1:    A       7       2       Q
 #2:    A       4       1       L
 #3:    B       9       4       P
 #4:    B       5       3       T
 #5:    C      12       6       K
 #6:    C      11       5       T

Or using dplyr with do

library(dplyr)
df1 %>%
  group_by(Type) %>%
  arrange(desc(Number)) %>%
  do(data.frame(Type = .$Type[1L],
                Number1 = .$Number[1:2],
                Number2 = .$Number[3:4],
                Species = .$Species[1:2], stringsAsFactors = FALSE))
 #   Type Number1 Number2 Species
 #1    A       7       2       Q
 #2    A       4       1       L
 #3    B       9       4       P
 #4    B       5       3       T
 #5    C      12       6       K
 #6    C      11       5       T
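In recent dplyr versions do() is marked as superseded, so the same idea might be written with reframe() instead. This is only a sketch: it assumes dplyr 1.1.0 or later (where reframe() is available) and exactly four rows per Type, as in the example. The data are rebuilt here so the snippet runs on its own, and the result name res is made up for illustration.

```r
library(dplyr)

# Example data from the question, named df1 as in this answer.
df1 <- data.frame(
  Type    = rep(c("A", "B", "C"), each = 4),
  Number  = c(1, 2, 7, 4, 4, 5, 3, 9, 12, 11, 6, 5),
  Species = c("G", "R", "Q", "L", "S", "T", "H", "P", "K", "T", "U", "Q"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  arrange(Type, desc(Number)) %>%    # largest values first within each Type
  group_by(Type) %>%
  reframe(Number1 = Number[1:2],     # two largest values
          Number2 = Number[3:4],     # two remaining values
          Species = Species[1:2])    # Species of the two largest values

res
```

If a group has fewer than four rows, Number[3:4] is padded with NA, so it may be worth checking group sizes first.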

answered Oct 17 '22 12:10

akrun

Here's a different dplyr approach.

library(dplyr)

# Start creating the data set with top 2 values and store as df1:
df1 <- df %>% 
  group_by(Type) %>%
  top_n(2, Number) %>%
  ungroup() %>%
  arrange(Type, Number)

# Then anti-join to get the rows that are not in the top 2, arrange them, keep only
# the Number column (renamed to Number2), and cbind it to df1:
out <- df %>%
  anti_join(df1, c("Type","Number")) %>%
  arrange(Type, Number) %>%
  select(Number2 = Number) %>%
  cbind(df1, .)

This results in:

> out
#  Type Number Species Number2
#1    A      4       L       1
#2    A      7       Q       2
#3    B      5       T       3
#4    B      9       P       4
#5    C     11       T       5
#6    C     12       K       6
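Since top_n() has been superseded in newer dplyr releases, the same approach could be sketched with slice_max(). This assumes dplyr 1.0.0 or later and the example data with four rows per Type. The names top and rest are introduced here for illustration only, and with_ties = FALSE is my choice to guarantee exactly two rows per group.

```r
library(dplyr)

# Example data from the question, named df as in this answer.
df <- data.frame(
  Type    = rep(c("A", "B", "C"), each = 4),
  Number  = c(1, 2, 7, 4, 4, 5, 3, 9, 12, 11, 6, 5),
  Species = c("G", "R", "Q", "L", "S", "T", "H", "P", "K", "T", "U", "Q"),
  stringsAsFactors = FALSE
)

# Two largest Number values per Type.
top <- df %>%
  group_by(Type) %>%
  slice_max(Number, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(Type, Number)

# Everything that is not in the top two, in the same order.
rest <- df %>%
  anti_join(top, by = c("Type", "Number")) %>%
  arrange(Type, Number)

# Attach the remaining values as Number2.
out <- top %>%
  mutate(Number2 = rest$Number)

out
```

Note that the anti-join matches on Type and Number, so a duplicated Number within a Type would drop both copies from rest. That does not occur in the example data.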

answered Oct 17 '22 12:10

talat

This could be another option using ddply

library(plyr)
# Sort by Number first; dat[order(dat$Number), ] also works for a plain data frame
ddply(dat[order(dat$Number), ], .(Type), summarize,
      Number1 = Number[4:3],  Number2 = Number[2:1], Species = Species[4:3])

#  Type Number1 Number2 Species
#1    A       7       2       Q
#2    A       4       1       L
#3    B       9       4       P
#4    B       5       3       T
#5    C      12       6       K
#6    C      11       5       T

answered Oct 17 '22 12:10

Veerendra Gadekar
