
R - How to rearrange rows in a data frame while maintaining their grouping?

Tags: r, group-by, dplyr

I have the following data frame. The rows are currently grouped by species: the sorgho group first, then the poacee group, and so on.

# create a dataset
specie <- c(rep("sorgho" , 3) , rep("poacee" , 3) , rep("banana" , 3) , rep("triticum" , 3) )
condition <- rep(c("normal" , "stress" , "Nitrogen") , 4)
value <- abs(rnorm(12 , 0 , 15))  # random draws: values differ between runs unless a seed is set
data <- data.frame(specie,condition,value)
print(data)

     specie condition     value
1    sorgho    normal 12.623696
2    sorgho    stress 11.394047
3    sorgho  Nitrogen  0.498003
4    poacee    normal 14.589322
5    poacee    stress 10.744153
6    poacee  Nitrogen  7.299742
7    banana    normal  9.845850
8    banana    stress  9.416088
9    banana  Nitrogen  4.178521
10 triticum    normal 13.230663
11 triticum    stress 30.658355
12 triticum  Nitrogen  9.402721

How can I rearrange these groups so that they are ordered by decreasing value in the Nitrogen condition, while the rows of each group stay together? I want the data frame to be reorganized to resemble this:

     specie condition     value
10 triticum    normal 13.230663
11 triticum    stress 30.658355
12 triticum  Nitrogen  9.402721
4    poacee    normal 14.589322
5    poacee    stress 10.744153
6    poacee  Nitrogen  7.299742
7    banana    normal  9.845850
8    banana    stress  9.416088
9    banana  Nitrogen  4.178521
1    sorgho    normal 12.623696
2    sorgho    stress 11.394047
3    sorgho  Nitrogen  0.498003
braun_tube asked Dec 03 '20 23:12



2 Answers

We can filter the 'Nitrogen' rows, arrange them by 'value' in descending order, extract 'specie', and use that vector as the factor levels when arranging the 'specie' column.

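library(dplyr)
# species ordered by their Nitrogen value, largest first
lvls <- data %>%
  filter(condition == 'Nitrogen') %>%
  arrange(desc(value)) %>%
  pull(specie)
# sort the full data by that species order
data %>%
  arrange(factor(specie, levels = lvls)) %>%
  as_tibble()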

Output:

# A tibble: 12 x 3
#   specie   condition  value
#   <chr>    <chr>      <dbl>
# 1 triticum normal    13.2  
# 2 triticum stress    30.7  
# 3 triticum Nitrogen   9.40 
# 4 poacee   normal    14.6  
# 5 poacee   stress    10.7  
# 6 poacee   Nitrogen   7.30 
# 7 banana   normal     9.85 
# 8 banana   stress     9.42 
# 9 banana   Nitrogen   4.18 
#10 sorgho   normal    12.6  
#11 sorgho   stress    11.4  
#12 sorgho   Nitrogen   0.498

Or, doing this in a single pipe:

data %>%
    arrange(factor(specie, levels = 
          unique(specie)[order(-value[condition == 'Nitrogen'])]))

Or using base R

data[order(with(data, factor(specie, levels = 
     unique(specie)[order(-value[condition == "Nitrogen"])]))),]
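Both one-liners above index unique(specie) by the order of the Nitrogen values. That works here because every specie has exactly one Nitrogen row and those rows appear in the same order as unique(specie). As a minimal sketch that does not rely on that alignment (still assuming one Nitrogen row per specie), the species names can be looked up directly with match(). The names nitro and sorted are only illustrative, and the data frame is rebuilt from the values printed in the question so the block runs on its own.

```r
# Example data, copied from the values printed in the question.
data <- data.frame(
  specie = rep(c("sorgho", "poacee", "banana", "triticum"), each = 3),
  condition = rep(c("normal", "stress", "Nitrogen"), times = 4),
  value = c(12.623696, 11.394047, 0.498003,
            14.589322, 10.744153, 7.299742,
            9.845850, 9.416088, 4.178521,
            13.230663, 30.658355, 9.402721),
  stringsAsFactors = FALSE
)

# Keep only the Nitrogen rows and rank the species by value, largest first.
nitro <- data[data$condition == "Nitrogen", ]
lvls <- nitro$specie[order(nitro$value, decreasing = TRUE)]

# match() gives every row the rank of its specie; order() leaves rows
# with the same rank (the same specie) in their original order.
sorted <- data[order(match(data$specie, lvls)), ]
print(sorted)
```

Because the lookup is by name, this version should still behave if the rows of data were shuffled before sorting, although the rows within each group would then keep their shuffled order.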

The data used above:

data <- structure(list(specie = c("sorgho", "sorgho", "sorgho", "poacee", 
"poacee", "poacee", "banana", "banana", "banana", "triticum", 
"triticum", "triticum"), condition = c("normal", "stress", "Nitrogen", 
"normal", "stress", "Nitrogen", "normal", "stress", "Nitrogen", 
"normal", "stress", "Nitrogen"), value = c(12.623696, 11.394047, 
0.498003, 14.589322, 10.744153, 7.299742, 9.84585, 9.416088, 
4.178521, 13.230663, 30.658355, 9.402721)), class = "data.frame",
row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"))
akrun answered Oct 30 '22 06:10


Another base R option (less elegant than @akrun's base R solution)

with(
  data,
  do.call(
    rbind,
    split(data, factor(specie, levels = unique(specie)))[order(-value[condition == "Nitrogen"])]
  )
)

which gives

              specie condition     value
triticum.10 triticum    normal 13.230663
triticum.11 triticum    stress 30.658355
triticum.12 triticum  Nitrogen  9.402721
poacee.4      poacee    normal 14.589322
poacee.5      poacee    stress 10.744153
poacee.6      poacee  Nitrogen  7.299742
banana.7      banana    normal  9.845850
banana.8      banana    stress  9.416088
banana.9      banana  Nitrogen  4.178521
sorgho.1      sorgho    normal 12.623696
sorgho.2      sorgho    stress 11.394047
sorgho.3      sorgho  Nitrogen  0.498003
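The row names in this output (triticum.10 and so on) appear because rbind() prefixes each row name with the name of the list element it came from. If plain sequential row names are preferred, one option is to reset them afterwards. This is a minimal sketch: res is an illustrative name, and the data frame is rebuilt from the values printed in the question so the block runs on its own.

```r
# Example data, copied from the values printed in the question.
data <- data.frame(
  specie = rep(c("sorgho", "poacee", "banana", "triticum"), each = 3),
  condition = rep(c("normal", "stress", "Nitrogen"), times = 4),
  value = c(12.623696, 11.394047, 0.498003,
            14.589322, 10.744153, 7.299742,
            9.845850, 9.416088, 4.178521,
            13.230663, 30.658355, 9.402721),
  stringsAsFactors = FALSE
)

# Same split/reorder/rbind approach as above, stored in a variable.
res <- with(
  data,
  do.call(
    rbind,
    split(data, factor(specie, levels = unique(specie)))[order(-value[condition == "Nitrogen"])]
  )
)

# Drop the "group.row" names and fall back to 1, 2, 3, ...
rownames(res) <- NULL
print(res)
```

Note that resetting the row names discards the original row numbers, so skip that step if they are needed to trace rows back to the input.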
ThomasIsCoding answered Oct 30 '22 06:10