Order data frame rows according to vector with specific order

People also ask

How do I sort a specific order in R?

order() in R The numbers are ordered according to its index by using order(x) . Here the order() will sort the given numbers according to its index in the ascending order. Since number 2 is the smallest, which has an index as five and number 4 is index 1, and similarly, the process moves forward in the same pattern.

How do you reorder data sets?

Open the Status Definition window and click the Data Set tab. On the Data Set tab, highlight (or double-click) the row number of the element you want to reorder. Enter the new row number the element should occupy. Repeat step 2 for any other data elements you want to reorder.

Try match:

df <- data.frame(name=letters[1:4], value=c(rep(TRUE, 2), rep(FALSE, 2)))
target <- c("b", "c", "a", "d")
df[match(target, df$name),]

  name value
2    b  TRUE
3    c FALSE
1    a  TRUE
4    d FALSE

It will work as long as your target contains exactly the same elements as df$name, and neither contain duplicate values.

From ?match:

match returns a vector of the positions of (first) matches of its first argument 
in its second.

Therefore match finds the row numbers that matches target's elements, and then we return df in that order.

I prefer to use ***_join in dplyr whenever I need to match data. One possible try for this

left_join(data.frame(name=target),df,by="name")

Note that the input for ***_join require tbls or data.frame

We can adjust the factor levels based on target and use it in arrange

library(dplyr)
df %>% arrange(factor(name, levels = target))

#  name value
#1    b  TRUE
#2    c FALSE
#3    a  TRUE
#4    d FALSE

Or order it and use it in slice

df %>% slice(order(factor(name, levels = target)))

This method is a bit different, it provided me with a bit more flexibility than the previous answer. By making it into an ordered factor, you can use it nicely in arrange and such. I used reorder.factor from the gdata package.

df <- data.frame(name=letters[1:4], value=c(rep(TRUE, 2), rep(FALSE, 2)))
target <- c("b", "c", "a", "d")

require(gdata)
df$name <- reorder.factor(df$name, new.order=target)

Next, use the fact that it is now ordered:

require(dplyr)
df %>%
  arrange(name)
    name value
1    b  TRUE
2    c FALSE
3    a  TRUE
4    d FALSE

If you want to go back to the original (alphabetic) ordering, just use as.character() to get it back to the original state.

If you don't want to use any libraries and you have reoccurrences in your data, you can use which with sapply as well.

new_order <- sapply(target, function(x,df){which(df$name == x)}, df=df)
df        <- df[new_order,]

Here's a similar system for the situation where you have a variable you want to sort by, initially, but then you want to sort by a secondary variable according to the order that this secondary variable first appears in the initial sort.

In the function below, the initial sort variable is called order_by and the secondary variable is called order_along - as in "order by this variable along its initial order".

library(dplyr, warn.conflicts = FALSE)
df <- structure(
  list(
    msoa11hclnm = c(
      "Bewbush", "Tilgate", "Felpham",
      "Selsey", "Brunswick", "Ratton", "Ore", "Polegate", "Mile Oak",
      "Upperton", "Arundel", "Kemptown"
    ),
    lad20nm = c(
      "Crawley", "Crawley",
      "Arun", "Chichester", "Brighton and Hove", "Eastbourne", "Hastings",
      "Wealden", "Brighton and Hove", "Eastbourne", "Arun", "Brighton and Hove"
    ),
    shape_area = c(
      1328821, 3089180, 3540014, 9738033, 448888, 10152663, 5517102,
      7036428, 5656430, 2653589, 72832514, 826151
    )
  ),
  row.names = c(NA, -12L), class = "data.frame"
)

this does not give me what I need:

df %>% 
  dplyr::arrange(shape_area, lad20nm)
#>    msoa11hclnm           lad20nm shape_area
#> 1    Brunswick Brighton and Hove     448888
#> 2     Kemptown Brighton and Hove     826151
#> 3      Bewbush           Crawley    1328821
#> 4     Upperton        Eastbourne    2653589
#> 5      Tilgate           Crawley    3089180
#> 6      Felpham              Arun    3540014
#> 7          Ore          Hastings    5517102
#> 8     Mile Oak Brighton and Hove    5656430
#> 9     Polegate           Wealden    7036428
#> 10      Selsey        Chichester    9738033
#> 11      Ratton        Eastbourne   10152663
#> 12     Arundel              Arun   72832514

Here’s a function:

order_along <- function(df, order_along, order_by) {
  cols <- colnames(df)
  
  df <- df %>%
    dplyr::arrange({{ order_by }})
  
  df %>% 
    dplyr::select({{ order_along }}) %>% 
    dplyr::distinct() %>% 
    dplyr::full_join(df) %>% 
    dplyr::select(dplyr::all_of(cols))
  
}

order_along(df, lad20nm, shape_area)
#> Joining, by = "lad20nm"
#>    msoa11hclnm           lad20nm shape_area
#> 1    Brunswick Brighton and Hove     448888
#> 2     Kemptown Brighton and Hove     826151
#> 3     Mile Oak Brighton and Hove    5656430
#> 4      Bewbush           Crawley    1328821
#> 5      Tilgate           Crawley    3089180
#> 6     Upperton        Eastbourne    2653589
#> 7       Ratton        Eastbourne   10152663
#> 8      Felpham              Arun    3540014
#> 9      Arundel              Arun   72832514
#> 10         Ore          Hastings    5517102
#> 11    Polegate           Wealden    7036428
#> 12      Selsey        Chichester    9738033

^{Created on 2021-01-12 by the reprex package (v0.3.0)}

Related questions
                            
                                Filter data.frame rows by a logical condition
                            
                                R memory management / cannot allocate vector of size n Mb
                            
                                Force the origin to start at 0
                            
                                What does %>% mean in R [duplicate]
                            
                                Convert data.frame column format from character to factor
                            
                                Error in if/while (condition) {: missing Value where TRUE/FALSE needed
                            
                                How do I clear only a few specific objects from the workspace?
                            
                                Replace all 0 values to NA
                            
                                What's the difference between `1L` and `1`?
                            
                                Find file name from full file path
                            
                                Repeat each row of data.frame the number of times specified in a column
                            
                                What do hjust and vjust do when making a plot using ggplot?
                            
                                Most underused data visualization [closed]
                            
                                R: += (plus equals) and ++ (plus plus) equivalent from c++/c#/java, etc.?
                            
                                How to count TRUE values in a logical vector
                            
                                Importing data from a JSON file into R [duplicate]
                            
                                How to remove all whitespace from a string?
                            
                                Fastest way to find second (third...) highest/lowest value in vector or column
                            
                                Problems installing the devtools package
                            
                                Difference between R MarkDown and R NoteBook

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Order data frame rows according to vector with specific order

Tags:

sorting

dataframe

r

People also ask

Recent Activity

Donate For Us