
str_sort within a column while preserving row order in the data.frame [duplicate]

Tags:

sorting

r

stringr

I'm trying to sort the comma-separated names inside each string of the 'specs' column. Whenever I use str_sort (stringr), it reorders the column as a whole rather than the names within each string, so the row structure is not preserved. The 'sorted' column below is the result of the following code:

nest_use %>%
  mutate(sorted = str_sort(specs))
    nest  days Date         age specs            no_specs sorted          
   <int> <int> <chr>      <int> <chr>               <dbl> <chr>           
 1   595    86 2020:07:03    80 arlo, bird              2 arlo, bird      
 2   595    86 2020:08:05    80 tato, bird              2 arlo, bird      
 3   595    86 2020:08:22    80 arlo, unk               2 arlo, bird      
 4   595    86 2020:09:11    80 unk, glor               2 arlo, bird      
 5   595    86 2020:09:19    80 glor, unk               2 arlo, bird      
 6   595    86 2020:10:14    80 glor, unk               2 arlo, bird      
 7   595    86 2020:10:16    80 tado, arlo, glor        3 arlo, bird      
 8   595    86 2020:10:19    80 glor, unk               2 arlo, bird, glor
 9   595    86 2020:10:20    80 unk, glor               2 arlo, bird, tado
10   595    86 2020:10:22    80 glor, arlo, bird        3 arlo, corvid    
# ... with 93 more rows

What I would like to see is the following output as a data.frame where the strings in 'specs' are sorted and the order of rows is preserved:

    nest  days Date         age specs            no_specs sorted          
   <int> <int> <chr>      <int> <chr>               <dbl> <chr>           
 1   595    86 2020:07:03    80 arlo, bird              2 arlo, bird      
 2   595    86 2020:08:05    80 tato, bird              2 bird, tato      
 3   595    86 2020:08:22    80 arlo, unk               2 arlo, unk      
 4   595    86 2020:09:11    80 unk, glor               2 glor, unk      
 5   595    86 2020:09:19    80 glor, unk               2 glor, unk      
 6   595    86 2020:10:14    80 glor, unk               2 glor, unk      
 7   595    86 2020:10:16    80 tado, arlo, glor        3 arlo, glor, tado      
 8   595    86 2020:10:19    80 glor, unk               2 glor, unk
 9   595    86 2020:10:20    80 unk, glor               2 glor, unk
10   595    86 2020:10:22    80 glor, arlo, bird        3 arlo, bird, glor    
# ... with 93 more rows

I've searched for quite a while and have not quite found the solution for this issue.

Jason P asked Nov 16 '25 13:11


1 Answer

If I understand correctly what you're looking for, you first need to split the source string.
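
To see why the split is needed, here is a minimal sketch on a shortened copy of the data (the three strings are taken from the question; the variable names are only illustrative). str_sort() treats each whole string as one element of the vector, so on its own it moves values between rows and leaves the text inside each string untouched.

```r
library(stringr)

specs <- c("tato, bird", "arlo, unk", "unk, glor")

# Sorting the vector reorders the rows; "tato, bird" stays "tato, bird".
whole_vector <- str_sort(specs)
whole_vector
#> [1] "arlo, unk"  "tato, bird" "unk, glor"

# Splitting first gives one character vector per row, which can then be sorted.
pieces <- str_split(specs, ", ")
within_first <- str_sort(pieces[[1]])
within_first
#> [1] "bird" "tato"
```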

You could apply the following to your specs column. For each row, it:

  1. splits the string on ", "
  2. sorts the resulting elements
  3. collapses them back into a single string, joined with ", "

library("dplyr", warn.conflicts = FALSE)
specs <- c("arlo, bird",
           "tato, bird",
           "arlo, unk",
           "unk, glor",
           "glor, unk",
           "glor, unk",
           "tado, arlo, glor",
           "glor, unk",
           "unk, glor",
           "glor, arlo, bird")

purrr::map_chr(stringr::str_split(specs, ", "),
               .f = function(x) {
                 x %>%
                   stringr::str_sort() %>%
                   stringr::str_c(collapse = ", ")
               })
#>  [1] "arlo, bird"       "bird, tato"       "arlo, unk"        "glor, unk"       
#>  [5] "glor, unk"        "glor, unk"        "arlo, glor, tado" "glor, unk"       
#>  [9] "glor, unk"        "arlo, bird, glor"

Created on 2022-01-10 by the reprex package (v2.0.1)

Starting from your data frame, the following should achieve what you're looking for:


nest_use %>%
  mutate(sorted = purrr::map_chr(
    stringr::str_split(specs, ", "),
    .f = function(x) {
      x %>%
        stringr::str_sort() %>%
        stringr::str_c(collapse = ", ")
    }))
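
If you would rather not depend on purrr, the same split, sort, collapse idea can be sketched in base R. This is only an alternative under a few assumptions: sort_within is a hypothetical helper name, the separator is always exactly ", ", and missing values are not handled specially. Base sort() follows the session's locale, whereas str_sort() uses its own locale argument, so the two may order mixed-case or accented strings differently.

```r
# Hypothetical helper, not part of stringr or any package.
sort_within <- function(x, sep = ", ") {
  # One character vector of elements per input string.
  parts <- strsplit(x, sep, fixed = TRUE)
  # Sort each vector and glue it back together; the result has the same
  # length and order as x, so rows stay aligned.
  vapply(parts, function(p) paste(sort(p), collapse = sep), character(1))
}

specs <- c("tato, bird", "unk, glor", "tado, arlo, glor")
sorted <- sort_within(specs)
sorted
#> [1] "bird, tato"       "glor, unk"        "arlo, glor, tado"
```

With your data frame, this could be used as nest_use$sorted <- sort_within(nest_use$specs), or inside mutate() in the same way as above.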

giocomai answered Nov 19 '25 06:11


