
R: Sort a string of items alphabetically [duplicate]

Tags:

r

I have a data frame in which one column contains strings of comma-separated items, like this:

[1] "A, D, B, C"
[2] "D, A, B, C"
[3] "B, A, C, D"
etc...

Is there a way to sort the items within each string, so that I get something like this?

"A, B, C, D"
"A, B, C, D"
"A, B, C, D"

I am close with the following:

library(gtools)
df$col <- sapply(df$col , function (x)
    mixedsort(strsplit(paste(x, collapse = ','), ',')[[1]]))

But this returns the results as a list, so I can't do any manipulations on the output in dplyr (like group_by).

Jason J asked Nov 16 '17 19:11



1 Answer

x = c("a, b, c, d", "d, a, b, c", "b, a, c, d")
# For each string: split on commas, trim whitespace, sort, and re-join.
y = unname(sapply(x, function(s) {
    paste(sort(trimws(strsplit(s, ',')[[1]])), collapse=',')} ))
y

[1] "a,b,c,d" "a,b,c,d" "a,b,c,d"

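strsplit() splits each string on commas. trimws() strips the whitespace left around each piece, so that sort() compares the items themselves rather than their leading spaces. sort() then orders the pieces alphabetically, and paste(..., collapse = ',') joins the sorted vector back into a single string. Because the function returns one string per input, sapply() produces a character vector rather than a list.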
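To apply the same idea to the data frame column from the question, something like the following may work. This is only a sketch: the data frame is made up for illustration, the helper name sort_items is hypothetical, and it assumes the items should be re-joined with ", " as in the question's desired output. vapply() is used in place of sapply() so that the result is guaranteed to be a plain character vector.

```r
# Made-up data frame; the column name `col` follows the question.
df <- data.frame(col = c("A, D, B, C", "D, A, B, C", "B, A, C, D"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: sort the comma-separated items inside one string.
sort_items <- function(s, sep = ", ") {
  # Split on literal commas, then drop the surrounding whitespace.
  parts <- trimws(strsplit(s, ",", fixed = TRUE)[[1]])
  # Sort the pieces and join them back into a single string.
  paste(sort(parts), collapse = sep)
}

# character(1) tells vapply() to expect exactly one string per element.
df$col <- vapply(df$col, sort_items, character(1), USE.NAMES = FALSE)
df$col
```

Since df$col is an ordinary character column afterwards, it should be usable in dplyr verbs such as group_by.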

spinodal answered Oct 24 '22 15:10
