
Sort a data.frame by multiple columns whose names are contained in a single object?

Tags:

r

I want to sort a data.frame by multiple columns, ideally using base R without any external packages (though if necessary, so be it). Having read "How to sort a dataframe by column(s)?", I know I can accomplish this with the order() function as long as I either:

  1. Know the explicit names of each of the columns.
  2. Have a separate object representing each individual column by which to sort.

But what if I only have one vector containing multiple column names, and its length isn't known in advance?

Say the vector is called sortnames.

data[order(data[, sortnames]), ] won't work, because order() treats that as a single sorting argument.

data[order(data[, sortnames[1]], data[, sortnames[2]], ...), ] will work if and only if I specify the exact correct number of sortname values, which I won't know in advance.

Things I've looked at but not been totally happy with:

  1. eval(parse(text=paste("data[with(data, order(", paste(sortnames, collapse=","), ")), ]"))). Maybe this is fine, but I've seen plenty of hate for using eval(), so asking for alternatives seemed worthwhile.
  2. I may be able to use the Deducer library to do this with sortData(), but like I said, I'd rather avoid using external packages.

If I'm being too stubborn about not using external packages, let me know. I'll get over it. All ideas appreciated in advance!

MDe asked May 08 '13 13:05


1 Answer

You can use do.call:

data <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
sortnames <- c("a", "b")
# data[sortnames] is a list of columns; do.call() hands each one to order() as a separate argument
data[do.call("order", data[sortnames]), ]

This trick is useful when you want to pass multiple arguments to a function and those arguments are already stored in a list (a data.frame is itself a named list of columns).
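If you need this in more than one place, the same idea can be wrapped in a small function. The sketch below is only one way to do it: sort_by_names is a hypothetical helper name, not something from base R, and the seed and sample data are made up for illustration. It also calls unname() on the columns first, on the assumption that you'd rather not have a column named something like "decreasing" or "method" get matched to one of order()'s own arguments.

```r
# Hypothetical helper (not part of base R): sort a data.frame by the
# columns named in a character vector of any length.
sort_by_names <- function(df, cols, decreasing = FALSE) {
  stopifnot(is.data.frame(df), all(cols %in% names(df)))
  # unname() drops the column names so that a column called, say,
  # "decreasing" or "method" is not mistaken for an argument of order().
  keys <- unname(as.list(df[cols]))
  df[do.call(order, c(keys, list(decreasing = decreasing))), , drop = FALSE]
}

set.seed(1)  # arbitrary seed, only to make the example reproducible
data <- data.frame(a = sample(1:3, 10, replace = TRUE),  # repeated values, so b breaks ties
                   b = rnorm(10),
                   c = rnorm(10))
sortnames <- c("a", "b")

sorted <- sort_by_names(data, sortnames)

# Compare with writing the sort columns out by hand
identical(sorted, data[order(data$a, data$b), ])
```

Because the column names are only ever used to index the data.frame, nothing is parsed or evaluated as text, so this avoids the eval(parse()) approach from the question.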

mpiktas answered Sep 27 '22 20:09