
How to sort a data frame by user-defined (e.g. non-alphabetic order) [duplicate]

Given a data frame dna

> dna
chrom   start
chr2    39482
chr1    203918
chr1    198282
chrX    7839028
chr17   3874

The following code reorders dna by $chrom in alphabetical ascending order and by $start in numerical ascending order:

> dna <- dna[with(dna, order(chrom, start)), ]
> dna
chrom   start
chr1    198282
chr1    203918
chr17   3874
chr2    39482
chrX    7839028

However, I would like $chrom to be sorted in the following custom order (simplified for the sake of this example):

chrom_order <- c("chr1","chr2", "chr17", "chrX")

I am not allowed to rename the values, for example chr1 to chr01.

asked Feb 04 '14 10:02 by biohazard


2 Answers

You need to convert chrom to a factor whose levels are in the order you want, and then index the data frame with order():

zz <- "chrom   start
chr2    39482
chr1    203918
chr1    198282
chrX    7839028
chr17   3874"
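# stringsAsFactors = TRUE keeps chrom a factor on R >= 4.0 (it was the default before)
Data <- read.table(text = zz, header = TRUE, stringsAsFactors = TRUE)

library(Hmisc)
library(gdata)  # gdata provides reorder.factor()

Data$chrom <- reorder.factor(Data$chrom, levels = c("chr1","chr2", "chr17", "chrX"))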

Data[order(Data$chrom), ]
  chrom   start
2  chr1  203918
3  chr1  198282
1  chr2   39482
5 chr17    3874
4  chrX 7839028  

Alternatively, base R's factor() does the same without extra packages:

> Data$chrom <- factor(Data$chrom, levels = c("chr1","chr2", "chr17", "chrX"))
> Data[order(Data$chrom), ]
  chrom   start
2  chr1  203918
3  chr1  198282
1  chr2   39482
5 chr17    3874
4  chrX 7839028

Or, with gdata loaded and chrom stored as a factor, use reorder() with new.order:

> Data$chrom <- reorder(Data$chrom, new.order=c("chr1","chr2", "chr17", "chrX"))
> Data[order(Data$chrom), ]
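
If you would rather leave chrom as a plain character column, a possible alternative (not part of the answer above) is to sort on each value's position in the desired order, looked up with base R's match(). The sketch below rebuilds the question's data by hand and also breaks ties by start, as the question asks; the name sorted is only illustrative.

```r
# Example data from the question, with chrom kept as character strings
dna <- data.frame(
  chrom = c("chr2", "chr1", "chr1", "chrX", "chr17"),
  start = c(39482L, 203918L, 198282L, 7839028L, 3874L),
  stringsAsFactors = FALSE
)
chrom_order <- c("chr1", "chr2", "chr17", "chrX")

# match() returns each chrom's position in chrom_order (1, 2, 3, ...);
# order() sorts by that position first and by start second.
sorted <- dna[order(match(dna$chrom, chrom_order), dna$start), ]
print(sorted)
```

Because nothing is converted, the column type and its values stay exactly as they were; only the row order changes.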
answered Nov 14 '22 23:11 by Prasanna Nandakumar


Try making chrom an ordered factor, then sort by chrom and start:

dna <- structure(list(chrom = structure(c(2L, 1L, 1L, 4L, 3L), .Label = c("chr1", 
"chr2", "chr17", "chrX"), class = c("ordered", "factor")), start = c(39482L, 
203918L, 198282L, 7839028L, 3874L)), .Names = c("chrom", "start"
), row.names = c(NA, -5L), class = "data.frame")

chrom_order <- c("chr1","chr2", "chr17", "chrX")

# Make the chrom column an ordered factor; the second argument defines the level order
dna$chrom <- ordered(dna$chrom, chrom_order)
dna[with(dna, order(chrom, start)),]

  chrom   start
3  chr1  198282
2  chr1  203918
1  chr2   39482
5 chr17    3874
4  chrX 7839028
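
One thing to watch with the factor approach: any value that is not listed in chrom_order becomes NA when the column is converted. The sketch below uses made-up data (dna2, with a hypothetical "chrM" entry that is not in the question) to show one way to detect such values and, as one possible choice, append them after the preferred order.

```r
chrom_order <- c("chr1", "chr2", "chr17", "chrX")

# Hypothetical data containing "chrM", which is absent from chrom_order
dna2 <- data.frame(
  chrom = c("chrX", "chrM", "chr1"),
  start = c(10L, 20L, 30L),
  stringsAsFactors = FALSE
)

# Values missing from the levels turn into NA on conversion
as_ordered <- ordered(dna2$chrom, levels = chrom_order)
unlisted <- unique(dna2$chrom[is.na(as_ordered)])
print(unlisted)

# One option: place unlisted values after the preferred order
all_levels <- c(chrom_order, setdiff(unique(dna2$chrom), chrom_order))
dna2$chrom <- ordered(dna2$chrom, levels = all_levels)
sorted2 <- dna2[with(dna2, order(chrom, start)), ]
print(sorted2)
```

Whether unlisted values should go last, be dropped, or raise an error depends on your data; the check on unlisted makes that decision explicit instead of silent.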
answered Nov 15 '22 00:11 by Mikko