How can I order a dataframe by the second column in R? [duplicate]

Tags:

r

Possible Duplicate:
How to sort a dataframe by column(s) in R

I was wondering if someone could help me out. I have what I thought should be an easy problem to solve.

I have the table below:

SampleID    Cluster
R0132F041p  1
R0132F127   1
R0132F064   1
R0132F068p  1
R0132F015   2
R0132F094   3
R0132F105   1
R0132F013   2
R0132F114   1
R0132F014   2
R0132F039p  3
R0132F137   1
R0132F059   1
R0132F138p  2
R0132F038p  2

and I would like to sort/order it by Cluster to get the results as below:

SampleID    Cluster
R0132F041p  1
R0132F127   1
R0132F064   1
R0132F068p  1
R0132F105   1
R0132F114   1
R0132F137   1
R0132F059   1
R0132F015   2
R0132F013   2
R0132F014   2
R0132F138p  2
R0132F038p  2
R0132F094   3
R0132F039p  3

I have tried the following R code:

data<-read.table('Table.txt', header=TRUE,row.names=1,sep='\t')

data <- data.frame(data)
data <- data[order(data$Cluster),]
write.table(data, file = 'OrderedTable.txt', append = TRUE,quote=FALSE, sep = '\t', na ='NA', dec = '.', row.names = TRUE, col.names = FALSE)

and get the following output:

Why have the SampleIDs been replaced by the numbers 1-15, and what do these numbers represent? I have read the ?order page, but it seems to explain sort.list better than order(). If anyone could help me out on this, I would be very grateful.

asked Nov 14 '12 12:11

sinead

2 Answers

The short answer is that you did the ordering perfectly. You are just having some difficulty with reading and writing files. Going through your code:

data<-read.table('Table.txt', header=TRUE,row.names=1,sep='\t')

The above line is reading in your data fine, but the row.names=1 told it to use the first column as names for rows. So now your SampleIDs are row names instead of being their own column. If you type data or head(data) or str(data) immediately after running this line, this should be clear. Just omit that row.names argument and it will read properly.
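To make the effect of row.names=1 concrete, here is a minimal sketch. It assumes a tab-separated input with a header and uses a hypothetical three-row excerpt of the table, passed through the text argument so that no file is needed:

```r
# Hypothetical three-row excerpt of the question's table, tab-separated.
txt <- "SampleID\tCluster\nR0132F015\t2\nR0132F094\t3\nR0132F105\t1"

# With row.names = 1 the first column becomes the row names.
with_rn <- read.table(text = txt, header = TRUE, sep = "\t", row.names = 1)

# Without it, SampleID stays an ordinary column.
without_rn <- read.table(text = txt, header = TRUE, sep = "\t")

names(with_rn)      # only "Cluster"; the IDs are now in rownames(with_rn)
rownames(with_rn)
names(without_rn)   # "SampleID" and "Cluster"
```

Comparing names() of the two results shows where the SampleID column went.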

data <- data.frame(data)

You don't need the line above because read.table() already returns a data frame. You can see that with str(data) as well.

data <- data[order(data$Cluster),]

The above line is perfect.
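Since the question also asks what order() returns, a small illustration may help. The vector below is made up for the example; order() gives the positions of the elements in ascending order, and indexing with those positions rearranges the values (or, for a data frame, the rows):

```r
# Made-up cluster labels, not the question's data.
cluster <- c(1, 2, 3, 1)

idx <- order(cluster)   # positions in ascending order: 1 4 2 3
sorted <- cluster[idx]  # values rearranged: 1 1 2 3

idx
sorted
```

Ties keep their original relative order, which is why the rows within each cluster stay in the sequence they had in the input.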

write.table(data, file = 'OrderedTable.txt', append = TRUE, quote = FALSE, sep = '\t', na = 'NA', dec = '.', row.names = TRUE, col.names = FALSE)

Here you included the argument col.names = FALSE, which is why your file doesn't have column names. You also don't need or want append = TRUE. If you look at help(write.table), you see it is "only relevant if file is a character string". With append = TRUE, each rerun of the script adds the rows to the existing file instead of replacing it. Here it also seems to make the file write without ending the last line, which would likely cause any later read.table() to complain.

The numbers 1-15 in your result look like row numbers. You don't explain how you look at the resulting file, so I cannot be sure. You likely read your file in a way that doesn't parse the row.names and is showing row numbers instead. If you make certain your SampleIDs column does not get assigned to be names of rows, you'll probably be fine.
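Putting these points together, one possible corrected version of the script looks like the sketch below. It is an illustration under assumptions, not the only way to do it: the temporary files stand in for Table.txt and OrderedTable.txt, and the five input rows are a hypothetical excerpt of the question's data.

```r
# Temporary stand-ins for Table.txt and OrderedTable.txt (assumed file names).
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")

# Hypothetical excerpt of the question's table, tab-separated with a header.
writeLines(c("SampleID\tCluster",
             "R0132F041p\t1",
             "R0132F015\t2",
             "R0132F094\t3",
             "R0132F105\t1",
             "R0132F013\t2"), infile)

# Read without row.names so SampleID stays an ordinary column.
data <- read.table(infile, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Order the rows by Cluster.
data <- data[order(data$Cluster), ]

# Keep the header, drop the row names, and overwrite instead of appending.
write.table(data, file = outfile, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = TRUE)

readLines(outfile)
```

With row.names = FALSE the output contains only the SampleID and Cluster columns, so no row numbers can appear in the file.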

answered Sep 18 '22 21:09

MattBagg

Have a look at the arrange function of the plyr package.

library(plyr)
data <- arrange(data, Cluster)  # assign the result; arrange() does not modify data in place
write.table(data, "ordered_data.txt")

answered Sep 18 '22 21:09

Markus
