Select the last n columns of data frame in R

Is there a way to systematically select the last columns of a data frame? I would like to be able to move the last columns to be the first columns, but maintain the order of the columns when they are moved. I need a way to do this that does not list all the columns using subset(data, select = c(all the columns listed in the new order)) because I will be using many different data frames.

Here's an example where I would like to move the last 2 columns to the front of the data frame. It works, but it's ugly.

A = rep("A", 5)
B = rep("B", 5)
num1 = c(1:5)
num2 = c(36:40)

mydata2 = data.frame(num1, num2, A, B)

# Move A and B to the front of mydata2
mydata2_move = data.frame(A = mydata2$A, B = mydata2$B, mydata2[, 1:(ncol(mydata2) - 2)])

#  A B num1 num2
#1 A B    1   36
#2 A B    2   37
#3 A B    3   38
#4 A B    4   39
#5 A B    5   40

Changing the number of columns in the original data frame causes issues. With mydata1, which has only one column ahead of A and B, the same approach runs (see below), but the naming gets thrown off. Why do these two examples behave differently? Is there a better way to do this, and to generalize it?

mydata1_move = data.frame(A = mydata1$A, B = mydata1$B, mydata1[, 1:(ncol(mydata1) - 2)])

#  A B mydata1...1..ncol.mydata1....2..
#1 A B                                1
#2 A B                                2
#3 A B                                3
#4 A B                                4
#5 A B                                5
Nancy asked Jan 19 '15 02:01

7 Answers

The problem described doesn't match the title, and the existing answers address moving columns without really explaining how to select the last N columns.

If you wanted to just select the last column in a matrix/data frame without knowing the column name:

mydata2[,ncol(mydata2)]

and if you want the last n columns, try

mydata2[, (ncol(mydata2) - n + 1):ncol(mydata2)]

A little cumbersome, but it works. You could write a wrapper function if you plan to use it regularly.
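A wrapper along those lines might look like the sketch below. The name last_n_cols is hypothetical (it does not come from this thread or from any package), and the example data simply recreates mydata2 from the question.

```r
# Hypothetical helper (the name is an assumption, not from the thread):
# return the last n columns of a data frame, always as a data frame.
last_n_cols <- function(x, n) {
  stopifnot(n >= 0, n <= ncol(x))
  # seq_len() avoids the 1:0 trap when n is 0; drop = FALSE keeps a
  # single selected column as a data frame with its name intact.
  x[, ncol(x) - n + seq_len(n), drop = FALSE]
}

# Recreate the question's example data
mydata2 <- data.frame(num1 = 1:5, num2 = 36:40,
                      A = rep("A", 5), B = rep("B", 5))

names(last_n_cols(mydata2, 2))  # "A" "B"
names(last_n_cols(mydata2, 1))  # "B"
```

The drop = FALSE detail is probably also what separates the two examples in the question. Assuming mydata1 has a single column ahead of A and B (as its printed output suggests), mydata1[, 1:1] returns a bare vector, so data.frame() has no column name to reuse and builds one from the expression instead.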

Spcogg answered Oct 02 '22 14:10


You could use something like this:

move_to_start <- function(x, to_move) {
  x[, c(to_move, setdiff(colnames(x), to_move))]
} 

move_to_start(mydata2, c('A', 'B'))

#   A B num1 num2
# 1 A B    1   36
# 2 A B    2   37
# 3 A B    3   38
# 4 A B    4   39
# 5 A B    5   40

Alternatively, if you want to move the last n columns to the start:

move_to_start <- function(x, n) {
  x[, c(tail(seq_len(ncol(x)), n), seq_len(ncol(x) - n))]
} 

move_to_start(mydata2, 2)

#   A B num1 num2
# 1 A B    1   36
# 2 A B    2   37
# 3 A B    3   38
# 4 A B    4   39
# 5 A B    5   40
jbaums answered Sep 29 '22 14:09


You can do a similar thing using the SOfun package, available on GitHub.

library(SOfun)

foo <- moveMe(colnames(mydata2), "A, B before num1")

mydata2[, foo]

#  A B num1 num2
#1 A B    1   36
#2 A B    2   37
#3 A B    3   38
#4 A B    4   39
#5 A B    5   40

You can move column names like this example from R Help.

x <- names(mtcars)

x
#[1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

moveMe(x, "hp first; cyl after drat; vs, am, gear before mpg; wt last")
#[1] "hp"   "vs"   "am"   "gear" "mpg"  "disp" "drat" "cyl"  "qsec" "carb" "wt" 
jazzurro answered Sep 30 '22 14:09


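Using the offset argument of last_col() inside select(), you can do that.

Below is an example for the last two columns, followed by a more generic version.

library(dplyr)

# Offsets run from larger to smaller so the moved columns keep their order.
# Note: newer tidyselect releases may require offset to be a single integer;
# a range such as last_col(n - 1):last_col() is an alternative there.
mydata <- mydata %>% select(last_col(offset = c(1, 0)), everything())

n <- 2
mydata <- mydata %>% select(last_col(offset = (n - 1):0), everything())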
user3352480 answered Oct 01 '22 14:10


Data frames are just lists, so you can rearrange them as you would any list:

# c() on data frames returns a plain list, so convert back with as.data.frame()
newdata <- as.data.frame(c(mydata[colNamesToStart],
                           mydata[!(names(mydata) %in% colNamesToStart)]))
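As a small illustration of the list view, the sketch below reorders columns with single-bracket indexing and no comma. The three-column mydata1 is an assumption modelled on the question's second example, and colNamesToStart is given an illustrative value.

```r
# Assumed three-column frame, modelled on the question's second example
mydata1 <- data.frame(num1 = 1:5, A = rep("A", 5), B = rep("B", 5))
colNamesToStart <- c("A", "B")

# Single-bracket indexing without a comma treats the data frame as a list
# of columns, so the result stays a data frame and names are preserved,
# even when only one column is left over.
reordered <- mydata1[c(colNamesToStart,
                       setdiff(names(mydata1), colNamesToStart))]

names(reordered)  # "A" "B" "num1"
```

Indexing this way avoids both the conversion back from a plain list and the dropped column name seen in the question.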
Jthorpe answered Oct 01 '22 14:10


Another alternative with dplyr:

mydata2 <- select(mydata, 2:ncol(mydata), 1)
# select all columns from col2 through the last col and place them before col1
Leonardo answered Sep 29 '22 14:09


I know this topic is a little dead, but wanted to chime in with a simple dplyr solution:

library(dplyr)

mydata <- mydata %>%
  select(A, B, everything())

If you want to avoid naming the last columns explicitly, use seq() within last_col(). Let n denote the number of columns to move to the front:

mydata <- mydata %>%
  select(
    last_col(seq(n - 1, 0)),
    everything()
  )
Dave Gruenewald answered Oct 03 '22 14:10