
How to get all columns with the same column name in R at once?

Tags:

r

Let's say I have the following data frame:

> test <- cbind(test=c(1, 2, 3), test=c(1, 2, 3))
> test
     test test
[1,]    1    1
[2,]    2    2
[3,]    3    3

Now, from this data frame, I want to extract all the columns named "test" into a new data frame:

> new_df <- test[, "test"]

However, this attempt only fetches the first column called "test" from the test data frame:

> new_df
[1] 1 2 3

How can I get all of the columns called "test" in this example and put them into a new data frame in a single command? In my real data I have many columns with repeated column names, and I don't know the indices of the columns, so I can't select them by number.

MikeKatz45 asked Sep 23 '19 01:09


1 Answer

Having duplicate column names is not advisable for practical reasons. Still, we can compare the column names with == to get a logical vector and use it to extract the columns:

i1 <- colnames(test) == "test"
new_df <- test[, i1, drop = FALSE]
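To see why the logical index helps when the positions are unknown, here is a minimal sketch on a slightly larger matrix. The matrix m and the column name "other" are made up for illustration and are not part of the question's data.

```r
# Hypothetical matrix: two columns named "test" with an unrelated column between them
m <- cbind(test = c(1, 2, 3), other = c(4, 5, 6), test = c(7, 8, 9))

# TRUE wherever the column name matches, so no positions need to be known up front
i1 <- colnames(m) == "test"

# If the numeric positions are wanted after all, which() recovers them
pos <- which(i1)                                  # 1 3

# Both matching columns, still a matrix
sub <- m[, i1, drop = FALSE]

# drop = FALSE matters when only one column matches:
one <- m[, colnames(m) == "other", drop = FALSE]  # stays a 3 x 1 matrix
vec <- m[, colnames(m) == "other"]                # collapses to a plain vector
```

Without drop = FALSE, a name that happens to match a single column silently returns a vector, which is the same shape the question's test[, "test"] produced.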

Note that data.frame() by default does not keep duplicate column names: it makes them unique by appending .1, .2, etc. (as make.unique does). A matrix (the OP's dataset, since cbind returns a matrix here) does allow duplicate column names or row names, although this is not recommended.
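The difference between the two containers can be checked directly. This is a small sketch using the question's matrix; the names df_default and df_dup are chosen here for illustration. It relies on the check.names argument of data.frame(), which defaults to TRUE.

```r
# The object from the question: cbind() on vectors returns a matrix
test <- cbind(test = c(1, 2, 3), test = c(1, 2, 3))
is.matrix(test)                    # TRUE

# Default conversion: duplicated names are made unique
df_default <- data.frame(test)
names(df_default)                  # "test" "test.1"

# check.names = FALSE keeps the duplicated names as they are
df_dup <- data.frame(test, check.names = FALSE)
names(df_dup)                      # "test" "test"
```

Keeping duplicated names in a data frame is possible this way, but name-based access such as df_dup$test only reaches the first match, so doing the selection on the matrix first is usually the less surprising route.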


Also, if several different column names are repeated and you want each set of same-named columns as a separate dataset, use split:

lst1 <- lapply(split(seq_len(ncol(test)), colnames(test)), function(i)
            test[, i, drop = FALSE])

Or loop through the unique column names with lapply and do the == comparison for each one:

lst2 <- lapply(unique(colnames(test)), function(nm) 
             test[, colnames(test) == nm, drop = FALSE])
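The two list-building approaches can be compared on a matrix with more than one repeated name. This is a sketch only: the matrix m and its column names a, b and c are invented for illustration. One difference worth noting is that the split version returns a named list, while lapply over unique() does not, so the second variant below wraps the names in setNames() to keep them.

```r
# Hypothetical matrix with repeated names "a" and "b" and a single "c"
m <- cbind(a = 1:3, b = 4:6, a = 7:9, b = 10:12, c = 13:15)

# Approach 1: group the column positions by name, then subset once per group
lst1 <- lapply(split(seq_len(ncol(m)), colnames(m)), function(i)
            m[, i, drop = FALSE])

# Approach 2: one == comparison per unique name;
# setNames() makes the result a named list as well
nms <- unique(colnames(m))
lst2 <- lapply(setNames(nms, nms), function(nm)
            m[, colnames(m) == nm, drop = FALSE])

names(lst1)      # "a" "b" "c"
dim(lst1$a)      # 3 2
dim(lst2$c)      # 3 1
```

Each element can then be pulled out by name, for example lst1[["b"]] for the two columns called b.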

akrun answered Oct 18 '22 19:10