How to remove columns with the same value in R

Tags:

r

In short:

I want to drop the constant columns from my table (the original post illustrated this with an image, which is not reproduced here).

Explanation:

I have a big table with 20,000 x 1,200 items. I want to remove all the columns in which every value is the same from top to bottom. The remaining columns should keep their variable names (V2 in the example) so that later I can figure out which ones were removed.

538

asked May 30 '15 09:05

Dev

2 Answers

Just use vapply to go through and check how many unique values there are in each column:

Sample data:

mydf <- data.frame(v1 = 1:4, v2 = 5:8,
                   v3 = 2, v4 = 9:12, v5 = 1)
mydf
##   v1 v2 v3 v4 v5
## 1  1  5  2  9  1
## 2  2  6  2 10  1
## 3  3  7  2 11  1
## 4  4  8  2 12  1

What we will be doing with vapply:

vapply(mydf, function(x) length(unique(x)) > 1, logical(1L))
#    v1    v2    v3    v4    v5 
#  TRUE  TRUE FALSE  TRUE FALSE

Keep the columns you want:

mydf[vapply(mydf, function(x) length(unique(x)) > 1, logical(1L))]
#   v1 v2 v4
# 1  1  5  9
# 2  2  6 10
# 3  3  7 11
# 4  4  8 12
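
Since the question also asks which columns were removed, here is a minimal sketch of one way to record them. It assumes the same `mydf` sample data; the names `keep`, `dropped`, and `reduced` are only illustrative.

```r
# Sample data repeated so the sketch is self-contained
mydf <- data.frame(v1 = 1:4, v2 = 5:8,
                   v3 = 2, v4 = 9:12, v5 = 1)

# TRUE for columns with more than one distinct value
keep <- vapply(mydf, function(x) length(unique(x)) > 1, logical(1L))

# Names of the constant columns that will be dropped
dropped <- names(mydf)[!keep]
dropped
# [1] "v3" "v5"

# The surviving columns keep their original names
reduced <- mydf[keep]
names(reduced)
# [1] "v1" "v2" "v4"
```

Note that `unique()` treats `NA` as a value of its own, so a column that is constant apart from some missing values would be kept by this check.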

197

answered Sep 30 '22 21:09

A5C1D2H2I1M1N2O1R2T1

In case someone wants to do this with dplyr, here is yet another way to do it:

library(dplyr)
mydf %>% select(where(~n_distinct(.) > 1))
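
For reference, a small self-contained sketch of the same dplyr approach applied to the sample data from the other answer, including a way to list the removed columns. The name `reduced` is only illustrative, and this assumes a dplyr version that provides `where()` (1.0.0 or later, as far as I know).

```r
library(dplyr)

# Sample data from the other answer, repeated so this runs on its own
mydf <- data.frame(v1 = 1:4, v2 = 5:8,
                   v3 = 2, v4 = 9:12, v5 = 1)

# Keep only the columns with more than one distinct value
reduced <- mydf %>% select(where(~ n_distinct(.x) > 1))
reduced
#   v1 v2 v4
# 1  1  5  9
# 2  2  6 10
# 3  3  7 11
# 4  4  8 12

# Columns that were removed
setdiff(names(mydf), names(reduced))
# [1] "v3" "v5"
```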

answered Sep 30 '22 21:09

zeehio
