Remove columns from dataframe where ALL values are NA, NULL or empty [duplicate]

I have a dataframe in which some of the values are NULL or empty. I would like to remove the columns in which all values are NULL or empty. The columns should actually be removed from the dataframe, not just hidden.

My head(df) looks like this:

  VAR1  VAR2  VAR3   VAR4  VAR5  VAR6  VAR7
1  2R+          52   1.05     0     0    30
2  2R+         169   1.02     0     0    40
3  2R+          83     NA     0     0    40
4  2R+          98   1.16     0     0    40
5  2R+         154   1.11     0     0    40
6  2R+         111     NA     0     0    15

The dataframe contains more than 200 variables, and the columns that are entirely empty or entirely zero are not adjacent to each other.

I tried to use column sums to select the columns that are NULL or empty, by analogy with the removal of "NA" columns (see here), but it does not work.

df <- df[,colSums(is.na(df))<nrow(df)]

I got the error: 'x' must be an array of at least two dimensions

Can anyone give me some help? Thanks!

asked Jan 27 '17 12:01

Denis Efimov

1 Answer

We can use Filter to keep only the columns that do not consist entirely of empty strings:

Filter(function(x) !(all(x=="")), df)
#  Var1 Var3
#1  2R+   52
#2  2R+  169
#3  2R+   83
#4  2R+   98
#5  2R+   NA
#6  2R+  111
#7  2R+   94
#8  2R+  116
#9  2R+   86

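NOTE: This also works when all the elements of a column are NA. In that case all(x=="") returns NA, and Filter keeps only the columns for which the function returns TRUE, so the column is dropped.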

df$Var3 <- NA
Filter(function(x) !(all(x=="")), df)
#  Var1
#1  2R+
#2  2R+
#3  2R+
#4  2R+
#5  2R+
#6  2R+
#7  2R+
#8  2R+
#9  2R+
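
The NA behaviour above is implicit, so it can help to spell it out. The following is only a sketch: the helper `is_blank_col` and the data frame `toy` are hypothetical names introduced for illustration and are not part of the question's data. It assumes the columns are plain character, numeric or logical vectors.

```r
# Toy data (an assumption for illustration, not the asker's data)
toy <- data.frame(
  a = c("x", "y", "z"),
  b = c("", "", ""),   # all empty strings
  c = c(NA, NA, NA),   # all NA
  d = c("", NA, ""),   # mix of empty strings and NA
  e = c(1, NA, 3),     # only partly NA, so it should be kept
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every value is NA or an empty string.
# is.na(x) | x == "" never yields NA, because TRUE | NA is TRUE.
is_blank_col <- function(x) all(is.na(x) | x == "")

# Keep only the columns that are not entirely blank
cleaned <- Filter(function(x) !is_blank_col(x), toy)
names(cleaned)
```

Here the predicate always returns TRUE or FALSE, so the result does not depend on how Filter treats NA.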

Update

Based on the updated dataset, if we also need to remove the columns that contain only 0 values, change the code to

Filter(function(x) !(all(x==""|x==0)), df2)
#  VAR1 VAR3 VAR4 VAR7
#1  2R+   52 1.05   30
#2  2R+  169 1.02   40
#3  2R+   83   NA   40
#4  2R+   98 1.16   40
#5  2R+  154 1.11   40
#6  2R+  111   NA   15
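
The same idea can be wrapped in a reusable function with explicit NA handling. This is a sketch under stated assumptions: `drop_blank_cols`, its `zero` argument and the data frame `ex` are hypothetical names, and the columns are assumed to be character, numeric or factor vectors. It builds a logical index with `vapply` and subsets with `drop = FALSE`, so the result stays a data frame even when only one column survives. Subsetting with `df[, idx]` alone can collapse to a plain vector, which is one possible way to end up with the `colSums` error quoted in the question.

```r
# Hypothetical helper: drop columns whose values are all NA, "" or (optionally) 0
drop_blank_cols <- function(df, zero = FALSE) {
  is_blank <- function(x) {
    blank <- is.na(x) | x == ""
    # Optionally treat zeros as blank too; guard against NA in the comparison
    if (zero) blank <- blank | (!is.na(x) & x == 0)
    all(blank)
  }
  keep <- !vapply(df, is_blank, logical(1))
  df[, keep, drop = FALSE]  # drop = FALSE keeps the data.frame class
}

# Small example shaped like the question's data (values are illustrative)
ex <- data.frame(
  VAR1 = c("2R+", "2R+", "2R+"),
  VAR2 = c("", "", ""),
  VAR3 = c(1.05, NA, 1.16),
  VAR5 = c(0L, 0L, 0L),
  stringsAsFactors = FALSE
)

names(drop_blank_cols(ex))               # only the empty-string column is dropped
names(drop_blank_cols(ex, zero = TRUE))  # all-zero columns are dropped as well
```

To modify the data frame in place, assign the result back, for example `ex <- drop_blank_cols(ex, zero = TRUE)`.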

data

df2 <- structure(list(VAR1 = c("2R+", "2R+", "2R+", "2R+", "2R+", "2R+"
), VAR2 = c("", "", "", "", "", ""), VAR3 = c(52L, 169L, 83L, 
98L, 154L, 111L), VAR4 = c(1.05, 1.02, NA, 1.16, 1.11, NA), VAR5 = c(0L, 
0L, 0L, 0L, 0L, 0L), VAR6 = c(0L, 0L, 0L, 0L, 0L, 0L), VAR7 = c(30L, 
40L, 40L, 40L, 40L, 15L)), .Names = c("VAR1", "VAR2", "VAR3", 
"VAR4", "VAR5", "VAR6", "VAR7"), row.names = c("1", "2", "3", 
"4", "5", "6"), class = "data.frame")

answered Sep 24 '22 10:09

akrun
