Piping the removal of empty columns using dplyr

Tags: r, dplyr

I have a data frame of participant questionnaire responses in wide format, with each column representing a particular question/item.

The data frame looks something like this:

id <- c(1, 2, 3, 4)
Q1 <- c(NA, NA, NA, NA)
Q2 <- c(1, "", 4, 5)
Q3 <- c(NA, 2, 3, 4)
Q4 <- c("", "", 2, 2)
Q5 <- c("", "", "", "")
df <- data.frame(id, Q1, Q2, Q3, Q4, Q5)

I want R to remove every column whose values are all either (1) NA or (2) blank. Therefore, I do not want column Q1 (which consists entirely of NAs) or column Q5 (which consists entirely of blanks in the form of "").

According to this thread, I can use the following to remove columns that consist entirely of NAs:

df[, !apply(is.na(df), 2, all)]

However, that solution does not address blanks (""). As I am doing all of this in a dplyr pipe, could someone also explain how I could incorporate the above code into a dplyr pipe?

At this moment, my dplyr pipe looks like the following:

df <- df %>%
    select(relevant columns that I need)

After that, I'm stuck and am falling back on brackets [] to subset the non-NA columns.

Thanks! Much appreciated.

asked Mar 20 '18 01:03

DTYK

1 Answer

We can use select_if, the predicate-based variant of select, to keep only the columns that are neither all NA nor all blank:

library(dplyr)
df %>%
   select_if(function(x) !(all(is.na(x)) | all(x=="")))

#  id Q2 Q3 Q4
#1  1  1 NA   
#2  2     2   
#3  3  4  3  2
#4  4  5  4  2

Or with the formula shorthand instead of an explicit function(x):

df %>% select_if(~!(all(is.na(.)) | all(. == "")))
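In dplyr 1.0.0 and later, select_if() is superseded by select() combined with where(). A minimal sketch of the same idea, assuming dplyr >= 1.0.0; is_empty_col is a hypothetical helper name, not part of dplyr. It tests NA and "" element by element, so unlike the select_if() version it also drops a column that contains nothing but a mix of NAs and blanks.

```r
library(dplyr)

# Sample data from the question
id <- c(1, 2, 3, 4)
Q1 <- c(NA, NA, NA, NA)
Q2 <- c(1, "", 4, 5)
Q3 <- c(NA, 2, 3, 4)
Q4 <- c("", "", 2, 2)
Q5 <- c("", "", "", "")
df <- data.frame(id, Q1, Q2, Q3, Q4, Q5)

# Hypothetical helper: TRUE when every value in x is NA or "".
# is.na(x) | x == "" is never NA, because TRUE | NA evaluates to TRUE.
is_empty_col <- function(x) all(is.na(x) | x == "")

# Keep only the columns that have at least one real value
result <- df %>%
  select(where(~ !is_empty_col(.x)))

names(result)
# "id" "Q2" "Q3" "Q4"
```

Because the predicate never returns NA, it should be safe to place anywhere in an existing pipe, for example directly after the select() step that picks the relevant columns.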

In base R, you can also modify your apply statement as

df[!apply(df, 2, function(x) all(is.na(x)) | all(x==""))]

Or using colSums

df[colSums(is.na(df) | df == "") != nrow(df)]

or, inverting the condition, keep the columns that have at least one value that is neither NA nor blank:

df[colSums(!(is.na(df) | df == "")) > 0]
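Another option is to convert blanks to NA first, after which only the all-NA check is needed. This is a sketch, assuming dplyr >= 1.0.0 (for across() and where()); the name cleaned is illustrative. Note that it changes the data as well as the column set: blanks in the columns that are kept become NA.

```r
library(dplyr)

# Sample data from the question; stringsAsFactors = FALSE keeps the
# mixed columns as character on R versions before 4.0 as well.
df <- data.frame(
  id = c(1, 2, 3, 4),
  Q1 = c(NA, NA, NA, NA),
  Q2 = c(1, "", 4, 5),
  Q3 = c(NA, 2, 3, 4),
  Q4 = c("", "", 2, 2),
  Q5 = c("", "", "", ""),
  stringsAsFactors = FALSE
)

cleaned <- df %>%
  # Replace "" with NA, but only in character columns
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
  # Drop every column that is now entirely NA
  select(where(~ !all(is.na(.x))))

names(cleaned)
# "id" "Q2" "Q3" "Q4"
```

This can be convenient when the blanks should be treated as missing values in later steps anyway, since functions such as is.na() then see them directly.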

answered Sep 28 '22 08:09

Ronak Shah
