I have a data frame where all the variables are of character type. Many of the columns are completely empty, i.e. only the variable headers are there, but no values. Is there any way to subset out the empty columns?
We can delete multiple columns from an R data frame at once by assigning NULL to them, wrapped in list() (that is, list(NULL)).
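A minimal sketch of that idiom, assuming a small made-up data frame (the name `df` and its columns are illustrative and not taken from the question):

```r
# Hypothetical sample data; every column is of character type.
df <- data.frame(
  id   = c("1", "2"),
  note = c("", ""),
  flag = c("", ""),
  stringsAsFactors = FALSE
)

# Assigning list(NULL) to several columns at once removes them.
df[c("note", "flag")] <- list(NULL)

names(df)
# [1] "id"
```

This only helps once you already know which columns to drop; it does not detect empty columns by itself.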
To remove all empty columns from an R data frame with purrr's discard() function, you only need a predicate that identifies the empty columns. The all() function can do this, for example by checking whether every value in a column is an empty string. discard() then drops every column for which the predicate returns TRUE.
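As a rough sketch, assuming the purrr package is installed and that "empty" means every value is the empty string (the data frame `df` and the result name `non_empty` are made up for illustration):

```r
library(purrr)  # assumption: purrr is installed

# Hypothetical sample data with one completely empty column (C).
df <- data.frame(
  A = c("a", "b"),
  C = c("", ""),
  E = c("", "z"),
  stringsAsFactors = FALSE
)

# discard() drops each column for which the predicate returns TRUE.
non_empty <- discard(df, function(col) all(col == ""))

names(non_empty)
# [1] "A" "E"
```

Columns containing NA would need a different predicate, since comparing NA with "" does not yield TRUE or FALSE.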
To remove a character from an R data frame column, we can use the gsub() function, which replaces the character with an empty string. For example, if a data frame called df contains a character column x in which every value includes the string ID, it can be removed with the command gsub("ID","",as.
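One possible way to apply gsub() to a column, shown on a made-up data frame; this is an illustrative sketch and not necessarily the exact command the truncated sentence above had in mind:

```r
# Hypothetical sample data: each value in x starts with "ID".
df <- data.frame(x = c("ID101", "ID102", "ID103"), stringsAsFactors = FALSE)

# Replace every occurrence of "ID" with the empty string.
df$x <- gsub("ID", "", df$x)

df$x
# [1] "101" "102" "103"
```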
To pick out one or more columns, use dplyr's select() function. select() expects a data frame as its first input ('argument', in R terminology), followed by the names of the columns you want to extract, separated by commas.
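A short sketch of select(), assuming the dplyr package is installed; the data frame `df` and the result name `picked` are hypothetical:

```r
library(dplyr)  # assumption: dplyr is installed

# Hypothetical sample data.
df <- data.frame(
  A = c("a", "b"),
  B = c("y", ""),
  C = c("", ""),
  stringsAsFactors = FALSE
)

# Data frame first, then the unquoted column names to keep.
picked <- select(df, A, B)

names(picked)
# [1] "A" "B"
```

Note that this keeps columns by name, so the empty ones still have to be identified separately.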
If your empty columns are really empty character columns, something like the following should work. It will need to be modified if your "empty" character columns include, say, spaces.
Sample data:
    mydf <- data.frame(
      A = c("a", "b"),
      B = c("y", ""),
      C = c("", ""),
      D = c("", ""),
      E = c("", "z")
    )
    mydf
    #   A B C D E
    # 1 a y
    # 2 b       z
Identifying and removing the "empty" columns.
    mydf[!sapply(mydf, function(x) all(x == ""))]
    #   A B E
    # 1 a y
    # 2 b   z
Alternatively, as recommended by @Roland:
    > mydf[, colSums(mydf != "") != 0]
      A B E
    1 a y
    2 b   z
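If the "empty" columns may also hold spaces or NA values, the test can be loosened. The following is a sketch under that assumption, using a made-up data frame `messy` and base R's trimws():

```r
# Hypothetical sample data: C holds only whitespace, E mixes NA and a value.
messy <- data.frame(
  A = c("a", "b"),
  C = c("", "  "),
  E = c(NA, "z"),
  stringsAsFactors = FALSE
)

# A column counts as empty when every value is NA or blank after trimming.
is_empty <- sapply(messy, function(x) all(is.na(x) | trimws(x) == ""))

cleaned <- messy[!is_empty]
names(cleaned)
# [1] "A" "E"
```

Indexing with a single bracket and no comma keeps the result a data frame even when only one column survives.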