
Filter empty rows from a data frame in R

Tags:

r

I have a data frame with this structure:

Note.Reco  Reason.Reco  Suggestion.Reco  Contact
9          absent       tomorrow          yes
8                       tomorrow          yes
8          present      today             no
5                       yesterday         no

I would like to delete every row that contains an empty value from this data frame.

The expected result:

 Note.Reco  Reason.Reco  Suggestion.Reco  Contact
  9          absent       tomorrow          yes
  8          present      today             no

I tried this R instruction:

IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco==" "), ]

But I get the input data frame back unchanged.

Any ideas?

Thank you.

Datackatlon asked Mar 10 '17 15:03



2 Answers

The empty cells hold zero-length strings (""), not a single space (" "), so the comparison has to change to

IRC_DF[!(!is.na(IRC_DF$Reason.Reco) & IRC_DF$Reason.Reco==""), ]
#   Note.Reco Reason.Reco Suggestion.Reco Contact
#1         9      absent        tomorrow     yes
#3         8     present           today      no
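
As a hedged illustration of why the original attempt changed nothing, the sketch below rebuilds the sample data and checks that no value equals a single space. The helper name is_blank is hypothetical, not part of base R; it treats NA, "" and whitespace-only strings alike, which goes slightly beyond what the question strictly needs.

```r
# Sample data matching the question; the empty cells are zero-length strings.
IRC_DF <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L),
  Reason.Reco     = c("absent", "", "present", ""),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", "yesterday"),
  Contact         = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# Comparing against a single space matches nothing in this data,
# which is why the original subset returned every row.
matches_space <- any(IRC_DF$Reason.Reco == " ")

# Hypothetical helper: TRUE for NA, "" and whitespace-only strings.
is_blank <- function(x) is.na(x) | !nzchar(trimws(x))

# Keep only the rows whose Reason.Reco is filled in.
cleaned <- IRC_DF[!is_blank(IRC_DF$Reason.Reco), ]
print(cleaned)
```

Trimming before the test is a design choice: it also catches cells that really do contain only spaces, so the same filter works whichever kind of blank the data holds.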

If several columns may contain NA or blanks (""), apply the same test to every column and keep only the rows that pass in all of them:

IRC_DF[Reduce(`&`, lapply(IRC_DF, function(x) !(is.na(x)|x==""))),]

Data:

IRC_DF <- structure(list(Note.Reco = c(9L, 8L, 8L, 5L), Reason.Reco = c("absent", 
 "", "present", ""), Suggestion.Reco = c("tomorrow", "tomorrow", 
 "today", "yesterday"), Contact = c("yes", "yes", "no", "no")), .Names = c("Note.Reco", 
 "Reason.Reco", "Suggestion.Reco", "Contact"), class = "data.frame", row.names = c(NA, 
 -4L))
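
The multi-column filter above can be unpacked step by step. The following is only a sketch: the data is a hypothetical variation of the sample (an NA is placed in Contact so that more than one column has gaps), and the rowSums() formulation is an alternative not taken from the answer.

```r
# Hypothetical variation of the sample data: blanks in Reason.Reco
# and an NA in Contact, so two different columns have gaps.
IRC_DF <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L),
  Reason.Reco     = c("absent", "", "present", ""),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", "yesterday"),
  Contact         = c("yes", "yes", NA, "no"),
  stringsAsFactors = FALSE
)

# One logical vector per column: TRUE where the cell is filled in.
filled <- lapply(IRC_DF, function(x) !(is.na(x) | x == ""))

# Reduce(`&`, ...) combines the vectors element-wise, so a row is TRUE
# only when every one of its columns is filled in.
keep <- Reduce(`&`, filled)

# Alternative: build a logical matrix of blank cells and keep the rows
# that contain none of them.
blank    <- is.na(IRC_DF) | IRC_DF == ""
keep_alt <- rowSums(blank) == 0

print(IRC_DF[keep, ])
```

Both versions select the same rows; the matrix form can be easier to inspect when checking which cells were treated as blank.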
akrun answered Oct 12 '22 17:10


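Or use dplyr's filter() function. The two conditions must be combined with &, because with | an empty string already passes the !is.na() test and every row is kept.

library(dplyr)
filter(IRC_DF, !is.na(Reason.Reco) & Reason.Reco != "")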
Laurent answered Oct 12 '22 18:10