Delete rows with blank values in one particular column

I am working on a large dataset, with some rows with NAs and others with blanks:

df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA))

How do I remove the NAs and blanks in one go (in the start_pc and end_pc columns)? I have in the past used:

df <- df[-which(is.na(df$start_pc)), ]

... to remove the NAs - is there a similar command to remove the blanks?

KT_1 asked Feb 03 '12 10:02


2 Answers

 df[!(is.na(df$start_pc) | df$start_pc==""), ] 
sgibb answered Sep 30 '22 09:09


It is the same construct. Simply test for empty strings rather than NA:

df <- df[-which(df$start_pc == ""), ] 

In fact, looking at your code, you don't need which at all. Use logical negation instead, and you can simplify it to:

df <- df[!(df$start_pc == ""), ]
df <- df[!is.na(df$start_pc), ]
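
There is also a practical reason to prefer negation over -which(...): when nothing matches, which() returns an empty vector, and negative indexing with an empty vector selects no rows at all. A minimal sketch on a small made-up data frame (not the data from the question):

```r
# Hypothetical toy data with no blank values in start_pc.
df <- data.frame(ID = 1:3,
                 start_pc = c("Home", "FC5 7YH", "CB3 5TH"),
                 stringsAsFactors = FALSE)

# which() finds no matches, so it returns integer(0).
idx <- which(df$start_pc == "")

# Negating integer(0) is still integer(0), which selects zero rows.
dropped <- df[-idx, ]
nrow(dropped)  # 0

# Logical negation keeps every row, as intended.
kept <- df[!(df$start_pc == ""), ]
nrow(kept)     # 3
```

So the -which() form silently empties the data frame whenever there is nothing to remove, while the logical form does not.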

And, of course, you can combine the empty-string test and the NA test into a single statement:

df <- df[!(df$start_pc == "" | is.na(df$start_pc)), ] 
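
The single-statement form handles NA correctly because of how R evaluates NA in logical expressions. A short sketch on a made-up vector x and data frame (both hypothetical, for illustration only):

```r
x <- c("Home", "", NA)

# Comparing NA with "" yields NA, not FALSE.
eq <- x == ""                  # FALSE TRUE NA

# NA | TRUE is TRUE, so adding is.na() resolves every element.
blank <- x == "" | is.na(x)    # FALSE TRUE TRUE

df <- data.frame(ID = 1:3, start_pc = x, stringsAsFactors = FALSE)

# An NA left in a logical row index produces a row filled with NA.
only_eq <- df[!(df$start_pc == ""), ]
nrow(only_eq)                  # 2: row 1 plus one all-NA row

# With the combined test the index contains no NA.
combined <- df[!(df$start_pc == "" | is.na(df$start_pc)), ]
nrow(combined)                 # 1
```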

The combined statement can be simplified even further using with:

df <- with(df, df[!(start_pc == "" | is.na(start_pc)), ]) 

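You can also test for non-zero string length using nzchar. It returns TRUE for non-empty strings (and, by default, for NA as well), so keep the rows where it is TRUE and exclude NA separately. This assumes start_pc is a character column:

df <- with(df, df[nzchar(start_pc) & !is.na(start_pc), ])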
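
The question asks about both start_pc and end_pc. One possible extension, assuming a row should be dropped when either column is NA or blank (is_blank is a hypothetical helper name, not a base R function):

```r
df <- data.frame(ID = 1:7,
                 home_pc = c("", "CB4 2DT", "NE5 7TH", "BY5 8IB", "DH4 6PB", "MP9 7GH", "KN4 5GH"),
                 start_pc = c(NA, "Home", "FC5 7YH", "Home", "CB3 5TH", "BV6 5PB", NA),
                 end_pc = c(NA, "CB5 4FG", "Home", "", "Home", "", NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE where a value is NA or an empty string.
is_blank <- function(x) is.na(x) | x == ""

# Keep rows where neither start_pc nor end_pc is blank (assumed rule).
keep <- !is_blank(df$start_pc) & !is_blank(df$end_pc)
clean <- df[keep, ]

clean$ID  # 2 3 5
```

If a row should only be dropped when both columns are blank, replace the & with | in the line that builds keep.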

Disclaimer: I didn't test any of this code. Please let me know if there are syntax errors anywhere

Andrie answered Sep 30 '22 09:09