I have a dataframe such as this one:
d <- data.frame(cbind(x=1, y=1:10, z=c("apple","pear","banana","A","B","C","D","E","F","G")), stringsAsFactors = FALSE)
I'd like to delete some rows from this dataframe, depending on the content of column z:
new_d <- d[-grep("D",d$z),]
This works fine; row 7 is now deleted:
new_d
x y z
1 1 1 apple
2 1 2 pear
3 1 3 banana
4 1 4 A
5 1 5 B
6 1 6 C
8 1 8 E
9 1 9 F
10 1 10 G
However, when I use grep to search for content that is not present in column z, it seems to delete all content of the dataframe:
new_d <- d[-grep("K",d$z),]
new_d
[1] x y z
<0 rows> (or 0-length row.names)
I would like to search for and delete rows, in this or another way, even when the character string I am searching for is not present. How can I do this?
You can use TRUE/FALSE subsetting instead of numeric. grepl is like grep, but it returns a logical vector. Negation works with it.
d[!grepl("K",d$z),]
x y z
1 1 1 apple
2 1 2 pear
3 1 3 banana
4 1 4 A
5 1 5 B
6 1 6 C
7 1 7 D
8 1 8 E
9 1 9 F
10 1 10 G
Here's your problem: when nothing matches, grep() returns an empty index vector:
> grep("K",c("apple","pear","banana","A","B","C","D","E","F","G"))
integer(0)
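To see why an empty result drops every row, here is a minimal sketch that rebuilds a data frame like the one in the question (the name idx is only illustrative). Negating an empty integer vector gives another empty integer vector, and indexing rows with an empty vector selects nothing:

```r
# Rebuild a data frame shaped like the one in the question
d <- data.frame(x = 1, y = 1:10,
                z = c("apple", "pear", "banana", "A", "B", "C", "D", "E", "F", "G"),
                stringsAsFactors = FALSE)

idx <- grep("K", d$z)         # no element contains "K", so idx is integer(0)
no_match <- length(idx) == 0  # TRUE

# -integer(0) is still integer(0): there is nothing to negate
same_after_negation <- identical(-idx, idx)

# Indexing with an empty vector selects zero rows instead of dropping zero rows
rows_left <- nrow(d[-idx, ])
```

So d[-grep(...), ] only behaves as intended when there is at least one match.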
Try grepl() instead:
d[!grepl("K",d$z),]
This works because the negated logical vector has an entry for every row:
> grepl("K",d$z)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> !grepl("K",d$z)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
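If you would rather stay with grep(), two alternatives may be worth considering. This is a sketch rather than the only way to do it. The invert argument of grep() returns the indices of the elements that do not match, and an explicit length check guards the negative index. The names keep_idx, hits, new_d1, new_d2 and new_d3 are illustrative only:

```r
d <- data.frame(x = 1, y = 1:10,
                z = c("apple", "pear", "banana", "A", "B", "C", "D", "E", "F", "G"),
                stringsAsFactors = FALSE)

# Option 1: ask grep() for the non-matching positions and keep those rows.
# With no match this is every row, so nothing is lost.
keep_idx <- grep("K", d$z, invert = TRUE)
new_d1 <- d[keep_idx, ]

# Option 2: only apply the negative index when there is something to drop.
hits <- grep("K", d$z)
new_d2 <- if (length(hits) > 0) d[-hits, ] else d

# The same pattern still removes rows when the string is present.
new_d3 <- d[grep("D", d$z, invert = TRUE), ]
```

Both options return all ten rows for "K", and the last line drops only the "D" row. If the search string should be treated as literal text rather than a regular expression, grep() and grepl() also accept fixed = TRUE.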