I have a character vector called df:
df <- c("hello goodbye Delete Me", "Another Sentence good program", "hello world The End")
I want this:
c("hello goodbye", "good program", "hello world")
I have tried:
df <- grep("^[A-Z]", df, invert = TRUE, value = TRUE)
but this drops every string that starts with a capital letter instead of removing only the capitalized words, leaving:
c("hello goodbye Delete Me", "hello world The End")
How do I remove only the words that start with a capital letter?
You can use gsub() to remove each capitalized word and trimws() to strip the leftover whitespace at the ends:
trimws(gsub('[A-Z]\\w+', '', df))
#[1] "hello goodbye" "good program" "hello world"
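One caveat, shown as a small sketch: because only the word itself is removed, a capitalized word in the middle of a string leaves both of its surrounding spaces behind. The input `y` below is a made-up example, not taken from the question.

```r
# Hypothetical input (not from the question): capitalized word mid-string.
y <- c("cat Ran away")

# Same call as above; trimws() only touches the ends of the string,
# so the two spaces around the removed word stay in place.
res <- trimws(gsub('[A-Z]\\w+', '', y))
print(res)
#[1] "cat  away"
```

For the vector in the question this does not matter, since the capitalized words all sit at the start or end of each string.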
You can use the following regex pattern and replace each match with a single space:
\s*[A-Z]\w+\s*
This matches every word that begins with a capital letter (followed by at least one more word character), along with any whitespace on either side. The outer call to trimws() removes any space that the replacement leaves at the very start or end of the string.
x <- c("nice to meet You however", "cat Ran away", "Cat", "Dog")
trimws(gsub('\\s*[A-Z]\\w+\\s*', ' ', x))
[1] "nice to meet however" "cat away" ""
[4] ""
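If the input can be messier, a slightly more defensive variant might look like the sketch below. It is not from either answer above. It assumes ASCII capitals only, the helper name `drop_capitalized` is made up for illustration, and `perl = TRUE` is used so that `\\b` (word boundary) behaves as in PCRE. The boundary stops the pattern from cutting a word at an inner capital (as in "iPhone"), `\\w*` also catches one-letter words such as "I", and a second gsub() collapses the gaps left by consecutive capitalized words.

```r
# Hypothetical helper; the name and the test strings are illustrative only.
drop_capitalized <- function(x) {
  # Remove whole words that start with an ASCII capital letter.
  no_caps <- gsub("\\b[A-Z]\\w*", "", x, perl = TRUE)
  # Collapse the runs of whitespace left behind, then trim both ends.
  trimws(gsub("\\s+", " ", no_caps, perl = TRUE))
}

drop_capitalized(c("a Hello World b", "iPhone user", "I am here"))
#[1] "a b"         "iPhone user" "am here"
```

On the vector from the question this should give the same result as the answers above; the difference only shows up on inputs like the three test strings.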