Need to extract whole sentences where middle word begins with a specific word in R

Question

I need to extract whole sentences whose middle word is a specific word, using R. Below is the code I am trying to use, but I am not able to get the desired result. I am new to regular expressions in R. I want to extract the sentences where the middle word is 'arent'.

  yy <- c("computers arent working", "arent not wkng","scanner arent good","arent scanner good")
  m <- gregexpr('\\w arent ', yy)
  regmatches(yy, m)

The above code does not give what I want. My desired output is:

 "computers arent working", "scanner arent good"

Thanks for your help!

Wiktor Stribiżew · Accepted Answer

I suggest

grep("\\w\\W+arent\\W+\\w", yy, value = TRUE)

grep finds all the strings that contain a match for the regex pattern (a partial match is enough) and, because value is set to TRUE, returns the strings themselves rather than their indices.
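As a minimal sketch of what value = TRUE changes, the same pattern can be run three ways on the vector from the question. The variable names pattern, idx, hits and flags are only illustrative.

```r
yy <- c("computers arent working", "arent not wkng",
        "scanner arent good", "arent scanner good")
pattern <- "\\w\\W+arent\\W+\\w"

idx   <- grep(pattern, yy)                # positions of the matching elements
hits  <- grep(pattern, yy, value = TRUE)  # the matching elements themselves
flags <- grepl(pattern, yy)               # one TRUE/FALSE per element

print(idx)
print(hits)
print(flags)
```

grepl is the handier form when the result is used to subset a data frame column, since it keeps one logical value per input element.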

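The regex pattern matches arent only when it sits between word characters (\w) and is separated from them by one or more non-word characters (\W+). In an R string literal each backslash must be doubled, so \w is written as "\\w".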

Online R demo:

yy <- c("computers arent working", "arent not wkng","scanner arent good","arent scanner good")
grep("\\w\\W+arent\\W+\\w", yy, value = TRUE)
## => [1] "computers arent working" "scanner arent good"

If the word you seek to match must be enclosed in whitespace, replace \W+ with \s+ (one or more whitespace characters).
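A short sketch of how the two variants differ. The vector zz and the string "printer-arent-ready" are made up for illustration and are not part of the question's data.

```r
zz <- c("computers arent working", "printer-arent-ready", "arent scanner good")

# \W+ accepts any non-word separator, so the hyphenated string also matches
loose  <- grep("\\w\\W+arent\\W+\\w", zz, value = TRUE)

# \s+ accepts only whitespace, so the hyphenated string is dropped
strict <- grep("\\w\\s+arent\\s+\\w", zz, value = TRUE)

print(loose)
print(strict)
```

Which variant fits depends on whether punctuation next to the word should count as a word boundary in your data.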

Tags:

regex

r

Kiwi
