
Extracting 5 words before and after a specific word

Tags: r

How can I extract the words next to a specific word in a sentence? Example:

"On June 28, Jane went to the cinema and ate popcorn"

I would like to choose 'Jane' and get the window [-2, 2], meaning the two words before it and the two words after it:

"June 28, Jane went to"

Ivancito asked Oct 17 '25 10:10


2 Answers

I have a shorter version using str_extract from stringr

library(stringr)
txt <- "On June 28, Jane went to the cinema and ate popcorn"
str_extract(txt,"([^\\s]+\\s+){2}Jane(\\s+[^\\s]+){2}")

[1] "June 28, Jane went to"

The function str_extract extracts the first match of the pattern from the string. In the regex, \\s matches a whitespace character and [^\\s] is its negation, so anything but whitespace. The whole pattern therefore reads: two words (runs of non-whitespace, each followed by whitespace), then Jane, then two more words (each preceded by whitespace).
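To make the window size adjustable, the same pattern can be assembled from its parts. This is only a sketch: window_pattern is a hypothetical helper name, not part of stringr, and it assumes the target contains no regex metacharacters (it is pasted into the pattern unescaped).

```r
library(stringr)

# Hypothetical helper: build the "n words, target, m words" pattern.
# Assumes `target` has no regex metacharacters, since it is not escaped.
window_pattern <- function(target, n_before, n_after) {
  sprintf("([^\\s]+\\s+){%d}%s(\\s+[^\\s]+){%d}", n_before, target, n_after)
}

txt <- "On June 28, Jane went to the cinema and ate popcorn"

str_extract(txt, window_pattern("Jane", 2, 2))
#> [1] "June 28, Jane went to"

str_extract(txt, window_pattern("Jane", 1, 3))
#> [1] "28, Jane went to the"
```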

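The advantage is that it is already vectorized. str_extract returns only the first match in each string, so if a text can contain the target more than once, use str_extract_all to get every match for each element of a vector: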

s <- c("On June 28, Jane went to the cinema and ate popcorn. 
          The next day, Jane hiked on a trail.",
       "an indeed Jane loved it a lot")

str_extract_all(s,"([^\\s]+\\s+){2}Jane(\\s+[^\\s]+){2}")

[[1]]
[1] "June 28, Jane went to"   "next day, Jane hiked on"

[[2]]
[1] "an indeed Jane loved it"
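One thing to watch for: because the quantifier {2} demands exactly two words on each side, a target that sits within two words of the start or end of the string is not matched at all. A possible workaround, shown here as a sketch on made-up example sentences, is to relax the quantifier to {0,2}, which takes up to two words where they exist.

```r
library(stringr)

# Example sentences made up for illustration
s <- c("Jane went to the cinema",
       "On June 28, Jane went to the cinema",
       "Nobody else came")

# Exactly two words on each side: the first string has nothing before "Jane"
strict <- str_extract(s, "([^\\s]+\\s+){2}Jane(\\s+[^\\s]+){2}")
strict
#> [1] NA                      "June 28, Jane went to" NA

# Up to two words on each side: the first string now yields a shorter window
lenient <- str_extract(s, "([^\\s]+\\s+){0,2}Jane(\\s+[^\\s]+){0,2}")
lenient
#> [1] "Jane went to"          "June 28, Jane went to" NA
```

Strings that do not contain the target still give NA in both cases.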
denis answered Oct 20 '25 00:10


We could make a function to help out. This might make it a little more dynamic.

library(tidyverse)

txt <- "On June 28, Jane went to the cinema and ate popcorn"

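grab_text <- function(text, target, before, after){
  # Split the sentence into words once and reuse the result
  words <- str_split(text, "\\s+")[[1]]
  # Position of the first word that matches the target
  pos <- which(grepl(target, words))[1]
  if (is.na(pos)) return(NA_character_)
  # Clamp the window so it never runs past either end of the sentence
  start <- max(pos - before, 1)
  end <- min(pos + after, length(words))

  paste(words[start:end], collapse = " ")
}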

grab_text(text = txt, target = "Jane", before = 2, after  = 2)
#> [1] "June 28, Jane went to"

First we split the sentence into words, then we find the position of the target, then we take the requested number of words before and after it, and last we paste the words back together into a sentence.
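If the target can occur more than once, the same split, locate, slice, and paste steps can be repeated for each occurrence. The following is a base R sketch only: grab_all is a hypothetical name, it matches the target as a fixed substring of each word, and it shortens the window at the ends of the sentence rather than failing.

```r
# Hypothetical variant: one window per occurrence of the target
grab_all <- function(text, target, before, after) {
  # Split the sentence into words
  words <- strsplit(text, "\\s+")[[1]]
  # Positions of every word containing the target (fixed substring match)
  hits <- which(grepl(target, words, fixed = TRUE))
  # Slice and paste a window around each position, clamped to the sentence
  vapply(hits, function(pos) {
    start <- max(pos - before, 1)
    end <- min(pos + after, length(words))
    paste(words[start:end], collapse = " ")
  }, character(1))
}

txt2 <- "On June 28, Jane went to the cinema. The next day, Jane hiked on a trail."
grab_all(txt2, "Jane", before = 2, after = 2)
#> [1] "June 28, Jane went to"   "next day, Jane hiked on"
```

When the target is absent, this version returns an empty character vector.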

AndS. answered Oct 20 '25 01:10


