I found out the positions of "oo" in the following sentence:
sentence <- "It is a good book. Good for first reading.
This book explains everything in detail with tons of examples and exercises for practice. Good for cracking written tests on campuses and competitive exams. It is cheap so any way one can have a copy along with other books"
pos = gregexpr("oo", sentence)
I got the result as
> pos
[[1]]
[1] 10 15 21 50 136 263
attr(,"match.length")
[1] 2 2 2 2 2 2
attr(,"useBytes")
[1] TRUE
Based on this result, I want to extract the characters around each position: 5 before it and 5 after it.
For example, for the first location I should get "s a good bo", and I want the same extraction for every position. I am new to R and could not figure out how to do this. Please help me out.
Also, what should I do if I want to extract whole words instead? For the first match I should get "a good book".
In R, substr() and substring() extract parts of a character string, and their replacement forms change parts of one. Both work on character vectors, including data frame columns.
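As a small illustration of the two functions, here is a sketch on a made-up string (the variable names are only for this example). It shows extraction, replacement, and one difference that matters for this question: substring() recycles over several start positions, while substr() returns one result per input string.

```r
x <- "good book"

# Extract characters 1 to 4 of a single string
first_word <- substr(x, 1, 4)

# substring() is vectorised over the positions: one result per (first, last) pair
both <- substring(x, c(1, 6), c(4, 9))

# substr() returns one result per element of x, so only the first pair is used
only_one <- substr(x, c(1, 6), c(4, 9))

# Replacement form: overwrite the first character in place
substr(x, 1, 1) <- "G"

print(first_word)  # "good"
print(both)        # "good" "book"
print(only_one)    # "good"
print(x)           # "Good book"
```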
To extract words from a string, we can use the word() function from the stringr package. For example, if x is a string containing 100 words, the first 20 can be extracted with word(x, start = 1, end = 20, sep = fixed(" ")).
We can use substring after unlisting the gregexpr output.
v1 <- unlist(gregexpr("oo", sentence))
substring(sentence, v1 - 5, v1 +5)
#[1] "s a good bo" "ood book. G" "ok. Good fo" "his book ex" "ce. Good fo" "her books"
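The same idea can be wrapped in a small helper. This is only a sketch: the function name extract_context and its width argument are made up for illustration, and it assumes the pattern is a literal string rather than a regular expression. It also guards against gregexpr returning -1 when nothing matches, which would otherwise produce a meaningless substring.

```r
# Hypothetical helper: return `width` characters on each side of every match
extract_context <- function(text, pattern, width = 5) {
  starts <- gregexpr(pattern, text, fixed = TRUE)[[1]]
  # gregexpr signals "no match" with -1
  if (starts[1] == -1) return(character(0))
  starts <- as.integer(starts)  # drop the match.length attributes
  # substring() clips silently at the start and end of the string
  substring(text, starts - width, starts + width)
}

short <- "It is a good book."
res <- extract_context(short, "oo")
print(res)
# [1] "s a good bo" "ood book."

none <- extract_context(short, "zz")
print(length(none))
# [1] 0
```

The second result is shorter than 11 characters because the window runs past the end of the string, just like "her books" in the output above.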
You could also do
mapply(
substr,
x=sentence,
start=pos[[1]]-5,
stop=pos[[1]]+5,
USE.NAMES = FALSE
)
# [1] "s a good bo" "ood book. G" "ok. Good fo"
# [4] "his book ex" "ce. Good fo" "her books"
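For the follow-up about whole words rather than a fixed number of characters, one possible approach is to split the sentence into words, find the words that contain "oo", and paste each one together with its neighbours. This is a sketch in base R, not something from the answers above. The names words, hits and context are made up for the example, and a shortened sentence is used to keep it readable.

```r
sentence <- "It is a good book. Good for first reading."

# Split on whitespace; punctuation stays attached to the words
words <- strsplit(sentence, "\\s+")[[1]]

# Indices of the words that contain "oo"
hits <- grep("oo", words, fixed = TRUE)

# For each hit, join the previous word, the hit and the next word,
# clipping at the ends of the sentence
context <- vapply(hits, function(i) {
  idx <- max(1, i - 1):min(length(words), i + 1)
  paste(words[idx], collapse = " ")
}, character(1))

print(context)
# [1] "a good book."    "good book. Good" "book. Good for"

# Optionally strip punctuation to get "a good book"
clean <- gsub("[[:punct:]]", "", context)
print(clean[1])
# [1] "a good book"
```

Note that a word containing "oo" twice is reported only once here, since the search is per word rather than per position.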