regular expression in R for word of variable length between two characters

Question

How do I extract the word wordofvariablelength from the string below?

<a href=\"http://www.adrive.com/browse/wordofvariablelength\" class=\"next-button\" id=\"explore-gutter\" data-linkid=\"huiazc\"> <strong class=\"text gutter-text \">

I was able to get the first part of the string using the code below, but is there a regular expression I can use to get only the word immediately after "browse/" and before the next \", which here is "wordofvariablelength"?

mystring = substr(mystring,nchar("<a href=\"http://www.thesaurus.com/browse/")+1,nchar("<a href=\"http://www.thesaurus.com/browse/")+20)

Note that the word wordofvariablelength could be of any length, so I cannot hardcode a start and end position.

Avinash Raj · Accepted Answer

You can do this with the regmatches function.

> x <- "<a href=\"http://www.adrive.com/browse/wordofvariablelength\" class=\"next-button\" id=\"explore-gutter\" data-linkid=\"huiazc\"> <strong class=\"text gutter-text \">"
> regmatches(x, regexpr('.*?"[^"]*/\\K[^/"]*(?=")', x, perl=TRUE))
[1] "wordofvariablelength"

OR

> regmatches(x, regexpr('[^/"]*(?="\\s+class=")', x, perl=TRUE))
[1] "wordofvariablelength"

OR

A much simpler one using gsub.

> gsub('.*/|".*', "", x)
[1] "wordofvariablelength"
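As a further variation on the same idea, here is a minimal sketch that uses sub with a capture group instead of a lookahead, applied to more than one string at once. The helper name extract_last_segment and the second, shortened test string are assumptions made for illustration; they are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): returns the text between
# the last "/" and the closing quote of the first double-quoted value.
extract_last_segment <- function(s) {
  # ^[^"]*"   : skip everything up to and including the first double quote
  # [^"]*/    : consume up to the last slash before the next double quote
  # ([^/"]*)  : capture the word itself, whatever its length
  # ".*$      : drop the closing quote and the rest of the string
  sub('^[^"]*"[^"]*/([^/"]*)".*$', "\\1", s)
}

# Two illustrative inputs with words of different lengths
x <- c(
  "<a href=\"http://www.adrive.com/browse/wordofvariablelength\" class=\"next-button\">",
  "<a href=\"http://www.adrive.com/browse/short\" class=\"next-button\">"
)

extract_last_segment(x)
# [1] "wordofvariablelength" "short"
```

Note that sub returns the input unchanged when the pattern does not match, so a string without a quoted URL would come back as-is rather than as an empty result.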

Tags: regex, r, gsub

tubby
