
Extract an HTML tag that contains a string in OpenRefine?

There is not much to add to the title; it's what I'm trying to do. Any suggestions?

I reviewed the docs on GitHub and googled extensively.

The best I got is:

value.parseHtml().select('p[contains('xyz')]')

It results in a syntax error.

treakec asked Jun 13 '15 09:06



1 Answer

The 'select' syntax is based on the selector syntax in jsoup (http://jsoup.org/cookbook/extracting-data/selector-syntax)

The syntax error comes from the nested single quotes, which end the string early. In this case I believe the syntax you need is:

value.parseHtml().select("p:contains(xyz)")
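
select returns an array of elements, so it usually has to be turned into a string before it can be stored in a cell. A minimal sketch, assuming you want the HTML of every matching tag and that joining them with newlines suits your data (the separator is only an illustration):

```
forEach(value.parseHtml().select("p:contains(xyz)"), e, e.toString()).join("\n")
```

As far as I know, :contains is case-insensitive and also matches text inside nested tags. If you only want paragraphs whose own text contains the string, jsoup's :containsOwn(xyz) should be the narrower choice.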

Owen

Owen Stephens answered Oct 04 '22 05:10
