
Cheerio: How to select element by text content?


I have some HTML like this:

    <span id="cod">Code:</span>
    <span>12345</span>
    <span>Category:</span>
    <span>faucets</span>

I want to fetch the category name ("faucets"). This is my attempt:

    var $ = cheerio.load(html.contents);
    var category = $('span[innerHTML="Category:"]').next().text();

But this doesn't work: the [innerHTML="..."] selector does not match anything.

Any clue?

MarcoS asked Jan 10 '16 19:01



1 Answer

Your code isn't working because [innerHTML] is an attribute selector, and innerHTML is a DOM property rather than an attribute written on the element, so nothing is selected.

You could filter the span elements based on their text. In the example below, .trim() strips leading and trailing whitespace, and an element is kept in the filtered set only if its text equals 'Category:'.

    var category = $('span').filter(function() {
      return $(this).text().trim() === 'Category:';
    }).next().text();

The snippet above keeps only elements whose text is exactly 'Category:'. If you want to select elements whose text merely contains that string, you could use the :contains selector (as pointed out in the comments):

var category = $('span:contains("Category:")').next().text(); 

Alternatively, using the .indexOf() method would work as well:

    var category = $('span').filter(function() {
      return $(this).text().indexOf('Category:') > -1;
    }).next().text();
Josh Crozier answered Nov 04 '22 13:11
