I have some HTML like this:
<span id="cod">Code:</span> <span>12345</span> <span>Category:</span> <span>faucets</span>
I want to fetch the category name ("faucets"). This is my attempt:
var $ = cheerio.load(html.contents);
var category = $('span[innerHTML="Category:"]').next().text();
But this doesn't work (the innerHTML selector does not select anything).
Any clue?
Your code isn't working because [innerHTML] is an attribute selector, and innerHTML is a DOM property, not an attribute on the element, so nothing is selected.
You could filter the span elements based on their text. In the example below, .trim() strips any surrounding whitespace. If the trimmed text equals 'Category:', the element is kept in the filtered set.
var category = $('span').filter(function() {
  return $(this).text().trim() === 'Category:';
}).next().text();
The above snippet keeps only elements whose text is exactly 'Category:'. If you want to select elements whose text merely contains that string, you could use the :contains selector (as pointed out in the comments):
var category = $('span:contains("Category:")').next().text();
Alternatively, using the .indexOf() method would work as well:
var category = $('span').filter(function() {
  return $(this).text().indexOf('Category:') > -1;
}).next().text();