
How to get the contents of a JavaScript variable using cheerio (jQuery-like selectors, but no DOM)

There is a large HTML file with many <script> tags in it, one of which declares a variable. I'm trying to scoop out the contents of that variable. The variable name stays the same, but the contents change on every request.

examplefile.html

<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">var foo = {"b":"bar","c":"cat"}</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>
<script type="text/javascript">//.... more js</script>

desired console result

> var result = $('script').<some_selection_thingy>
result = {"b":"bar","c":"cat"}

Let me explain a little bit. By <some_selection_thingy> I mean that my question is: a) how do I select the element of the array that contains 'var foo', and b) how do I get the contents of the foo variable so that I can load that information into a local JSON variable for further processing?

When you run $('script') in the console, jQuery returns an array.

> $('script')
[<script type="text/javascript">//.... more js</script>,<script type="text/javascript">//.... more js</script>,<script type="text/javascript">var foo = {"b":"bar","c":"cat"}</script>,<script type="text/javascript">...</script>]

Because this is cheerio, not actually jQuery, the DOM isn't loaded, so I can't just do $(foo). An alternative is to use jsdom instead of cheerio, but I've read in other Stack Overflow answers (while researching this question) that it's less performant, so I'd prefer to learn the correct jQuery selectors I need to scoop out this variable.

server.js

// some cheerio node code
url = 'someurl';
request(url, function(error, response, html){
    var $ = cheerio.load(html);
    result = $('script').map(&:text).select{ |s| s['var foo'] }
    result = result[0]
//SyntaxError: Unexpected token &

Which is of course expected because .map(&:text) is what I'd do if I was using xpath but doesn't work with cheerio (jquery).
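
For comparison, the Ruby-style chain above corresponds roughly to a filter over a plain JavaScript array. This is only a minimal sketch: the scriptTexts array is hypothetical sample data copied from examplefile.html, and with cheerio the strings would first have to be collected from the matched <script> elements.

```javascript
// Hypothetical sample data: the text of each <script> tag in examplefile.html.
const scriptTexts = [
  '//.... more js',
  '//.... more js',
  'var foo = {"b":"bar","c":"cat"}',
  '//.... more js',
];

// Rough equivalent of .select{ |s| s['var foo'] }:
// keep only the strings that contain 'var foo'.
const matches = scriptTexts.filter((s) => s.includes('var foo'));

// Rough equivalent of result[0]: take the first match.
const fooScript = matches[0];
console.log(fooScript);
```

This only covers part a) of the question (finding the right script); the assignment still has to be parsed out of the string afterwards.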

Falieson asked Feb 21 '15 22:02





1 Answer

I got it!

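// Return the text between `variable` and the next ";" in `target`.
// Assumes the assignment ends with a semicolon and the value contains none.
function findTextAndReturnRemainder(target, variable){
    // indexOf matches the literal string; search() would treat it as a regex
    var chopFront = target.substring(target.indexOf(variable)+variable.length);
    var result = chopFront.substring(0,chopFront.indexOf(";"));
    return result;
}
// Combined text of all matched <script> elements
var text = $('script').text();
var findAndClean = findTextAndReturnRemainder(text,"var foo =");
var result = JSON.parse(findAndClean);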
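
The approach above depends on a ";" following the assignment, which the sample in the question does not have. As a hedged alternative, the sketch below finds the declaration and then scans for the matching closing brace, skipping over braces inside strings. The helper name extractJsonVar and the sample string are hypothetical. It assumes the value is a valid JSON object literal and that the variable name is a plain identifier.

```javascript
// Hypothetical helper: returns the object assigned to `var <name>`, parsed as JSON,
// or null if no balanced object literal is found.
// Assumes the value is valid JSON starting with "{" and `name` is a plain identifier.
function extractJsonVar(source, name) {
  const decl = source.search(new RegExp('var\\s+' + name + '\\s*='));
  if (decl === -1) return null;
  const start = source.indexOf('{', decl);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  for (let i = start; i < source.length; i++) {
    const ch = source[i];
    if (inString) {
      if (ch === '\\') i++;                 // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      // Matching brace found: parse everything from "{" to here.
      if (depth === 0) return JSON.parse(source.slice(start, i + 1));
    }
  }
  return null; // braces never balanced
}

// Sample input: concatenated script text with no trailing semicolon.
const sample = '//.... more jsvar foo = {"b":"bar","c":"cat"}//.... more js';
const foo = extractJsonVar(sample, 'foo');
console.log(foo.b, foo.c); // prints "bar cat"
```

In the cheerio callback, the source argument would be the script text obtained as in the answer above. The scan makes no attempt to handle values that are not strict JSON, such as single-quoted strings or unquoted keys.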
Falieson answered Oct 24 '22 02:10
