I am trying to get a value from inside the page.evaluate() body in my YouTube scraper, which I built using Puppeteer. I am unable to return the result from page.evaluate(). How do I achieve this? Here's the code:
let boxes2 = []
const getData = async() => {
    return await page.evaluate(async () => {
        // scroll till there's no more room to scroll or you get at least 250 boxes
        console.log(await new Promise(resolve => {
            var scrolledHeight = 0
            var distance = 100
            var timer = setInterval(() => {
                boxes = document.querySelectorAll("div.style-scope.ytd-item-section-renderer#contents > ytd-video-renderer > div.style-scope.ytd-video-renderer#dismissable")
                console.log(`${boxes.length} boxes`)
                var scrollHeight = document.documentElement.scrollHeight
                window.scrollBy(0, distance)
                scrolledHeight += distance
                if(scrolledHeight >= scrollHeight || boxes.length >= 50){
                    clearInterval(timer)
                    resolve(Array.from(boxes))
                }
            }, 500)
        }))
    })
}
boxes2 = await getData()
console.log(boxes2)
The console.log wrapping the promise prints the resulting array in the browser's console. I just cannot get that array into boxes2 down where I'm calling the getData() function.
I feel like I'm missing out on a tiny little bit, but can't figure out what it is. Appreciate any tip here.
The little issue is that you don't actually return the data from inside of page.evaluate:
const getData = () => {
    return page.evaluate(async () => {
        return await new Promise(resolve => { // <-- return the data to node.js from browser
            // scraping
        })
    })
}
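The effect of the missing return can be reproduced without a browser. The following is a minimal plain Node.js sketch, not Puppeteer code: `withoutReturn` and `withReturn` are hypothetical names standing in for the callback passed to page.evaluate().

```javascript
// Hypothetical stand-ins for the callback given to page.evaluate().
const withoutReturn = async () => {
  // The promise is awaited and its value logged, but nothing is returned,
  // so the async function itself resolves to undefined.
  console.log(await new Promise(resolve => resolve([1, 2, 3])));
};

const withReturn = async () => {
  // Returning the awaited value makes it the result of the async function.
  return await new Promise(resolve => resolve([1, 2, 3]));
};

(async () => {
  console.log(await withoutReturn()); // logs [ 1, 2, 3 ], then undefined
  console.log(await withReturn());    // logs [ 1, 2, 3 ]
})();
```

The same rule applies inside page.evaluate(): whatever the callback returns (or resolves to) is what comes back to Node.js.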
And here's a full minimal working example for Puppeteer that will print the array [ 1, 2, 3 ]:
const puppeteer = require('puppeteer');
puppeteer.launch().then(async browser => {
    const page = await browser.newPage();
    let boxes2 = [];
    const getData = async() => {
        return await page.evaluate(async () => {
            // resolve with the data after 3 seconds, simulating slow scraping
            return await new Promise(resolve => {
                setTimeout(() => {
                    resolve([1,2,3]);
                }, 3000)
            })
        })
    }
    boxes2 = await getData();
    console.log(boxes2)
    await browser.close();
});
To pass parameters into page.evaluate() and still get a result back, this is what you need:
const results = await page.evaluate(new Function('name', "return new Promise(resolve => {resolve('done')});"), name);
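As an alternative to building the function with new Function, page.evaluate() also accepts extra arguments after the function and passes them to it in the browser. A minimal sketch, assuming `page` is a Puppeteer Page; `runWithName` is a hypothetical helper name, not part of Puppeteer:

```javascript
// Hypothetical helper: `page` is assumed to be a Puppeteer Page object.
async function runWithName(page, name) {
  // Arguments listed after the function are serialized and handed to it
  // as parameters when it runs in the browser context.
  return page.evaluate(
    n => new Promise(resolve => resolve(`done: ${n}`)),
    name
  );
}
```

With a real page this would be called as `const results = await runWithName(page, name)`. The arguments have to be serializable values (or handles), since they cross from Node.js into the browser.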
let videoURLs = await page.evaluate(async () => {
    // scroll till there's no more room to scroll or you get at least 250 boxes
    return await new Promise(resolve => {
        var scrolledHeight = 0
        var distance = 100
        var timer = setInterval(() => {
            boxes = Array.from(document.querySelectorAll("div.style-scope.ytd-item-section-renderer#contents > ytd-video-renderer > div.style-scope.ytd-video-renderer#dismissable a#video-title")).map(vid => vid.href)
            // boxes = Array.from(document.querySelectorAll("div.style-scope.ytd-item-section-renderer#contents > ytd-video-renderer > div.style-scope.ytd-video-renderer#dismissable"))
            var scrollHeight = document.documentElement.scrollHeight
            window.scrollBy(0, distance)
            scrolledHeight += distance
            if(scrolledHeight >= scrollHeight || boxes.length >= 50){
                clearInterval(timer)
                resolve(boxes)
            }
        }, 500)
    })
})
console.log(videoURLs)
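Note that the version above resolves with href strings rather than the elements themselves. That matters because values returned from page.evaluate() have to be serializable to reach Node.js, and DOM elements are not. Once scrolling is done, the extraction step could also be written with page.$$eval. A minimal sketch, assuming `page` is a Puppeteer Page; `collectVideoUrls` is a hypothetical helper name:

```javascript
// Hypothetical helper: `page` is assumed to be a Puppeteer Page object.
async function collectVideoUrls(page, selector) {
  // $$eval runs the callback in the browser with every element matching
  // the selector and sends back only the callback's return value,
  // here an array of plain strings.
  return page.$$eval(selector, anchors => anchors.map(a => a.href));
}
```

For example, `const videoURLs = await collectVideoUrls(page, 'a#video-title')` would collect the links currently rendered on the page; it does not scroll, so it would be combined with the scrolling loop rather than replace it.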