Get title of a page with cheerio

I'm trying to get the contents of the title tag of a page with cheerio, but I'm getting empty strings. This is my code:

app.get('/scrape', function(req, res){

    url = 'http://nrabinowitz.github.io/pjscrape/';

    request(url, function(error, response, html){
        if(!error){
            var $ = cheerio.load(html);

            var title, release, rating;
            var json = { title : "", release : "", rating : ""};

            $('title').filter(function(){
                //var data = $(this);
                var data = $(this);
                title = data.children().first().text();
                release = data.children().last().children().text();

                json.title = title;
                json.release = release;
            })

            $('.star-box-giga-star').filter(function(){
                var data = $(this);
                rating = data.text();

                json.rating = rating;
            })
        }


        fs.writeFile('output.json', JSON.stringify(json, null, 4), function(err){

            console.log('File successfully written! - Check your project directory for the output.json file');

        })

        // Finally, we'll just send out a message to the browser reminding you that this app does not have a UI.
        res.send('Check your console!')
    })
});

asked Apr 27 '14 17:04

2 Answers

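var request = require('request');
var cheerio = require('cheerio');

var url = 'http://nrabinowitz.github.io/pjscrape/';

request(url, function (error, response, body) {
  if (!error && response.statusCode === 200) {
    // Parse the HTML and read the text inside <title>
    var $ = cheerio.load(body);
    var title = $("title").text();
    console.log(title);
  }
});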

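This uses cheerio's text() method to read the text inside the <title> tag. The code in the question returns an empty string because <title> contains only text and no child elements, so data.children().first().text() has nothing to read; calling .text() on the title element itself returns the title.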

answered Oct 12 '22 22:10

Robert Ryan

If Robert Ryan's solution still doesn't work, I'd be suspicious of the formatting of the original page, which may be malformed somehow.

In my case I was accepting gzip and other compressed encodings but never decoding the response, so Cheerio was trying to parse compressed binary data. When I logged the original body to the console, I could see binary output instead of plain-text HTML.
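As a rough sketch of that failure mode, the snippet below compresses a made-up HTML string with Node's built-in zlib module and decodes it again before parsing. decodeBody is a hypothetical helper written for this example, not part of request or cheerio, and it assumes the body arrives as a Buffer along with the value of the Content-Encoding response header.

```javascript
const zlib = require('zlib');

// Hypothetical helper: decode a raw response body according to its
// Content-Encoding header before handing it to an HTML parser.
function decodeBody(buffer, contentEncoding) {
  switch ((contentEncoding || '').toLowerCase()) {
    case 'gzip':
      return zlib.gunzipSync(buffer).toString('utf8');
    case 'deflate':
      // Assumes zlib-wrapped deflate; some servers send raw deflate instead.
      return zlib.inflateSync(buffer).toString('utf8');
    case 'br':
      return zlib.brotliDecompressSync(buffer).toString('utf8');
    default:
      return buffer.toString('utf8');
  }
}

// Simulate a server that sends a gzip-compressed page.
const html = '<html><head><title>Example page</title></head><body></body></html>';
const compressed = zlib.gzipSync(html);

// Undecoded, the body starts with the gzip magic bytes rather than "<".
console.log(compressed.slice(0, 2)); // <Buffer 1f 8b>

// Decoded, the original HTML is back and can be passed to cheerio.load().
console.log(decodeBody(compressed, 'gzip') === html); // true
```

If you are using the request library, its gzip: true option should take care of both the Accept-Encoding header and the decoding, so a helper like this would only be needed when handling raw buffers yourself.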

answered Oct 13 '22 00:10

David Calhoun

Tags: javascript, node.js, express, cheerio

Filipe Ferminiano
