
Is there an alternative library to html5ever that takes a string and returns a queryable object? [closed]

Tags:

html

rust

I am trying to parse HTML in Rust. The one library that seems to do this is html5ever. I can't find any simple way to make it take a string and return a queryable object.

Is there an alternative library that takes a string and returns an object I can query?

I am trying to do something like web scraping here.

I am a complete Rust newbie.

Vignesh asked Feb 13 '16 06:02



1 Answer

You can use the select crate, which is essentially a wrapper over html5ever that offers a nicer API.

For example:

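use select::document::Document;
use select::predicate::Name;

// html_src_string is assumed to be a &str holding the page source.
// Recent select releases build a Document with Document::from(&str);
// older releases exposed this as Document::from_str.
let document = Document::from(html_src_string);

for node in document.find(Name("article")) {
    println!("{:?}", node.text()); // prints the text content of each article
}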

The select.rs repository has more elaborate examples.

creativcoder answered Nov 15 '22 11:11
