How can I use HTML Agility Pack to retrieve all the images from a website?

I just downloaded the HTMLAgilityPack and the documentation doesn't have any examples.

I'm looking for a way to get all the images from a web page: the address strings, not the image files themselves.

<img src="blabalbalbal.jpeg" />

I need to pull the source of each img tag. I just want to get a feel for the library and what it can offer. Everyone said this was the best tool for the job.

Edit

public void GetAllImages()
{
    WebClient x = new WebClient();
    string source = x.DownloadString(@"http://www.google.com");

    HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
    document.Load(source);

    // I can't use the Descendants method. It doesn't appear.
    var ImageURLS = document.desc
        .Select(e => e.GetAttributeValue("src", null))
        .Where(s => !String.IsNullOrEmpty(s));
}

asked Jan 21 '10 23:01

Sergio Tapia

1 Answer

You can do this using LINQ, like this:

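using System.Linq;      // Select, Where
using HtmlAgilityPack;  // HtmlWeb, HtmlDocument

// url is the address of the page to scan
var document = new HtmlWeb().Load(url);
var urls = document.DocumentNode.Descendants("img")
                                .Select(e => e.GetAttributeValue("src", null))
                                .Where(s => !String.IsNullOrEmpty(s));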

EDIT: This code now actually works; I had forgotten to write document.DocumentNode.
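
If you already have the page source as a string (as in the question, where it comes from WebClient.DownloadString), note that HtmlDocument.Load(string) treats its argument as a file path, while LoadHtml parses an HTML string. Here is a minimal sketch of that variant, which also resolves relative src values against the page address. The class name ImageUrlExtractor, the method GetImageUrls and the baseUri parameter are made up for illustration and are not part of HtmlAgilityPack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageUrlExtractor
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the src of
    // every <img> in the given HTML, resolved against baseUri.
    public static List<string> GetImageUrls(string html, Uri baseUri)
    {
        var document = new HtmlDocument();
        // LoadHtml parses an HTML string; Load(string) expects a file path.
        document.LoadHtml(html);

        return document.DocumentNode.Descendants("img")
            .Select(e => e.GetAttributeValue("src", null))
            .Where(s => !String.IsNullOrEmpty(s))
            // Turn relative paths such as "logo.png" into absolute addresses;
            // values that cannot be parsed as a URI become null.
            .Select(s => Uri.TryCreate(baseUri, s, out Uri absolute)
                ? absolute.AbsoluteUri
                : null)
            // Drop the values that could not be parsed.
            .Where(u => u != null)
            .ToList();
    }
}
```

Called as ImageUrlExtractor.GetImageUrls(source, new Uri("http://www.google.com")), this should give the same addresses as the snippet above, except that each one is absolute.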

answered Sep 19 '22 15:09

SLaks
