
Html Agility Pack - how to select correct span class

I'm trying to find lowest price on Amazon pages. Let's use this url as an example:

http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=9963BB#/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=E999-4701&rh=i%3Aaps%2Ck%3AE999-4701

I want to find the lowest price ... the number to the right of "new from".

Here's what I have tried:

        using (TextWriter tw = new StreamWriter(@"D:\AmazonUrls.txt"))
        {
            foreach (string item in list)
            {
                var webGet = new HtmlWeb();
                var document = webGet.Load(item);
                var lowestPrice = document.DocumentNode.SelectSingleNode("//span[@id='subPrice']");
                if (lowestPrice != null)
                {
                    Console.WriteLine(lowestPrice);                
                }

            }           
        }

I'm not getting any result. Where am I going wrong?

Ben Walker asked Mar 08 '26 19:03



1 Answer

Your XPath asks for span nodes with an id of subPrice, but on that page subPrice is the value of the class attribute, not the id:

        <span class="subPrice">
            <a href="http://www.amazon.com/gp/offer-listing/B001BA0W06/ref=sr_1_6_olp?ie=UTF8&qid=1334090832&sr=8-6&condition=new">5 new</a>
            from <span class="price">$245.90</span>
        </span>

so,

        var lowestPrice = document.DocumentNode.SelectSingleNode("//span[@class='subPrice']");

should get you what you want. However, the example page that you give has several nodes that match that pattern, and SelectSingleNode returns only the first of them, so you probably want to select multiple nodes and then loop through them to decide which has the lowest price.
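A minimal sketch of that loop, assuming each price sits in a span with class "price" directly inside the span with class "subPrice", as in the snippet above. LowestPriceFinder and FindLowestPrice are hypothetical names chosen for illustration, and the en-US currency format is an assumption about how the page prints prices. It reads InnerText from each node rather than printing the node object itself.

```csharp
using System;
using System.Globalization;
using HtmlAgilityPack;

public static class LowestPriceFinder
{
    // Hypothetical helper: returns the smallest price found in any
    // span.subPrice block, or null if nothing matches or parses.
    public static decimal? FindLowestPrice(HtmlDocument document)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var priceNodes = document.DocumentNode.SelectNodes(
            "//span[@class='subPrice']/span[@class='price']");
        if (priceNodes == null)
        {
            return null;
        }

        // Assumption: prices are formatted US-style, e.g. "$245.90".
        var culture = CultureInfo.GetCultureInfo("en-US");
        decimal? lowest = null;

        foreach (var node in priceNodes)
        {
            // Decode any HTML entities and trim whitespace before parsing.
            var text = HtmlEntity.DeEntitize(node.InnerText).Trim();

            // NumberStyles.Currency accepts the "$" sign and thousands separators.
            if (decimal.TryParse(text, NumberStyles.Currency, culture, out var price)
                && (lowest == null || price < lowest))
            {
                lowest = price;
            }
        }

        return lowest;
    }
}
```

Inside the original foreach loop this could be called as `var lowest = LowestPriceFinder.FindLowestPrice(webGet.Load(item));`, followed by a null check before writing the value out. Note that `@class='subPrice'` only matches when the class attribute is exactly that string; if the page adds further class names to the span, a `contains(@class, 'subPrice')` test would be needed instead.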

Adam Crossland answered Mar 11 '26 09:03
