Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excel VBA / HTML Clicking next page from dropdown

I'm writing my first data scraper using Excel and VBA. I'm stuck trying to go to the next page of a website. The source code looks as follows:

<li><a href="#" onclick="changePage(2); return false;">Page 2 of 24</a></li>

This is the VBA code I have but does not seem to work:

For Each l In ie.Document.getElementsByTagName("a")
    If l.href = "#" And l.onclick = "changePage(2); return false;" Then
        l.Item(2).Click
        Exit For
    End If
Next l

When I run the code I don't get any errors, but it doesn't seem to go to page 2. Keep in mind that there are more pages after page 2. My idea is replace "2" with a variable later and increase that variable by one. But I need to get it to work first.

Thanks to whoever can help.

like image 973
Vicman Avatar asked Feb 18 '16 23:02

Vicman


1 Answers

[Edit: I now have a solution and the code has been replaced. -RDH]

First I want to mention that if the data retrieved in this manner is used for commercial purposes or anything other than personal use then it violates 2 sections of the Kelley Blue Book (kbb.com) Terms of Service.

FYI: Sites that collect, update, and maintain data like BlueBook or the MLS take their data very seriously, and they don't like people scraping it. I was speaking to an old classmate of mine who has her degree in Computer Science and is now a real estate agent, and I mentioned to her about how cool it is to be able scrape housing data off of MLS and she nearly flipped out on me. Just saying: people were paid to create that data and people make their lives using that data. 'Nuff said. I was able to get the problem code running by creating a web page on my own server that had the same format you were looking for since I get a different version of the bluebook.com site since I am in Canada. I get redirected to kbb.com.

+++ The real problem +++

The problem is that hrefs with an # symbol are actually the full URL with the # attached to the end, and when you check the onClick event it actually contains the full function declariation, so you have to only search for partial strings.

' A good idea to declare the proper datatypes
' because IHTMLElement has the click event but IHTMLAnchorElements don't
Dim l As IHTMLElement
Dim htmlanchors As IHTMLElementCollection
' ...

Set htmlanchors = ie.Document.getElementsByTagName("a")

' Look through all the anchor tags on the page
    For Each l In htmlanchors
       ' Check to see the Href contains a # and the onclick event has specific code
        If InStr(l.href, "#") And InStr(l.onclick, "changePage(3); return false;") Then
            ' Click the current anchor link
            l.Click
            Exit For
        End If
Next l
like image 124
Rick Henderson Avatar answered Oct 31 '22 06:10

Rick Henderson