Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to automate and export HTML searchable query data into excel [closed]

A webpage I am interested in extracting data from has a table with multiple search fields. I can enter data into any of these fields and click the search button at the bottom of the table and see the results based on the information I wanted to search for.

I have multiple numbers i want to search for (around 300), instead of searching each of these individually, is there a way to automate searching the data and import the data into an excel worksheet for each number I want to search?

is it possible using an Excel macro?

like image 909
user1821321 Avatar asked Nov 30 '25 12:11

user1821321


1 Answers

You can use the MSXML and MSHTML libraries for this. This code should get you started.
Start by running this sub to add both references (you only need to run this once):

Sub addReferences()
    ActiveWorkbook.VBProject.References.AddFromGuid "{3050F1C5-98B5-11CF-BB82-00AA00BDCE0B}", 4, 0
    ActiveWorkbook.VBProject.References.AddFromGuid "{F5078F18-C551-11D3-89B9-0000F81FE221}", 6, 0
End Sub

Then edit the getCAGEValues sub to import your CAGE codes and save the resulting data (and any additional data you want from the page):

Sub getCAGEValues()
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Dim oSpan As MSHTML.HTMLGenericElement
    Dim CAGECodes() As Variant
    CAGECodes = Array("12345", "12346") 'CAGECodes is an array of your codes'
    For Each CAGECode In CAGECodes
        Set oHTMLDoc = getPage(CAGECode)
        Set oSpan = oHTMLDoc.getElementById("ctl00_cphMainPageBody_lblCompNameData") 'The id for the company name'
        MsgBox oSpan.innerText 'Save the value however you want to.'
    Next
End Sub

Function getPage(CAGECode As Variant) As MSHTML.HTMLDocument
    Dim oHttpRequest As MSXML2.XMLHTTP60
    Set oHttpRequest = New MSXML2.XMLHTTP60
    With oHttpRequest
        .Open "GET", "http://www.logisticsinformationservice.dla.mil/BINCS/details.aspx?CAGE=" & CAGECode, False
        .setRequestHeader "Cache-Control", "no-cache"
        .setRequestHeader "Pragma", "no-cache"
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
    End With
    Dim oHTMLDoc As MSHTML.HTMLDocument
    Set oHTMLDoc = New MSHTML.HTMLDocument
    oHTMLDoc.body.innerHTML = oHttpRequest.responseText
    Set getPage = oHTMLDoc
End Function
like image 199
Kevin Pope Avatar answered Dec 03 '25 04:12

Kevin Pope



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!