Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excel crashes when attempting to inspect DispStaticNodeList

Situation:

I am attempting to inspect a variable a, showing as a DispStaticNodeList object in the locals window; Excel crashes every time I try to do this.

Here is the variable a, apparently of type DispStaticNodeList, in the locals window:

Locals window

Reproducing the Excel crash:

  1. Attempting to expand the item in the Locals Window causes Excel to crash.
  2. Attempting to loop over it with a For Each also causes a crash.*TestFail

Research highlights:

  1. I did some digging around, searching for combinations such as Excel + Crash + DispStaticNodeList yielded zero results; At least with the Google search terms I used. Pretty sure my Google-Fu was weak.
  2. If I believe this article I am dealing with a COM object that is supported by MSHTML.

  3. And according to this:

If the name is DispStaticNodeList, we can be pretty sure it's an array..(or at least has array semantics).

Based on point 3 I wrote the code, TestPass, below, which does loop over it successfully, but I don't fully understand why. I have set an object and then looped its len!

  1. I have just found this which states:

NodeList objects are collections of nodes such as those returned by properties such as Node.childNodes and the document.querySelectorAll() method.

So it seems the object may be a NodeList, which given the description in the immediate window seems about right, and as a list I can loop over its length, but not sure why For Each won't work and why Excel crashes. Colleague suggests it may crash due to hierarchical nature of data. I further note that there are classes called IDOMNodeIterator and NodeIterator, but I am not sure if I can use those in line with the descriptions for NodeList methods here.

Question:

What is a and why does it cause Excel to crash when attempting to inspect or loop with a For Each?

Code that loops successfully:

Option Explicit

Public Sub TestPass()
    Dim html As HTMLDocument
    Set html = GetTestHTML
    Dim a As Object, b As Object

    Set a = html.querySelectorAll("div.intro p")

    Dim i As Long

    For i = 0 To Len(a) -1
        On Error Resume Next
        Debug.Print a(i).innerText    '<== HTMLParaElement
        On Error GoTo 0
    Next i
End Sub

Public Function GetTestHTML(Optional ByVal url As String = "https://www.w3schools.com/cssref/trysel.asp") As HTMLDocument
    Dim http As New XMLHTTP60
    Dim html As New HTMLDocument
    With http                                    'Set http = CreateObject("MSXML2.XMLHttp60")
        .Open "GET", url, False
        .send
        html.body.innerHTML = .responseText
        Set GetTestHTML = html
    End With
End Function

*TestFail Code that causes crash:

Public Sub TestFail()
    Dim html As HTMLDocument
    Set html = GetTestHTML
    Dim a As Object, b As Object

    Set a = html.querySelectorAll("div.intro p")

    For Each b In a

    Next b
End Sub

Notes:

I sent a test workbook to a colleague who was also able to reproduce this behaviour with the example given.

Project References:

Project references

Sample of HTML (link was also provided)

<div class="noSel">
<h1 style=""><span class="markup">&lt;h1&gt;</span>Welcome to My Homepage<span class="markup">&lt;/h1&gt;</span></h1>

<div id="helpIntro" style="">
<span class="markup">&lt;div class="intro"&gt;</span>
<div class="intro">
<p style="margin-top: 4px; border-color: rgb(255, 102, 102); background-color: rgb(255, 255, 153);"><span class="markup">&lt;p&gt;</span>My name is Donald <span id="Lastname" style=""><span class="markup">&lt;span id="Lastname"&gt;</span>Duck.<span class="markup">&lt;/span&gt;</span></span><span class="markup">&lt;/p&gt;</span></p>
<p id="my-Address" style="border-color: rgb(255, 102, 102); background-color: rgb(255, 255, 153);"><span class="markup">&lt;p id="my-Address"&gt;</span>I live in Duckburg<span class="markup">&lt;/p&gt;</span></p>
<p style="margin-bottom: 4px; border-color: rgb(255, 102, 102); background-color: rgb(255, 255, 153);"><span class="markup">&lt;p&gt;</span>I have many friends:<span class="markup">&lt;/p&gt;</span></p>
</div>
<span class="markup">&lt;/div&gt;</span>
</div>

<br>
<div class="helpUl">
<span class="markup">&lt;ul id="Listfriends&gt;</span>
<ul id="Listfriends" style="margin-top:0px;margin-bottom:0px;">

<li><span class="markup">&lt;li&gt;</span>Goofy<span class="markup">&lt;/li&gt;</span></li>
<li><span class="markup">&lt;li&gt;</span>Mickey<span class="markup">&lt;/li&gt;</span></li>
<li><span class="markup">&lt;li&gt;</span>Daisy<span class="markup">&lt;/li&gt;</span></li>
<li><span class="markup">&lt;li&gt;</span>Pluto<span class="markup">&lt;/li&gt;</span></li>
</ul>       
<span class="markup">&lt;/ul&gt;</span>
</div>

<ul style="display:none;"></ul>
<p style=""><span class="markup">&lt;p&gt;</span>All my friends are great!<span class="markup">&lt;br&gt;</span><br>But I really like Daisy!!<span class="markup">&lt;/p&gt;</span></p>

<p lang="it" title="Hello beautiful" style=""><span class="markup">&lt;p lang="it" title="Hello beautiful"&gt;</span>Ciao bella<span class="markup">&lt;/p&gt;</span></p>

Edit: I have also been able to loop in the following fashion:

Public Sub Test()
    Dim html As MSHTML.HTMLDocument, i As Long
    Set html = GetTestHTML

    For i = 0 To html.querySelectorAll("div.intro p").Length - 1
       Debug.Print html.querySelectorAll("div.intro p")(i).innerText
    Next i

End Sub
like image 785
QHarr Avatar asked Oct 29 '22 13:10

QHarr


1 Answers

If the name is DispStaticNodeList, we can be pretty sure it's an array..(or at least has array semantics).

Arrays can normally be iterated with a For Each loop, however it's more efficient to iterate them using a For loop. Looks like what you're getting isn't exactly an array, and while it appears to support indexing, it apparently doesn't support enumeration, which would explain the blowing up when you attempt to enumerate it with a For Each loop.

Looks like the locals toolwindow might be using For Each semantics to list the items in the collection.

I'm not familiar with that particular library so this is a bit of (educated) guesswork, but it's pretty easy to make a custom COM collection type that can't be iterated with a For Each loop in VBA - normally the error is caught on the VBA side though... Seems there might be a bug in the library's enumerator implementation (assuming there's an enumerator for it) causing it to throw an exception that ends up unhandled and somehow takes everything down with it... thing is, you can't fix & recompile that library... so the only thing you can do is to avoid iterating that type with a For Each loop, and avoid expanding it in the locals toolwindow (and so, ...save your work often!).

This article gives a good idea from a C#/.NET perspective, of how COM enumeration works. Of course that library isn't managed code (.NET), but the COM concepts at play are the same.

TL;DR: it's not because you can For...Next that you can For Each; the COM type involved must explicitly support enumeration. If the VBA code compiles with a For Each loop, then it does, so it must be a bug in the type's enumerator.

like image 118
Mathieu Guindon Avatar answered Nov 12 '22 13:11

Mathieu Guindon