Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Copying an array to Range.Value2 SafeArray.pvData successful, but Excel fails to update

Tags:

arrays

excel

vba

EDIT

Indeed, there is no direct way of editing the range values in memory. Thanks @AndASM for the detailed answer and Carl; good hunch, it was spot-on. I must have been too tangled in all the reversing at that moment, forgetting Value2 is just a property.

Meanwhile, I delved a bit deeper with some other tests and debugging with OllyDbg and found a few interesting things:

  1. Cells are arranged in 16 x 1024 areas. The structure holding the areas may very well be the worksheet, but I can't confirm yet;
  2. Each time the Value property is invoked, absolute sheet offsets (row, column) are used to find the corresponding area and then for some indexing inside the area to get the actual value;
  3. A 2D SAFEARRAY of type VARIANT is created;
  4. Values are not retrieved in a contiguous block, but individually. This means that each "point" in a range (row,col) is sent to the indexing procedure to return a value (variant, obviously) for its corresponding SAFEARRAY element;
  5. As a consequence of the above, each time you retrieve a value via Range.Value2(row,col), the whole process is reiterated for all the values in the range. Imagine the performance hit if you do this multiple times inside a procedure or, even worse, inside a loop. Just don't; you're better off creating a copy of Value2 and addressing it via indexing;
  6. Last, yet not least, the distribution of values inside SAFEARRAY.pvData is column-based (col,row), not row-based, which may be found counter-intuitive and at odds with the VBA indexing mode, which is (row,col). This may come in handy if you need accessing pvData directly in memory and keeping dimension coherence. As an example, a range like the one below

        1, 2, 3, 4  
        5, 6, 7, 8
    

    would be stored in pvData in the following order:

        1, 5, 2, 6, 3, 7, 4, 8
    

I hope the above helps. To sum up, in absence of any such exported function in Excel, the best way around is creating a copy of Value2, sorting / manipulating it towards the desired outcome and assigning it back to the range property.


I have recently finished a variation of QuickSort and intend to implement in for Excel. The algorithm is effective and would really bring value as an add-in, if not for the extra time spent on putting the array values into the range. Transposing only works for less than 65537, while the 'paste variant-array to range takes too long on large sorts.

So, I wrote a few procedures that would allow copying the 2D values from the range into a 1D array (1D is needed for sorting) and (after the sorting is done) putting them back, all based on SAFEARRAY and MemCopy(RtlMoveMemory) and, alternately, WriteProcessMemory.

All works well, as far as memory operations are concerned: - the range values are copied to the array (from one SafeArray.pvData to the other); - the array values (after running the sorting algo) are successfully copied to the Range.Value2 SafeArray.pvData.

Still, the range does not update, as it seems to flip back to the old values (more on that in the code below). Why would "Range.Value2 = SomeOther2dArray" work and not modifying the data directly in memory? I have a feeling I am missing something here. Is a formula sort / update needed, as well?

Here is the main procedure:

    Public Sub XLSORT_Array2()
    With Application
        screenUpdateState = .ScreenUpdating
        statusBarState = .DisplayStatusBar
        calcState = .Calculation
        eventsState = .EnableEvents

        .ScreenUpdating = False
        .DisplayStatusBar = False
        .Calculation = xlCalculationManual
        .EnableEvents = False
    End With

    Dim rngSort As Range
    Dim arrSort() As Variant
    Dim arrTemp As Variant
    Dim i As Long
    Dim dblTime As Double
    Dim dblInitTime As Double: dblInitTime = Timer

    Set rngSort = Selection

    If Not rngSort Is Nothing Then
        If rngSort.Cells.Count > 1 And rngSort.Areas.Count = 1 Then
            dblTime = Timer
            ReDim arrSort(1 To rngSort.Cells.Count)
            Debug.Print Timer - dblTime & vbTab & "(Redim)"

            'just testing Excel memory location
            'Debug.Print VarPtr(rngSort.Value2(1, 1))

            dblTime = Timer
            SA_Duplicate arrSort, rngSort.Value2
            Debug.Print Timer - dblTime & vbTab & "(Copy)"

            dblTime = Timer
            SORTVAR_QSWrapper arrSort, 1, rngSort.Cells.Count
            Debug.Print Timer - dblTime & vbTab & "(Sort)"

            'this would be the fastest method
            'variants are copied to memory
            'yet the range does not update with the new values
            SA_Duplicate rngSort.Value2, arrSort

            'dblTime = Timer
            'looping = too slow
            'For i = 1 To rngSort.Cells.Count
            '    rngSort.Cells(i).Value = arrSort(i)
            'Next

            'this works, but it's too slow, as well
            'If rngSort.Cells.Count > 65536 Then
            '    ReDim arrTemp(LBound(rngSort.Value2, 1) To UBound(rngSort.Value2, 1), LBound(rngSort.Value2, 2) To UBound(rngSort.Value2, 2))
            '    SA_Duplicate arrTemp, arrSort
            '    rngSort.Value2 = arrTemp
            'Else
            '    rngSort.Value2 = WorksheetFunction.Transpose(arrSort)
            '    Debug.Print "Transposed"
            'End If
            'Debug.Print Timer - dblTime & vbTab & "(Paste)"
        End If

    End If

    With Application
        .ScreenUpdating = screenUpdateState
        .DisplayStatusBar = statusBarState
        .Calculation = calcState
        .EnableEvents = eventsState
    End With

    Debug.Print VarPtr(rngSort.Value2(1, 1)) & vbTab & Mem_ReadHex(ByVal VarPtr(rngSort.Value2(1, 1)), rngSort.Cells.Count * 16)
    Set rngSort = Nothing
    Debug.Print Timer - dblInitTime & vbTab & "(Total Time)" & vbNewLine
End Sub

Let's say the values in the range are 4, 3, 2, and 1. Before SA_Duplicate arrSort, rngSort.Value2 the memory reads:

130836704   05000000 00000000 00000000 00001040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00000040 05000000 00000000 00000000 0000F03F 
129997032   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

where 130836704 is Range.Value2 SafeArray.pvData and 129997032 is SortArray SafeArray.pvData. Each 16-byte batch represents the variant actual data, as read from memory (no LE translation, just in hex), with the first 2 bytes indicating the VarType. In this case, vbDouble.

After the copy, as expected, the memory reads:

130836704   05000000 00000000 00000000 00001040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00000040 05000000 00000000 00000000 0000F03F 
129997032   05000000 00000000 00000000 00001040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00000040 05000000 00000000 00000000 0000F03F

After the sort is complete, the SortArray SafeArray.pvData reads:

129997032   05000000 00000000 00000000 0000F03F 05000000 00000000 00000000 00000040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00001040

After executing SA_Duplicate rngSort.Value2, arrSort, the memory shows that the Range.Value2 SafeArray.pvData has been updated:

129997032   05000000 00000000 00000000 0000F03F 05000000 00000000 00000000 00000040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00001040
130836704   05000000 00000000 00000000 0000F03F 05000000 00000000 00000000 00000040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00001040

All looking fine so far, except that Debug.Print VarPtr(rngSort.Value2(1, 1)) & vbTab & Mem_ReadHex[...] shows that the values flipped back to the initial order:

130836704   05000000 00000000 00000000 00001040 05000000 00000000 00000000 00000840 05000000 00000000 00000000 00000040 05000000 00000000 00000000 0000F03F

Please share any thoughts or methods you found effective. Any help is appreciated. It's frustrating having to wait for Excel about 4 seconds ( sorting 1,000,000 + cells), when even the most challenging sort takes less than that.

Thanks in advance!

like image 371
VBA For Anything Avatar asked Dec 01 '15 22:12

VBA For Anything


1 Answers

Well, you didn't provide a working implementation of several vital parts, especially SA_Duplicate, so this is mostly guess work. But, I think the answer is likely quite simple.

Range.Value2 is a property, not a variable. So behind the scenes it is actually two functions, lets call them Range.let_Value2 and Range.get_Value2.

That said, how do you expect the call SA_Duplicate rngSort.Value2, arrSort to work? Because what I see is SA_Duplicate rngSort.get_Value2, arrSort. I would assume that rngSort.get_Value2 is creating a new SafeArray, then copying the data from Excel's internal data structures into that SafeArray. If I'm correct, you're then writing your data to a temporary buffer that VBA later discards and Excel has already forgotten about.

You need to use rngSort.let_Value2 arrSort which is usually called as rngSort.Value2 = arrSort.

As a side note, if get_Value2 allocates a new array as I think it does, both the SA_Duplicate calls are unnecessary and you might be able to sort the returned array in place. Do remember to pass it to let_Value2 by assigning the array variable back to the property when you are done.

like image 166
AndASM Avatar answered Nov 02 '22 08:11

AndASM