Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Filtering 2D Arrays in Excel VBA

Using Excel and VBA, I wanted some advice on how to best filter data in an array (in the same sort of way one might use a pivot table) strictly using VBA. I am creating a UserForm that is going to make some data decisions based on currently existing data. I can visualize how to do it well enough but am not that versed in VBA programming.

Here is an example

A       B       C
bob     12      Small
sam     16      Large
sally   1346    Large
sam     13      Small
sally   65      Medium
bob     1       Medium

To grab the data in an Array, I could use

Dim my_array As Variant

my_array = Range("A1").CurrentRegion

Now, I am familiar with looping through 2D arrays, but I wondered: what the most effective way to filter 2D array data (without looping through the array time and again)?

For example, how do I get would be to say get this kind of data:

data_for_sally As Variant 'rows with sally as name in ColA
data_for_sally_less_than_ten As Variant ' all rows with sally's name in ColA and colB < 10
data_for_all_mediums as Variant ' all rows where ColC is Medium

Suggestions? I could work this out with a bunch of custom functions and loops but I thought there must be a better way. Thanks.

like image 655
thornomad Avatar asked May 04 '12 14:05

thornomad


2 Answers

I assume you want to use VBA only.

I think it depends on several parameters, mainly on:

  • how often you run the same condition => do you store the result of a filter or do you recalculate every time?
  • how often you need to filter stuff => if often, it is worth having a proper code structure in place, if not then a one off loop is clearly the way to go.

From an OO perspective, assuming performance (speed & memory) is not an issue, I would go for the following design (I won't go into the details of the implementation, only give the general idea). Create a class (let's call it imaginatively ArrayFilter) that you could use like this.

Setup the filter

Dim filter As New ArrayFilter
With filter
    .name = "sam"
    .category = "Medium"
    .maxValue = 10
End With

Or

filter.add(1, "sam") 'column 1
filter.add(3, "Medium") 'column 3
filter.addMax(2, 10) 'column 2

Create the filtered data set

filteredArray = getFilteredArray(originalArray, filter)

The getFilteredArray is fairly straightforward to write: you loop over the array checking if the values match the filter and put the valid lines in a new array:

If filter.isValidLine(originalArray, lineNumber) Then 'append to new array

Pros

  • Clean design
  • Reusable, especially with the second version where you use the column number. This can be used to filter any arrays really.
  • Filtering code is in one function that you can test
  • Corollary: avoid duplication of code

Cons

  • Filtering is recalculated every time, even if you use the same filter twice. You can store the results in a Dictionary for example - see below.
  • Memory: every call to the getFilteredArray creates a new array, but not sure how this can be avoided anyway
  • This adds quite a few lines of code, so I would do it only if it helps make the code easier to read / maintain.

ps: If you need to cache the results to improve performance, one way would be to store the results in a dictionary and add some logic to the getFilteredArray function. Note that unless your arrays are really big and/or you run the same filter a lot, this is probably not worth it.

filters.add filter, filteredArray 'filters is a dictionary

That way, when you call getFilteredArray next time, you can do something like this:

For each f in filters
    'Check if all conditions in f and newFilter are the same
    'If they are:
    getFilteredArray = filters(f)
    Exit Function
Next

'Not found in cache: compute the result
like image 75
assylias Avatar answered Sep 28 '22 19:09

assylias


Try this

' credited to ndu
Function Filter2DArray(ByVal sArray, ByVal ColIndex As Long, ByVal FindStr As String, ByVal HasTitle As Boolean)
  Dim tmpArr, i As Long, j As Long, Arr, Dic, TmpStr, Tmp, Chk As Boolean, TmpVal As Double
  On Error Resume Next
  Set Dic = CreateObject("Scripting.Dictionary")
  tmpArr = sArray
  ColIndex = ColIndex + LBound(tmpArr, 2) - 1
  Chk = (InStr("><=", Left(FindStr, 1)) > 0)
  For i = LBound(tmpArr, 1) - HasTitle To UBound(tmpArr, 1)
    If Chk Then
      TmpVal = CDbl(tmpArr(i, ColIndex))
      If Evaluate(TmpVal & FindStr) Then Dic.Add i, ""
    Else
      If UCase(tmpArr(i, ColIndex)) Like UCase(FindStr) Then Dic.Add i, "" 'This finds only exact matches, if you need *FindStr* use:  If UCase(tmpArr(i, ColIndex)) Like UCase("*" & FindStr & "*") Then Dic.Add i, ""

    End If
  Next
  If Dic.Count > 0 Then
    Tmp = Dic.Keys
    ReDim Arr(LBound(tmpArr, 1) To UBound(Tmp) + LBound(tmpArr, 1) - HasTitle, LBound(tmpArr, 2) To UBound(tmpArr, 2))
    For i = LBound(tmpArr, 1) - HasTitle To UBound(Tmp) + LBound(tmpArr, 1) - HasTitle
      For j = LBound(tmpArr, 2) To UBound(tmpArr, 2)
        Arr(i, j) = tmpArr(Tmp(i - LBound(tmpArr, 1) + HasTitle), j)
      Next
    Next
    If HasTitle Then
      For j = LBound(tmpArr, 2) To UBound(tmpArr, 2)
        Arr(LBound(tmpArr, 1), j) = tmpArr(LBound(tmpArr, 1), j)
      Next
    End If
  End If
  Filter2DArray = Arr
End Function
like image 37
Đức Thanh Nguyễn Avatar answered Sep 28 '22 17:09

Đức Thanh Nguyễn