I have a spreadsheet that I've been working on for over a month to sort and optimize coordinates (sometimes exceeding 100,000 rows) and it is UNBEARABLY slow once I start importing files over 5,000 rows (it has taken several hours to complete the calculations and sorting process on data sets over 25,000 rows). The processing time grows exponentially with the number of coordinates imported. I've researched Stack Overflow to help me with some of the code and included a few safety nets for error handling and exiting the sub if there is no data.
The bit of code I'm using to actually sort the coordinates to find nearest neighbor coordinates and that I need help with is under the remark
' Sort coordinates in Point List Data looking for shortest distance between points
, located at approximately line 58 of 109 in my code below.
Simple quadrant coordinates (X, Y and Z) are in columns H, I , and J respectively, starting at row 6. The named range is PosXYZ and the formula for this named range is:
=INDEX(Optimizer!$H:$H, ROW(Optimizer!$H$5) + 1):INDEX(Optimizer!$L:$L, MATCH(bignum, Optimizer!$I:$I)).
bignum is defined as =1E+307*17.9769313486231
.
Column K
is populated with the Pythagorean Theorem to calculate the distance between the current data point X,Y and the previous data point X,Y in the list.
Column L
is populated with a list of sequential row numbers created when the data is imported so that the original sort order of the data can be restored using a separate piece of VBA code.
I'm trying to see if using an array would greatly speed up the time it takes to run this point list optimizer, and I'm hoping that someone might be able to help me figure out how to get this portion of my code to run exponentially faster.
I found the following similar question, and I'm wondering if this approach is something that I could use to help speed up my processing time: How do you speed up VBA code with a named range?
I've learned a lot from this site, and I'm hoping that someone has the patience and the knowledge to help me figure this one out. I don't have a lot of experience using arrays in VBA.
Sample Excel file with 2904 data points and VBA code can be found here.
Sub Optimize_PL()
' Add an error handler
On Error GoTo ErrorHandler
' Speed up sub-routine by turning off screen updating and auto calculating until the end of the sub-routine
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' Declare variable names and types
Dim rInp As Range
Dim rTmp As Range
Dim i As Long
Dim n As Long
Dim sFrm As String
Dim PosX As String
Dim PosY As String
Dim PosZ As String
Dim SortOrder As String
Dim LastRow As Long
Dim hLastRow As Long
Dim lLastRow As Long
' Find number of populated cells in Column H and Column L (not including the 5 column header rows)
hLastRow = Cells(Rows.Count, "H").End(xlUp).Row - 5
lLastRow = Cells(Rows.Count, "L").End(xlUp).Row - 5
' Check for existing Point List Data to avoid error
If hLastRow < 2 Then
MsgBox "Not enough data points are available to optimize." & vbNewLine & _
"" & vbNewLine & _
"Column H populated rows: " & hLastRow, vbInformation, "Error Message"
GoTo ErrorHandler
ElseIf lLastRow < 2 Then
MsgBox "Original sort order row numbers not available in Column L," & vbNewLine & _
"" & vbNewLine & _
"Original sort order canot be restored without Row # data." & vbNewLine & _
"Column L populated rows: " & lLastRow, vbInformation, "Error Message"
Err.Number = 0
GoTo ErrorHandler
ElseIf hLastRow <> lLastRow Then
MsgBox "The number of rows of coordinate data does not match the" & vbNewLine & _
"number of rows in the Row # column. There is no way to" & vbNewLine & _
"restore the original sort order." & vbNewLine & _
"" & vbNewLine & _
"Column H populated rows: " & hLastRow & vbNewLine & _
"Column L populated rows: " & lLastRow, vbInformation, "Error Message"
Err.Number = 0
GoTo ErrorHandler
End If
' Timer Start (calculate the length of time this VBA code takes to complete)
StartTime = Timer
' Sort coordinates in Point List Data looking for shortest distance between points
Set rInp = Range("PosXYZ").Resize(, 4)
n = rInp.Rows.Count
i = 0
For i = 1 To n - 1
Application.StatusBar = i + 1 & " of " & n & " Calculating for " & SecondsElapsed & " seconds" & " Estimated Time Remaining: " & TimeRemaining & " seconds"
SecondsElapsed = Round(Timer - StartTime) ' Change to StartTime, 2) to display seconds two decimal places out
TimeRemaining = Round((SecondsElapsed / (i + 1)) * (n - (i + 1))) ' Change to i + 1)),2) to display seconds two decimal places out
Set rTmp = rInp.Offset(i).Resize(n - i, 5)
With rTmp
PosX = .Cells(0, 1).Address(ReferenceStyle:=xlR1C1)
PosY = .Cells(0, 2).Address(ReferenceStyle:=xlR1C1)
PosZ = .Cells(0, 3).Address(ReferenceStyle:=xlR1C1)
SortOrder = .Cells(0, 5).Address(ReferenceStyle:=xlR1C1)
sFrm = Replace(Replace(Replace(Replace("=SQRT((RC[-3] - PosX)^2 + (RC[-2] - PosY)^2)", "PosX", PosX), "PosY", PosY), "PosZ", PosZ), "SortOrder", SortOrder)
sFrm = Replace(Replace(Replace(Replace(sFrm, "PosX", PosX), "PosY", PosY), "PosZ", PosZ), "SortOrder", SortOrder)
.Columns(4).FormulaR1C1 = sFrm
.Sort Key1:=.Range("D1"), Header:=xlNo
End With
Next i
' Timer Stop (calculate the length of time this VBA code took to complete)
SecondsElapsed = Round(Timer - StartTime, 2)
' Turn screen updating and auto calculating back on since file processing is now complete
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
' Message to report VBA code processing time after file selection and number of data rows imported
MsgBox "Calculated optimized travel path between coordinates in " & vbNewLine & _
"" & vbNewLine & _
" " & SecondsElapsed & " seconds"
' Reset to defaults in the event of a processing error during the sub-routine execution
ErrorHandler:
Application.StatusBar = True
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
If Err.Number <> 0 Then
' Display a message to the user including the error code in the event of an error during execution
MsgBox "An error number " & Err.Number & " was encountered!" & vbNewLine & _
"Part or all of this VBA code was not completed.", vbInformation, "Error Message"
End If
End Sub
Yes you can speed up this code a lot using arrays: the code below is approx 20 times faster.
Sub Optimize_PL2()
' Add an error handler
On Error GoTo ErrorHandler
' Speed up sub-routine by turning off screen updating and auto calculating until the end of the sub-routine
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' Define variable names and types
Dim i As Long
Dim j As Long
Dim k As Long
Dim hLastRow As Long
Dim lLastRow As Long
Dim varData As Variant
Dim dData() As Double
Dim dResult() As Double
Dim jRow() As Long
Dim dThisDist As Double
Dim dSmallDist As Double
Dim jSmallRow As Long
' Find number of populated cells in Column H and Column L (not including the 5 column header rows)
hLastRow = Cells(Rows.Count, "H").End(xlUp).Row - 5
lLastRow = Cells(Rows.Count, "L").End(xlUp).Row - 5
' Check for existing Point List Data to avoid error
If hLastRow < 2 Then
MsgBox "Not enough data points are available to optimize." & vbNewLine & _
"" & vbNewLine & _
"Column H populated rows: " & hLastRow, vbInformation, "Error Message"
GoTo ErrorHandler
ElseIf lLastRow < 2 Then
MsgBox "Original sort order row numbers not available in Column L," & vbNewLine & _
"" & vbNewLine & _
"Original sort order canot be restored without Row # data." & vbNewLine & _
"Column L populated rows: " & lLastRow, vbInformation, "Error Message"
Err.Number = 0
GoTo ErrorHandler
ElseIf hLastRow <> lLastRow Then
MsgBox "The number of rows of coordinate data does not match the" & vbNewLine & _
"number of rows in the Row # column. There is no way to" & vbNewLine & _
"restore the original sort order." & vbNewLine & _
"" & vbNewLine & _
"Column H populated rows: " & hLastRow & vbNewLine & _
"Column L populated rows: " & lLastRow, vbInformation, "Error Message"
Err.Number = 0
GoTo ErrorHandler
End If
On Error GoTo 0
' Timer Start (calculate the length of time this VBA code takes to complete)
StartTime = Timer
varData = Worksheets("Optimizer").Range("H6").Resize(hLastRow, 5).Value2
ReDim dResult(1 To hLastRow, 1 To 5) As Double
ReDim dData(1 To hLastRow, 1 To 5) As Double
'
' copy vardata into data coercing to double
' (repeated arithmetic is faster on doubles than variants)
'
For j = LBound(varData) To UBound(varData)
For k = LBound(varData, 2) To UBound(varData, 2)
dData(j, k) = CDbl(varData(j, k))
If j = 1 Then
dResult(j, k) = dData(j, k)
End If
Next k
Next j
'
' look for shortest distance row
'
For i = LBound(dResult) To UBound(dResult) - 1
'
' calc distance from this row to all remaining rows and find shortest
'
jSmallRow = -1
dSmallDist = 1 * 10 ^ 307
For j = 2 To UBound(dData)
If dData(j, 3) > -1 And j <> i Then
dThisDist = Sqr((dResult(i, 1) - dData(j, 1)) ^ 2 + (dResult(i, 2) - dData(j, 2)) ^ 2)
If dThisDist < dSmallDist Then
jSmallRow = j
dSmallDist = dThisDist
End If
End If
Next j
'
' copy jsmallrow row to i+1
'
If jSmallRow > -1 Then
For k = 1 To 2
dResult(i + 1, k) = dData(jSmallRow, k)
Next k
dResult(i + 1, 4) = dSmallDist
dResult(i + 1, 5) = jSmallRow
'
' set smallrow so it does not get used again
'
dData(jSmallRow, 3) = -1
End If
Next i
'
' put data back on sheet
'
Worksheets("Optimizer").Range("H6").Resize(hLastRow, 5).Value2 = dResult
' Timer Stop (calculate the length of time this VBA code took to complete)
SecondsElapsed = Round(Timer - StartTime, 2)
' Turn screen updating and auto calculating back on since file processing is now complete
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
' Message to report VBA code processing time after file selection and number of data rows imported
MsgBox "Calculated optimized travel path between coordinates in " & vbNewLine & _
"" & vbNewLine & _
" " & SecondsElapsed & " seconds"
' Reset to defaults in the event of a processing error during the sub-routine execution
ErrorHandler:
Application.StatusBar = True
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
If Err.Number <> 0 Then
' Display a message to the user including the error code in the event of an error during execution
MsgBox "An error number " & Err.Number & " was encountered!" & vbNewLine & _
"Part or all of this VBA code was not completed.", vbInformation, "Error Message"
End If
End Sub
Aside from switching to a different algorihm (e.g. the k-d tree), here are a few things that will speed up the code:
Updated code:
Const HeaderRow = 5
Set rInp = Range(Cells(HeaderRow + 1, 8), Cells(hLastRow, 11))
n = rInp.Rows.Count
For i = 1 To n - 1
If i Mod 100 = 0 Then
Application.StatusBar = i + 1 & " of " & n & " Calculating for " & SecondsElapsed & " seconds" & " Estimated Time Remaining: " & TimeRemaining & " seconds"
SecondsElapsed = Round(Timer - StartTime) ' Change to StartTime, 2) to display seconds two decimal places out
TimeRemaining = Round((SecondsElapsed / (i + 1)) * (n - (i + 1))) ' Change to i + 1)),2) to display seconds two decimal places out
End If
Set rTmp = rInp.Offset(i).Resize(n - i, 5)
With rTmp
Dim TargetRow As Long
TargetRow = HeaderRow + i
sFrm = "=SQRT((RC[-3] - R" & TargetRow & "C[-3])^2 + (RC[-2] - R" & TargetRow & "C[-2])^2)"
With .Columns(4)
.FormulaR1C1 = sFrm
.Calculate
.Value = .Value
End With
.Sort Key1:=.Range("D1"), Header:=xlNo
End With
Next i
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With