Well, not really RANDBETWEEN(). I'm trying to create a UDF that returns the index of a number within an array, where the larger the number, the more likely it is to be chosen.
I know how to assign probabilities to random numbers in a worksheet (i.e. using MATCH() on the sum of the probabilities; there's plenty of material on SO explaining that), but I want a UDF because I'm passing a special input array into the function, not just a selected range.
My issue is that the weighting is off: numbers later in the array are more likely to be returned than those earlier in the array, and I can't see where my code has gone wrong. Here's the UDF so far:
Public Function PROBABLE(ParamArray inputArray() As Variant) As Long
    'Takes a set of relative or absolute probabilities and ranks a random number within them
    Application.Volatile (True)
    Dim outputArray() As Variant
    Dim scalar As Single
    Dim rankNum As Single
    Dim runningTot As Single
    Dim i As Long
    '''''
    'Here I take inputArray() and convert to outputArray(),
    'which is fed into the probability code below
    '''''
    scalar = 1 / WorksheetFunction.Sum(outputArray)
    rankNum = Rnd()
    runningTot = 0
    For i = 0 To UBound(outputArray)
        runningTot = runningTot + outputArray(i)
        If runningTot * scalar >= rankNum Then
            PROBABLE = i + 1
            Exit Function
        End If
    Next i
End Function
The function should look at the relative sizes of the numbers in outputArray() and pick randomly, but weighted towards the larger numbers.
E.g. an outputArray() of {1,0,0,1} should assign probabilities of {50%,0%,0%,50%} respectively.
However, when I tested that outputArray() for 1000 samples and 100 iterations, and graphed how frequently item 1 or item 4 in the array was returned, I got this result:
Approximately a 20%:80% distribution. Plotting {1,1,1,1} (all should have an equal chance) gave a 10%:20%:30%:40% distribution.
I know I'm missing something obvious but I can't tell what. Any help?
Some people were asking for the complete code, so here it is.
Public Function PROBABLE(ParamArray inputArray() As Variant) As Long
    'Takes a set of relative or absolute probabilities and ranks a random number within them
    Application.Volatile (True) 'added some dimensions up here
    Dim outputArray() As Variant
    Dim inElement As Variant
    Dim subcell As Variant
    Dim scalar As Single
    Dim rankNum As Single
    Dim runningTot As Single
    Dim i As Long
    'convert ranges to values
    'creating a new array from the mixture of ranges and values in the input array
    ''''
    'This is where I create outputArray() from inputArray()
    ''''
    ReDim outputArray(0)
    For Each inElement In inputArray
        'Normal values get copied from the input UDF to an output array, ranges get split up then appended
        If TypeName(inElement) = "Range" Or TypeName(inElement) = "Variant()" Then
            For Each subcell In inElement
                outputArray(UBound(outputArray)) = subcell
                ReDim Preserve outputArray(UBound(outputArray) + 1)
            Next subcell
        'Stick the element on the end of an output array
        Else
            outputArray(UBound(outputArray)) = inElement
            ReDim Preserve outputArray(UBound(outputArray) + 1)
        End If
    Next inElement
    'Drop the spare empty slot left over from the last append
    ReDim Preserve outputArray(UBound(outputArray) - 1)
    ''''
    'End of new code, the rest is as before
    ''''
    scalar = 1 / WorksheetFunction.Sum(outputArray)
    rankNum = Rnd()
    runningTot = 0
    For i = 0 To UBound(outputArray)
        runningTot = runningTot + outputArray(i)
        If runningTot * scalar >= rankNum Then
            PROBABLE = i + 1
            Exit Function
        End If
    Next i
End Function
The initial inputArray() 🡒 outputArray() section is used to standardise different input methods, i.e. the user can enter a mixture of values, cell references/ranges and arrays, and the function can cope. E.g. {=PROBABLE(A1,5,B1:C15,IF(ISTEXT(D1:D3),LEN(D1:D3),0))} (you get the picture) should work just as well as =PROBABLE(A1:A3). I cycle through the sub-elements of inputArray() and put them in my outputArray(). I'm fairly certain there's nothing wrong with this portion of the code.
Then, to get my results, I copied the UDF into A1:A1000 and used a COUNTIF(A1:A1000,1) (or, instead of counting 1, counted 2, 3, 4 etc. for each of the possible UDF outputs), and made a short macro to recalculate the sheet 100 times, each time copying the result of the COUNTIF into a table to graph. I can't say precisely how I did that because I left all of this at work, but I'll update on Monday.
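For anyone wanting to repeat this kind of check without worksheet formulas, here is a minimal sketch that tallies the results entirely in VBA. WeightedIndex and TallyWeightedIndex are hypothetical names, not part of the UDF above, and the sampler is a simplified stand-in for the cumulative-sum loop that assumes non-negative weights with at least one positive entry.

```vba
Option Explicit

' Hypothetical helper: returns a 1-based index chosen with probability
' proportional to the entries of weights (assumed non-negative).
Private Function WeightedIndex(weights As Variant) As Long
    Dim total As Double
    Dim runningTot As Double
    Dim target As Double
    Dim i As Long

    For i = LBound(weights) To UBound(weights)
        total = total + weights(i)
    Next i

    target = Rnd() * total              ' uniform in [0, total)
    For i = LBound(weights) To UBound(weights)
        runningTot = runningTot + weights(i)
        If target < runningTot Then     ' strict test, so zero weights are never picked
            WeightedIndex = i - LBound(weights) + 1
            Exit Function
        End If
    Next i
    ' Falls through (returning 0) only if no weight is positive
End Function

' Draws SAMPLES indices and prints how often each one came back.
Public Sub TallyWeightedIndex()
    Const SAMPLES As Long = 100000
    Dim weights As Variant
    Dim counts() As Long
    Dim n As Long
    Dim k As Long

    weights = Array(1, 0, 0, 1)
    ReDim counts(1 To UBound(weights) - LBound(weights) + 1)

    Randomize
    For n = 1 To SAMPLES
        k = WeightedIndex(weights)
        counts(k) = counts(k) + 1       ' one more occurrence of index k
    Next n

    For k = LBound(counts) To UBound(counts)
        Debug.Print k & ": " & Format(counts(k) / SAMPLES, "0.0%")
    Next k
End Sub
```

With the weights {1,0,0,1}, the printed shares for indices 1 and 4 should each come out close to 50%, and indices 2 and 3 should stay at 0%.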
Try this:
Function Probable(v As Variant) As Long
    Application.Volatile 'remove this if you don't want a volatile function
    Dim v2 As Variant
    ReDim v2(LBound(v) To UBound(v) + 1)
    v2(LBound(v2)) = 0
    Dim i As Integer
    For i = LBound(v) To UBound(v)
        v2(i + 1) = v2(i) + v(i) / Application.Sum(v)
    Next i
    Probable = Application.WorksheetFunction.Match(Rnd(), v2, 1)
End Function
The array v is essentially your outputArray.
The code takes an array like {1,0,0,1} and converts it to {0,0.5,0.5,0.5,1} (note the 0 at the beginning), at which point you can do a MATCH as you suggested to get either a 1 or a 4 with equal probability.
Similarly, if you were to start with {1,1,1,1}, it would be converted to {0,0.25,0.5,0.75,1} and return any of 1, 2, 3 or 4 with equal probability.
Also note: you could probably make it a bit quicker if you save the value of Application.Sum(v) in a variable rather than performing the calculation for every value in array v.
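As a rough sketch of that last suggestion, the version below computes the sum once and reuses it inside the loop. ProbableCached and total are names introduced here for illustration. Like the function above, it assumes v is a one-dimensional array whose entries are non-negative and sum to more than zero.

```vba
Function ProbableCached(v As Variant) As Long
    Application.Volatile 'remove this if you don't want a volatile function
    Dim v2 As Variant
    Dim total As Double
    Dim i As Long

    total = Application.WorksheetFunction.Sum(v) 'computed once, not once per element

    'v2 holds the cumulative probabilities, with a leading 0
    ReDim v2(LBound(v) To UBound(v) + 1)
    v2(LBound(v2)) = 0
    For i = LBound(v) To UBound(v)
        v2(i + 1) = v2(i) + v(i) / total
    Next i

    ProbableCached = Application.WorksheetFunction.Match(Rnd(), v2, 1)
End Function
```

The saving only matters for long arrays or heavily recalculated sheets. For a handful of weights the two versions should behave the same.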
Update
The function now takes v as a parameter, like your code. I also tweaked it a bit so that it can deal with v having any base, which means you can run it from the worksheet too: =Probable({1,0,0,1}), for example.
It appears I have made a tragic mistake. My code was fine; my counting wasn't so good. I was using SUMIF() instead of COUNTIF() in my graphing, resulting in later objects in the array (with a higher index, i.e. the output of the UDF, which I was supposed to be counting but was instead summing) getting a weighting proportional to their position.
In retrospect, I think someone far more clever than I could probably have deduced that from the information given. I said {1,1,1,1} gives a {10%:20%:30%:40%} distribution. That's a {1:2:3:4} ratio, which is precisely the same ratio as the indices of the outputs. Deduction: the outputs were summed, not counted.
Similarly, for the graph of {1,0,0,1} with a {20%:0%:0%:80%} output, divide each percentage by its index (20%/1, 80%/4) and hey presto: {20%:0%:0%:20%}, or the 1:1 ratio I had expected.
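The arithmetic can be written out explicitly. The notation is introduced here and is not from the original posts: p_k is the true probability of the UDF returning index k, and s_k is the share of the graph that index k receives when the outputs are summed with SUMIF() rather than counted. Each occurrence of k contributes k to the sum, so the expected shares are:

```latex
\[
  s_k = \frac{k\,p_k}{\sum_{j} j\,p_j}
\]
\[
  \{1,1,1,1\}:\quad p_k = \tfrac14,\quad
  \sum_j j\,p_j = \tfrac{1+2+3+4}{4} = 2.5,\quad
  s_k = \tfrac{k}{10} \;\Rightarrow\; 10\%,\ 20\%,\ 30\%,\ 40\%
\]
\[
  \{1,0,0,1\}:\quad p_1 = p_4 = \tfrac12,\quad
  \sum_j j\,p_j = 0.5 + 2 = 2.5,\quad
  s_1 = \tfrac{0.5}{2.5} = 20\%,\quad
  s_4 = \tfrac{2}{2.5} = 80\%
\]
```

Both results match the distributions seen in the graphs, which is consistent with the UDF itself sampling correctly.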
There's something annoying but satisfying in that: knowing the answer was there all along. I suppose there's probably a moral in all this. At least the post can serve as a warning to budding VBAers to check their arithmetic.
This is something I have built, following your logic. It works reasonably well and returns varied results.
Option Explicit

Public Function TryMyRandom() As String
    Dim lngTotalChances As Long
    Dim i As Long
    Dim previousValue As Long
    Dim randomPick As Long 'renamed from rnd so it no longer shadows VBA's Rnd function
    Dim result As Variant
    Dim varLngInputArray As Variant
    Dim varLngInputChances As Variant
    Dim varLngChancesReedit As Variant

    varLngInputChances = Array(1, 2, 3, 4, 5)
    varLngInputArray = Array("a", "b", "c", "d", "e")
    lngTotalChances = Application.WorksheetFunction.Sum(varLngInputChances)
    randomPick = Application.WorksheetFunction.RandBetween(1, lngTotalChances)

    'Build the running total and stop at the first bucket that contains randomPick
    ReDim varLngChancesReedit(UBound(varLngInputChances))
    For i = LBound(varLngInputChances) To UBound(varLngInputChances)
        varLngChancesReedit(i) = varLngInputChances(i) + previousValue
        previousValue = varLngChancesReedit(i)
        If randomPick <= varLngChancesReedit(i) Then
            result = varLngInputArray(i)
            Exit For
        End If
    Next i
    TryMyRandom = result
End Function

Public Sub TestMe()
    Dim lng As Long
    Dim i As Long
    Dim dict As Object
    Dim key As Variant
    Dim res As String

    Set dict = CreateObject("Scripting.Dictionary")
    For lng = 1 To 1000
        res = TryMyRandom
        If dict.Exists(res) Then
            dict(res) = dict(res) + 1
        Else
            dict(res) = 1
        End If
    Next lng

    For Each key In dict.Keys
        Debug.Print key & " ===> " & dict(key)
    Next
End Sub
Concerning your case, make sure that the array is sorted (in my case, that would be varLngInputChances). I have not looked at the corner cases, so there could be an error there.
Run the TestMe sub. It will even generate a summary of the results.
If you change the weights to varLngInputChances = Array(1, 1, 0, 0, 1), it gives:
a ===> 329
b ===> 351
e ===> 320
which is quite a good random spread :) You can change the sample size here: For lng = 1 To 1000. It runs quite fast; I have just tried it with 100,000 samples.