Firstly, I'll show a minimal example of my data and the code I have so far, so it will be easier to explain my issue.
Consider the following data:
ID Esp DBH Cod
55 E_grandis 9.00
55 E_grandis 9.71 7
55 E_grandis 10.00
55 E_grandis 1.00 7
55 E_grandis 7.00 7
55 E_grandis 1
I am trying to verify if rows with Cod = 7 have values greater than the:
average of DBH - 1 * standard deviation of DBH.
In the example above the average of DBH is 7.34 and the standard deviation is 3.73. Therefore, DBHs values should not be greater than 3.61 (7.34 - 3.73) when they are tagged Cod 7.
Cells D3 and D6 do not pass in the criteria because their DBHs (C3 and C6) are greater than 3.61. Among all rows with with Cod 7, only C5 is smaller than 3.61.
I wrote the code below which displays a message box when such criteria is not met:
Sub Cod7()
Dim msg As String 'msg box
Dim ID As Range
Dim dbh_stdev As Double 'standard deviation of dbh
Dim dbh_avg As Double 'average of dbh
Dim not_dominated As Double 'criteria threshold (upper bound)
Dim cell_i As Range 'initial of array to compute average and standard deviation
Dim cell_e As Range 'end of array to compute average and standard deviation
msg = ""
Set cell_i = Range("C2")
Set cell_e = Range("C7")
dbh_stdev = WorksheetFunction.StDev(Range(cell_i, cell_e)) 'dbh standard deviation
dbh_avg = WorksheetFunction.Average(Range(cell_i, cell_e)) 'dbh average
not_dominated = dbh_avg - dbh_stdev 'upper bound
'searches cells with cod 7 on column Cod, and it displays a message box if
'DBH is greater than the 'not_dominated' variable value
For Each ID In Range("A2", Range("A2").End(xlDown))
If ID.Offset(0, 3) = 7 And _
ID.Offset(0, 2) <> 0 And _
ID.Offset(0, 2) > not_dominated Then
msg = msg & "Cod 7 on " & ID.Offset(0, 3).Address() & " is incorrect" & vbLf
End If
Next ID
If Len(msg) > 0 Then MsgBox msg
End Sub
Now the problem is that in my real data I have more than one category under the column Esp (specie), and I need to evaluate the criteria, calculating the average and standard deviation of DBHs inside each group of specie.
The groups of species are clustered, i.e., one specie occurs through adjacent rows.
For example, this is a minimal data with two categories under the Esp column: E_grandis and E_citriodora.
ID Esp DBH Cod
55 E_grandis 9.00
55 E_grandis 9.71 7
55 E_grandis 10.00
55 E_grandis 1.00 7
55 E_grandis 7.00 7
55 E_grandis 1
55 E_citriodora 3.00
55 E_citriodora 2.00 7
55 E_citriodora 2.00 7
55 E_citriodora 1
55 E_citriodora 1
55 E_citriodora 0.50 7
The average of DBH in E_citriodora is 1.88 and the standard deviation is 1.03. Rows with Cod = 7 cannot have DBH greater than 0.85 (1.88-1.03). In this case, cells C9 e C10 don't pass the criteria and cell C13 passes.
How can I adapt the code to apply such criteria within groups of 'Esp'?
I believe the code below does what you want. Please note that this will only work if all species are "grouped" together.
I've added an external loop that allows the code to iterate through all rows that have data (specifically, that have a value in ID).
The initial value of the starting cell (cell_i
) is C2
as in the original code but I changed the way it computes the ending cell (cell_e
): it is now based on the number of rows in column B
that have the same value as cell_i
's current species (that's what the CountIf
is doing and that's why this only works if all species are clustered together).
That, together with the Set cell_i = cell_e.Offset(1)
, makes the loop jump from the 1st row of a species to the next one, etc.
For example, the 1st time this runs against your sample data, cell_i
will be C2
and cell_e
will be C7
because the number of rows that have E_grandis
in column B
is 6 which, subtracted of 1 and offset from the cell_i
's current row means it'll select a cell 5 rows down from the current one.
The 2nd time it'll start from C8
and go through C12
. Etc.
Inside the loop body, I've put your original code (mostly «unharmed»). I've just tweaked the For
loop so that it iterates through the cells in the range (cell_i
to cell_e
as captured in the groupRange
variable) instead of iterating between all rows that have a value in column A
.
I've added a couple of Select
calls so that you can follow the values of ccell
and groupRange
as you step through the code.
Option Explicit
Public Sub Cod7()
Dim msg As String 'msg box
Dim dbh_stdev As Double 'standard deviation of dbh
Dim dbh_avg As Double 'average of dbh
Dim not_dominated As Double 'criteria threshold (upper bound)
Dim cell_i As Range 'initial of array to compute average and standard deviation
Dim cell_e As Range 'end of array to compute average and standard deviation
Dim ccell As Range 'current cell
Dim groupRange As Range
msg = ""
Set cell_i = Range("C2")
Do While cell_i.Offset(, -2) <> ""
Set cell_e = cell_i.Offset(WorksheetFunction.CountIf(Range("B:B"), cell_i.Offset(, -1).Value) - 1)
Set groupRange = Range(cell_i, cell_e)
groupRange.Select
dbh_stdev = WorksheetFunction.StDev(groupRange) 'dbh standard deviation
dbh_avg = WorksheetFunction.Average(groupRange) 'dbh average
not_dominated = dbh_avg - dbh_stdev 'upper bound
'searches cells with cod 7 on column Cod, and it displays a message box if
'DBH is greater than the 'not_dominated' variable value
For Each ccell In groupRange
ccell.Select
If ccell.Offset(, 1).Value = 7 And _
ccell.Value <> 0 And _
ccell.Value > not_dominated Then
msg = msg & "Cod 7 on " & ccell.Offset(, 1).Address() & " is incorrect" & vbLf
End If
Next
Set cell_i = cell_e.Offset(1)
Loop
If Len(msg) > 0 Then MsgBox msg
End Sub
or you could do away with the whole script alltogether and use this formula in cell E2 (and then copy, paste down):
{=IF(AND(D2=7,C2>AVERAGEIF($B$2:$B$13,B2,$C$2:$C$13)-1* STDEV(IF($B$2:$B$13=B2,$C$2:$C$13))),"warning","")}
Notice the array formula - remember to confirm with ctrl-shift-enter
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With