Say I have this sample data
A =
1.0000 6.0000 180.0000 12.0000
1.0000 5.9200 190.0000 11.0000
1.0000 5.5800 170.0000 12.0000
1.0000 5.9200 165.0000 10.0000
2.0000 5.0000 100.0000 6.0000
2.0000 5.5000 150.0000 8.0000
2.0000 5.4200 130.0000 7.0000
2.0000 5.7500 150.0000 9.0000
I wish to calculate the variance of each column, grouped by class (the first column).
I have this working with the following code, but it uses hard coded indices, requiring knowledge of the number of samples per class and they must be in specific order.
Is there a better way to do this?
variances = zeros(2,4);
variances = [1.0 var(A(1:4,2)), var(A(1:4,3)), var(A(1:4,4));
2.0 var(A(5:8,2)), var(A(5:8,3)), var(A(5:8,4))];
disp(variances);
1.0 3.5033e-02 1.2292e+02 9.1667e-01
2.0 9.7225e-02 5.5833e+02 1.6667e+00
Separate the class labels and the data into different variables.
cls = A(:, 1);
data = A(:, 2:end);
Get the list of class labels
labels = unique(cls);
Compute the variances
variances = zeros(length(labels), 3);
for i = 1:length(labels)
variances(i, :) = var(data(cls == labels(i), :)); % note the use of logical indexing
end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With