My intention is to find its class through Bayes Classifier Algorithm.
Suppose, the following training data describes heights, weights, and feet-lengths of various sexes
SEX HEIGHT(feet) WEIGHT (lbs) FOOT-SIZE (inches)
male 6 180 12
male 5.92 (5'11") 190 11
male 5.58 (5'7") 170 12
male 5.92 (5'11") 165 10
female 5 100 6
female 5.5 (5'6") 150 8
female 5.42 (5'5") 130 7
female 5.75 (5'9") 150 9
trans 4 200 5
trans 4.10 150 8
trans 5.42 190 7
trans 5.50 150 9
Now, I want to test a person with the following properties (test data) to find his/her sex,
HEIGHT(feet) WEIGHT (lbs) FOOT-SIZE (inches)
4 150 12
This may also be a multi-row matrix.
Suppose, I am able to isolate only the male portion of the data and arrange it in a matrix,

and, I want to find its Parzen Density Function against the following row matrix that represents same data of another person(male/female/transgender),
(dataPoint may have multiple rows.)
so that we can find how closely matches this data with those males.
my attempted solution:

(1) I am unable to calculate the secondPart because of the dimensional mismatch of the matrices. How can I fix this?
(2) Is this approach correct?
MATLAB Code
male = [6.0000 180 12
5.9200 190 11
5.5800 170 12
5.9200 165 10];
dataPoint = [4 150 2]
variance = var(male);
parzen.m
function [retval] = parzen (male, dataPoint, variance)
clc
%male
%dataPoint
%variance
sub = male - dataPoint
up = sub.^2
dw = 2 * variance;
sqr = sqrt(variance*2*pi);
firstPart = sqr.^(-1);
e = dw.^(-1)
secPart = exp((-1)*e*up);
pdf = firstPart.* secPart;
retval = mean(pdf);
bayes.m
function retval = bayes (train, test, aprori)
clc
classCounts = rows(unique(train(:,1)));
%pdfmx = ones(rows(test), classCounts);
%%Parzen density.
%pdf = parzen(train(:,2:end), test(:,2:end), variance);
maxScore = 0;
pdfProduct = 1;
for type = 1 : classCounts
%if(type == 1)
clidxTrain = train(:,1) == type;
%clidxTest = test(:,1) == type;
trainMatrix = train(clidxTrain,2:end);
variance = var(trainMatrix);
pdf = parzen(trainMatrix, test, variance);
%dictionary{type, 1} = type;
%dictionary{type, 2} = prod(pdf);
%pdfProduct = pdfProduct .* pdf;
%end
end
for type=1:classCounts
end
retval = 0;
endfunction
First, your example person has a tiny foot!
Second, it seems you are mixing together kernel density estimation and naive Bayes. In a KDE, you estimate a pdf a sum of kernels, one kernel per data point in your sample. So if you wanted to do a KDE of the height of males, you would add together four Gaussians, each one centered at the height of a different male.
In naive Bayes, you assume that the features (height, foot size, etc.) are independent and that each one is normal distributed. You estimate the parameters of a single Gaussian per feature from your training data, then use their product to get the joint probability of a new example belonging to a certain class. The first page that you link explains this fairly well.
In code:
clear
human = [6.0000 180 12
5.9200 190 11
5.5800 170 12
5.9200 165 10];
tiger = [
2 2000 17
3 1980 16
3.5 2100 18
3 2020 18
4.1 1800 20
];
dataPoints = [
4 150 12
3 2500 20
];
sigSqH = var(human);
muH = mean(human);
sigSqT = var(tiger);
muT = mean(tiger);
for i = 1:size(dataPoints, 1)
i
probHuman = prod( 1./sqrt(2*pi*sigSqH) .* exp( -(dataPoints(i,:) - muH).^2 ./ (2*sigSqH) ) )
probTiger = prod( 1./sqrt(2*pi*sigSqT) .* exp( -(dataPoints(i,:) - muT).^2 ./ (2*sigSqT) ) )
end
Comparing the probability of tiger vs. human lets us conclude that dataPoints(1,:) is a person while dataPoints(2,:) is a tiger. You can make this model more complicated by, e.g., adding prior probabilities of being one class or the other, which would then multiply probHuman or probTiger.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With