I know the formula for calculating entropy:
H(Y) = - ∑ (p(yj) * log2(p(yj)))
In words: select an attribute and, for each of its values, check the target attribute's value, so p(yj) is the fraction of patterns at node N that fall in category yj, e.g. one fraction for true target values and one for false.
But I have a dataset in which the target attribute is price, i.e. a continuous range. How do I calculate entropy for this kind of dataset?
(Reference: http://decisiontrees.net/decision-trees-tutorial/tutorial-5-exercise-2/)
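For reference, here is a minimal sketch of how I currently compute this for a boolean target (the sample labels below are made up):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(Y) = -sum over j of p(yj) * log2(p(yj)), where p(yj) is the
    fraction of patterns at the node that fall in category yj."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Boolean target: one fraction for true, one for false.
print(entropy([True, True, False, True, False]))  # ~0.971 bits
```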
Entropy is a measure of disorder or uncertainty, and the goal of machine learning models (and data scientists in general) is to reduce that uncertainty. To calculate the reduction of uncertainty about Y given an additional piece of information X, we subtract the entropy of Y given X from the entropy of Y alone: the difference H(Y) - H(Y|X) is the information gain.
For example, in a binary classification problem (two classes), we can calculate the entropy of the data sample as follows: Entropy = -(p(0) * log2(p(0)) + p(1) * log2(p(1)))
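As a rough sketch of that subtraction (the feature and label values below are made up), the information gain is H(Y) minus the weighted average entropy of Y within each value of X:

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(features, labels):
    """IG(Y, X) = H(Y) - H(Y|X), where H(Y|X) is the weighted
    average entropy of the labels within each feature value."""
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[x].append(y)
    n = len(labels)
    h_y_given_x = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - h_y_given_x

# Made-up example: X separates the labels fairly cleanly, so the gain is high.
X = ["a", "a", "a", "b", "b", "b"]
Y = [0, 0, 1, 1, 1, 1]
print(information_gain(X, Y))  # ~0.459 bits
```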
Entropy is a measure of the disorder or impurity in the information processed by a machine learning model, and it determines how a decision tree chooses to split the data. A simple example is flipping a coin: a flip has two possible outcomes, and a fair coin (p = 0.5 for each outcome) has the maximum entropy of 1 bit, while a biased coin has less (e.g. p(heads) = 0.9 gives roughly 0.47 bits).
You first need to discretise the data set in some way, e.g. sorting it numerically into a number of buckets. Many methods for discretisation exist, some supervised (i.e. taking into account the value of your target function) and some not. This paper outlines various techniques in fairly general terms; for more specifics there are plenty of discretisation algorithms in machine learning libraries such as Weka.
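As a minimal, unsupervised sketch of this idea (equal-width binning with an arbitrarily chosen bucket count, and made-up price values), you could discretise the continuous target and then apply the usual entropy formula to the bucket labels; supervised methods such as those in Weka would choose the cut points more carefully:

```python
import numpy as np
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Continuous target (e.g. prices); the values here are made up.
prices = np.array([12.5, 13.0, 14.2, 29.9, 31.5, 30.0, 55.0, 57.5, 60.1])

# Unsupervised equal-width discretisation into 3 buckets.
n_buckets = 3
edges = np.linspace(prices.min(), prices.max(), n_buckets + 1)
buckets = np.digitize(prices, edges[1:-1])  # bucket index per value

print(entropy(buckets.tolist()))  # entropy of the discretised target, in bits
```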
The entropy of a continuous distribution is called differential entropy. It can be estimated by assuming your data follows some distribution (a normal distribution, for example), estimating the parameters of that distribution from the data, and then using the fitted distribution to calculate an entropy value.
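As a sketch under a normality assumption (again with made-up data), the differential entropy of a Gaussian has the closed form 0.5 * log2(2 * pi * e * sigma^2) bits, so only the variance needs to be estimated:

```python
import math
import statistics

# Made-up continuous data; assume it is roughly normally distributed.
prices = [12.5, 13.0, 14.2, 29.9, 31.5, 30.0, 55.0, 57.5, 60.1]

sigma2 = statistics.variance(prices)  # estimate the variance from the data
h = 0.5 * math.log2(2 * math.pi * math.e * sigma2)
print(h)  # differential entropy in bits, under the Gaussian assumption
```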