Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating the entropy of an attribute in the ID3 algorithm when a split is perfectly classified

I have been reading about the ID3 algorithm recently and it says that the best attribute to be selected for splitting should result in the maximum information gain which can be computed with the help of the entropy.

I have written a simple python program to compute the entropy. It is shown below:

def _E(p, n):
    x = (p/(p+n))
    y = (n/(p+n))
    return(-1* (x*math.log2(x)) -1* (y*math.log2(y)))

However suppose we have a table consisting of 10 elements as follows:

x = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

y = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]

Where x is the attribute and y is the class. Here P(0) = 0.8 and P(1) = 0.2. The entropy will be as follows:

Entropy(x) = 0.8*_E(5, 3) + 0.2*_E(2, 0)

However the second split P(1) is perfectly classified and this results in a math error since log2(0) is negative infinity. How is the entropy calculated in such cases?

like image 744
Arat254 Avatar asked Oct 28 '25 09:10

Arat254


1 Answers

Entropy is a measure of impurity. So if a node is pure it means entropy is zero.

Have a look at this -

def information_gain(data, column, cut_point):
    """
    For calculating the goodness of a split. The difference of the entropy of parent and 
    the weighted entropy of children.
    :params:attribute_index, labels of the node t as `labels` and cut point as `cut_point`
    :returns: The net entropy of partition 
    """
    subset1, subset2 = divide_data(data, column, cut_point) 
    lensub1, lensub2 = len(subset1), len(subset2)  
    #if the node is pure return 0 entropy
    if len(subset1) == 0 or len(subset2) == 0:
        return (0, subset1, subset2)     
    weighted_ent = (len(subset1)*entropy(subset1) + len(subset2)*entropy(subset2)) / len(data)  
    return ((entropy(data) - weighted_ent), subset1, subset2)
like image 103
hashcode55 Avatar answered Oct 31 '25 13:10

hashcode55



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!