Consider the following training data set:
+-------+-------+----------+-------------+
| Size  | Color | Shape    | Class/Label |
+=======+=======+==========+=============+
| big   | red   | circle   | No          |
| small | red   | triangle | No          |
| small | red   | circle   | Yes         |
| big   | blue  | circle   | No          |
| small | blue  | circle   | Yes         |
+-------+-------+----------+-------------+
I would like to understand how the algorithm proceeds when it starts with a negative example and when two negative examples come one after the other.
This is not an assignment question by the way.
Examples with other data sets are also welcome! My goal is to understand how the algorithm handles negative examples.
The candidate elimination algorithm (CEA) is a supervised technique for learning a concept from labeled examples. It incrementally builds the version space given a hypothesis space H and a set E of examples: the examples are processed one by one, and each example may shrink the version space by removing the hypotheses that are inconsistent with it.

The algorithm is sometimes said to be an unbiased learner because it imposes no bias beyond the language bias involved in choosing H. It converges to the correct hypothesis provided that (1) the training examples are error free and (2) the target concept can be represented as a conjunction of attribute constraints.
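Before walking through the trace, it may help to fix a concrete representation. Here is a minimal Python sketch (the function names matches and consistent are mine, not from any library) of a conjunctive hypothesis over the three attributes and of the consistency check the algorithm applies to each example:

```python
# A hypothesis is a tuple of constraints, one per attribute
# (Size, Color, Shape). '?' accepts any value; None accepts no value
# (the maximally specific "empty" constraint written <0, 0, 0> below).

def matches(hypothesis, example):
    """True if the hypothesis covers the example."""
    return all(c == '?' or c == v for c, v in zip(hypothesis, example))

def consistent(hypothesis, example, is_positive):
    """A hypothesis is consistent if it covers positives and rejects negatives."""
    return matches(hypothesis, example) == is_positive

print(matches(('?', '?', '?'), ('big', 'red', 'circle')))            # True
print(matches(('small', '?', 'circle'), ('big', 'red', 'circle')))   # False
```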
For your hypothesis space (H), you start with your sets of maximally general (G) and maximally specific (S) hypotheses:
G0 = {<?, ?, ?>}
S0 = {<0, 0, 0>}
When you are presented with a negative example, you need to remove from S any hypothesis inconsistent with the current observation and replace any inconsistent hypothesis in G with its minimal specializations that are consistent with the observation but still more general than some member of S.
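Concretely, for this data set a minimal specialization just replaces one "?" in the general hypothesis with an attribute value that differs from the negative example. A possible sketch (the attribute domains are read off the table above; the function name is hypothetical):

```python
# Assumed attribute domains, taken from the training table.
DOMAINS = [('big', 'small'), ('red', 'blue'), ('circle', 'triangle')]

def min_specializations(g, negative):
    """Minimal specializations of g that no longer cover the negative example."""
    specs = []
    for i, constraint in enumerate(g):
        if constraint == '?':
            for value in DOMAINS[i]:
                if value != negative[i]:
                    specs.append(g[:i] + (value,) + g[i + 1:])
    return specs

print(min_specializations(('?', '?', '?'), ('big', 'red', 'circle')))
# [('small', '?', '?'), ('?', 'blue', '?'), ('?', '?', 'triangle')]
```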
So for your first (negative) example, (big, red, circle), the minimal specializations give the new boundary sets
G1 = {<small, ?, ?>, <?, blue, ?>, <?, ?, triangle>}
S1 = S0 = {<0, 0, 0>}
Note that S did not change. For your next example, (small, red, triangle), which is also negative, you will need to specialize G further. The second hypothesis in G1 does not match the new observation, so only the first and third hypotheses in G1 need to be specialized. That would yield
G2 = {<small, blue, ?>, <small, ?, circle>, <?, blue, ?>, <big, ?, triangle>, <?, blue, triangle>}
However, since the first and last hypotheses in G2 above are specializations of the middle hypothesis <?, blue, ?>, we drop those two, giving
G2 = {<small, ?, circle>, <?, blue, ?>, <big, ?, triangle>}
S2 = S1 = S0 = {<0, 0, 0>}
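That pruning step, dropping members of G that are less general than another member, can be sketched as follows (helper names are my own):

```python
def more_general(h1, h2):
    """True if h1 is at least as general as h2."""
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def prune(G):
    """Keep only the maximally general hypotheses of G."""
    return [g for g in G
            if not any(more_general(other, g) and other != g for other in G)]

G2 = [('small', 'blue', '?'), ('small', '?', 'circle'), ('?', 'blue', '?'),
      ('big', '?', 'triangle'), ('?', 'blue', 'triangle')]
print(prune(G2))
# [('small', '?', 'circle'), ('?', 'blue', '?'), ('big', '?', 'triangle')]
```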
For the positive observation (small, red, circle), you must generalize S and remove anything in G that is inconsistent with it, which gives
G3 = {<small, ?, circle>}
S3 = {<small, red, circle>}
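The update on a positive example is the dual operation: each member of S is replaced by its minimal generalization that covers the example. A sketch, again with hypothetical names:

```python
EMPTY = (None, None, None)  # the maximally specific hypothesis <0, 0, 0>

def min_generalization(s, positive):
    """Smallest generalization of s that covers the positive example."""
    return tuple(
        v if c is None    # an empty constraint takes the example's value
        else c if c == v  # constraint already covers this value
        else '?'          # disagreement -> relax to a wildcard
        for c, v in zip(s, positive)
    )

print(min_generalization(EMPTY, ('small', 'red', 'circle')))
# ('small', 'red', 'circle')
print(min_generalization(('small', 'red', 'circle'), ('small', 'blue', 'circle')))
# ('small', '?', 'circle')
```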
The next example, (big, blue, circle), is negative. But since no hypothesis in G (or S) covers it, both boundary sets are already consistent with it and nothing changes:
G4 = G3 = {<small, ?, circle>}
S4 = S3 = {<small, red, circle>}
Lastly, you have the positive example (small, blue, circle), which requires you to generalize S to make it consistent with the example, giving
G5 = {<small, ?, circle>}
S5 = {<small, ?, circle>}
Since G and S are equal, you have learned the concept of "small circles".
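Putting the pieces together, here is a self-contained sketch (not a reference implementation; it omits the check that specializations stay more general than some member of S, which never fires in this trace) that replays the whole run and converges to <small, ?, circle>:

```python
# Candidate elimination on the training set above.
# Attribute order: (Size, Color, Shape). '?' = any value, None = no value.

DOMAINS = [('big', 'small'), ('red', 'blue'), ('circle', 'triangle')]

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def min_specializations(g, x):
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == '?'
            for v in DOMAINS[i] if v != x[i]]

def min_generalization(s, x):
    return tuple(v if c is None else c if c == v else '?' for c, v in zip(s, x))

def candidate_elimination(examples):
    G = [('?', '?', '?')]
    S = [(None, None, None)]
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]        # drop inconsistent G members
            S = [min_generalization(s, x) for s in S]  # generalize S minimally
        else:
            S = [s for s in S if not matches(s, x)]    # drop S members covering x
            new_G = []
            for g in G:
                if matches(g, x):
                    new_G.extend(min_specializations(g, x))
                else:
                    new_G.append(g)
            # keep only the maximally general hypotheses
            G = [g for g in new_G
                 if not any(more_general(o, g) and o != g for o in new_G)]
        print('G =', G, '  S =', S)
    return G, S

examples = [
    (('big',   'red',  'circle'),   False),
    (('small', 'red',  'triangle'), False),
    (('small', 'red',  'circle'),   True),
    (('big',   'blue', 'circle'),   False),
    (('small', 'blue', 'circle'),   True),
]
candidate_elimination(examples)
# Last line printed: G = [('small', '?', 'circle')]   S = [('small', '?', 'circle')]
```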