Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

VC Dimension of Circle, a special case

I have read that a circle can shatter 3 point in 2D space, which is actually the VC Dimension of Circle.

Let's say we have three points (5,2) (5,4) and (5,6). How can I draw a circle where (5,2) & (5,6) are included with out (5,4)? That's not possible!
If it can not shatter then how come VC Dimension is 3 for a circle. Or am I wrong in assuming that in the definition of VC Dimension; a hypothesis has to shatter all possible scenarios of all possible subsets of space?

like image 951
Asymptote Avatar asked Sep 18 '12 11:09

Asymptote


2 Answers

The VC dimension is the maximum number of points that can be shattered. {(5,2), (5,4), (5,6)} cannot be shattered by circles, but {(5,2), (5,4), (6,6)} can be shattered by circles, so the VC dimension is at least 3. Proving that it is exactly 3 is harder.

There is a technical point here related to Qnan's answer. If the circle classifier always classifies the points inside the circle as 1, and those outside the circle as 0, then {(5,2), (5,4), (5,6)} cannot be shattered. On the other hand, if the circle classifier can also classify the points inside the circle as 0, then {(5,2), (5,4), (5,6)} can be shattered as explained by Qnan.

Qnan, with regard to your comment, if one says that n is the largest number of points with property P, then to prove that n >= m, it suffices to find any collection of m points with property P. If you find one or one thousand sets of m points that do not have property P, then that does not prove anything about n. (Unless you have enumerated every possible set of points of size m.)

The VC dimension is the maximal number of points that can be shattered. If the VC dimension of a classifier is 100, it still could be possible to find 3 points that cannot be shattered by the classifier. We could define the VCB dimension to be the largest number n such that all sets of size n or less can be shattered. Asymptote's original example shows that the VCB dimension of circular classifiers (assuming 1 inside the circle and 0 outside the circle) on the Cartesian plane is less than or equal to 2 because those three points can't be shattered; however, Asymptote's example does not show that the VC dimension is less than 3 because there are other sets of points of size 3 which can be shattered.

like image 119
hans Avatar answered Sep 18 '22 15:09

hans


The point is that the circle can be drawn such that all the points belonging to one class are on the inside, while the rest are on the outside. It does not matter which class is which, since swapping the labels will only require the classifier to be inverted.

In your case, separating (5,2) and (5,6) from (5,4) is done trivially by including only the latter in the circle. For a classifier, "inside" and "outside" don't matter. What does matter is that they can be classified with a 0 error.

EDIT Strictly speaking, VC dimension is defined for parameterized classifiers, and there are multiple classifiers, whose boundary would be described as a "circle". For example, f(x)=(x-x0)*(x-x0) cannot shatter a set of three points on one line, but f(x)=a*(x-x0)*(x-x0) can, and both classifiers have circular boundary. The VC dimension of the second one is actually 3, while that of the first one is 2.

like image 42
Qnan Avatar answered Sep 21 '22 15:09

Qnan