How to draw a boundary line on a scatter plot for a classifier in Julia?

I want to draw a boundary line that separates the two classes produced by my classifier. How can I draw it? In the sample picture, the black line is the boundary I want to draw and the green points are the boundary points. I want to draw a curve that fits those points exactly, but when I plot them, the result is the purple line, which is not a curve.

CHIA YI asked Nov 25 '21 07:11


2 Answers

Here is a reproducible example of how to do it:

using Plots

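# random points in the unit square
x = rand(1000)
y = rand(1000)
# class label as a color: red on one side of the parabola, blue on the other
color = [3 * (b-0.5)^2 < a - 0.1 ? "red" : "blue" for (a, b) in zip(x, y)]

# boundary points, ordered by y; x is computed from y
y_bound = 0:0.01:1
x_bound = @. 3 * (y_bound - 0.5)^2 + 0.1

scatter(x, y, color=color, legend=false)
plot!(x_bound, y_bound, color="green")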

and you should get a plot like this (image: classification boundary).

The crucial thing here is that your boundary points are ordered: they must appear in the vectors in the order in which the line should connect them. In my example I achieved this by varying the y-dimension and calculating the x-dimension from it.
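If your boundary points arrive in arbitrary order (which is a likely cause of a zigzag line like the purple one in the question), one option is to sort them before plotting. The following is only a minimal sketch, assuming the boundary is single-valued in y as in the example above; the names `y_pts`, `x_pts`, `order`, `x_sorted` and `y_sorted` are made up for illustration.

```julia
using Random

# Boundary points on the curve x = 3(y - 0.5)^2 + 0.1, deliberately shuffled
# to mimic points that were collected in arbitrary order.
y_pts = shuffle(MersenneTwister(1), collect(0:0.05:1))
x_pts = @. 3 * (y_pts - 0.5)^2 + 0.1

# Sort along the dimension in which the curve is single-valued (y here),
# and apply the same permutation to both coordinate vectors.
order = sortperm(y_pts)
x_sorted = x_pts[order]
y_sorted = y_pts[order]
```

With Plots loaded, `plot!(x_sorted, y_sorted, color="green")` should then connect neighbouring points instead of jumping across the plot. If the boundary folds back on itself in both dimensions, sorting by one coordinate is not enough and a contour plot is the safer choice.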

In more complex cases it is better to use a contour plot, e.g.:

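x = 1:0.1:8
y = 1:0.1:7
# score function; the decision boundary is where f(x, y) == 0
f(x, y) = begin
    (3x + y ^ 2) * abs(sin(x) + cos(y)) - 40
end
# evaluate f on the grid; Z has size (length(y), length(x))
X = repeat(reshape(x, 1, :), length(y), 1)
Y = repeat(y, 1, length(x))
Z = map(f, X, Y)
# draw only the zero level set, i.e. the boundary
contour(x, y, Z, levels=[0], color="green", width=3)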

x_s = 7 .* rand(1000) .+ 1
y_s = 6 .* rand(1000) .+ 1
color = [f(a, b) > 0 ? "red" : "blue" for (a, b) in zip(x_s, y_s)]
scatter!(x_s, y_s, color=color, legend=false)

and you should get something like this (image: boundary with contour plot).

Note, however, that for the best results with this approach you should pass the classifier scores to contour and specify the classification threshold as the level.
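To make the "scores plus threshold" idea concrete, here is a rough sketch. The `score` function below is a hypothetical stand-in for whatever your classifier outputs (a logistic score whose 0.5 level lies on the parabola from the first example), and `xg`, `yg`, `S` and `threshold` are illustrative names, not part of any library.

```julia
# Hypothetical classifier score in (0, 1); the 0.5 level set is the curve
# x = 3(y - 0.5)^2 + 0.1.
score(x, y) = 1 / (1 + exp(-10 * (x - 3 * (y - 0.5)^2 - 0.1)))

# Grid covering the plotted region.
xg = 0:0.01:1
yg = 0:0.01:1

# Score matrix with rows indexed by y and columns by x,
# i.e. size (length(yg), length(xg)), which is the layout contour expects.
S = [score(a, b) for b in yg, a in xg]

# Classification threshold to use as the single contour level.
threshold = 0.5
```

With Plots loaded, `contour(xg, yg, S, levels=[threshold], color="green")` should then draw the decision boundary, and `scatter!` can overlay the classified points as before.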

Bogumił Kamiński answered Oct 22 '22 00:10


I guess your TA asked you to conduct a grid search for this question.

A grid search does not search over the data points you have; it searches over the whole coordinate plane (i.e. from (0,0), (0,1), (0,2) up to (0,100), then (1,0), (1,1) and so on). You can change the spacing between grid points when you conduct the search.

In your case, you need to solve the equation d_1(X) = d_2(X). So generate some points (as in the example above), evaluate |d_1(X) - d_2(X)| at each of them, and keep the points where the value is smaller than epsilon (a small number you choose yourself, like 0.05 or 0.1). Then use plot() to connect them.
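The grid search described above could look roughly like this. Since the question does not show the actual discriminant functions, `d1` and `d2` here are hypothetical (squared distances to two made-up class centres `c1` and `c2`), and `tol` plays the role of epsilon.

```julia
# Hypothetical discriminants: squared distance to each (made-up) class centre.
c1 = (0.0, 0.0)
c2 = (2.0, 1.0)
d1(x, y) = (x - c1[1])^2 + (y - c1[2])^2
d2(x, y) = (x - c2[1])^2 + (y - c2[2])^2

tol = 0.05          # epsilon: how close to d1 == d2 a grid point must be
grid = -1:0.02:3    # grid spacing is adjustable

# Keep every grid point where the two discriminants are (almost) equal.
boundary = [(x, y) for x in grid for y in grid if abs(d1(x, y) - d2(x, y)) < tol]

# Order the points by y so that a line plot connects neighbours.
sort!(boundary, by = last)
xb = first.(boundary)
yb = last.(boundary)
```

With Plots loaded, `plot!(xb, yb)` would connect the selected points. Because the search keeps a thin band of points rather than an exact curve, the line can wobble slightly inside that band; a smaller `tol` together with a finer grid reduces this.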

This is not the most efficient way to create the boundary, but it is what you learnt in your tutorial. You may also try contour().

Orca meets avalanche answered Oct 21 '22 22:10
