
Matplotlib.colors.ListedColormap in python

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_decision_regions(X, y, classifier, resolution=0.02):
    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    # plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
        np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())
    # plot class samples
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                    c=cmap(idx), marker=markers[idx], label=cl)

I was going through perceptron training in the book Python Machine Learning and found this code. The classifier argument is the perceptron, X is the input feature matrix, and y is the output vector. I don't understand what ListedColormap does. Also, what does meshgrid do? I am new to pandas and Matplotlib. Could someone explain this code and what it is trying to do?

Vipin Dubey asked Jun 08 '17 19:06




1 Answer

Let me explain every single line of the code:

x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1

x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1

This part of the code sets the limits of the plot. Each lower limit is decreased by 1 and each upper limit is increased by 1, so the data points and the decision regions get a margin and do not sit right on the edge of the axes.

xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution), np.arange(x2_min, x2_max, resolution))

np.meshgrid builds coordinate matrices from coordinate vectors. Here it produces a rectangular grid of points that spans x1_max - x1_min along the first feature and x2_max - x2_min along the second.

np.arange(start, stop, step): the start and stop values are the limits computed above (stop itself is excluded), and resolution is the step size. If resolution were much larger than 0.02 (say 2), the individual grid cells would be clearly visible to the eye. A small step is chosen so that the colored regions look smooth.
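A small sketch of how the step size controls the number of grid points along one axis. The range 0 to 4 and the step sizes are hypothetical values picked only for illustration:

```python
import numpy as np

# Hypothetical range and step sizes, chosen only for illustration
coarse = np.arange(0.0, 1.0, 0.25)
print(coarse)  # the stop value 1.0 itself is not included

x_min, x_max = 0.0, 4.0
for resolution in (2.0, 0.5, 0.02):
    # a smaller step gives more points, and so a finer grid
    n_points = len(np.arange(x_min, x_max, resolution))
    print(resolution, n_points)
```

Since the grid is two-dimensional, halving the resolution roughly quadruples the number of points the classifier has to predict.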

For example, take the x1 values -1, 0, 1 and the x2 values -2, 0, 1. Together they describe a 3 x 3 grid of points:

(-1,-2) (0,-2) (1,-2)

(-1,0) (0,0) (1,0)

(-1,1) (0,1) (1,1)

meshgrid returns this grid as two 3 x 3 matrices, one holding the x1 coordinate of every point and one holding the x2 coordinate:

xx1 = [-1 0 1] [-1 0 1] [-1 0 1] (3 x 3 matrix)

xx2 = [-2 -2 -2] [0 0 0] [1 1 1] (3 x 3 matrix)

Moving on to the next step,

Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
Z = Z.reshape(xx1.shape)

The .ravel() method of NumPy returns a flattened 1D array. For the example above,

xx1.ravel() = [-1 0 1 -1 0 1 -1 0 1]

xx2.ravel() = [-2 -2 -2 0 0 0 1 1 1]

np.array() stacks the two vectors into a single 2 x 9 array:

[-1 0 1 -1 0 1 -1 0 1] [-2 -2 -2 0 0 0 1 1 1] (2 x 9 matrix)

.T takes the transpose, which gives a 9 x 2 matrix in which each row is one (x1, x2) coordinate pair. classifier.predict returns one class label per row, and reshape(xx1.shape) puts those 9 labels back into the 3 x 3 grid layout, so every entry of Z is the predicted class at the matching grid point.
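The meshgrid, ravel, transpose and reshape steps can be tried on a tiny grid. This is a sketch using the hypothetical x1 values -1, 0, 1 and x2 values -2, 0, 1; the rule that stands in for classifier.predict is an assumption made only so the example runs without a trained model:

```python
import numpy as np

# Hypothetical coordinate vectors, chosen only for illustration
x1_values = np.array([-1, 0, 1])
x2_values = np.array([-2, 0, 1])

xx1, xx2 = np.meshgrid(x1_values, x2_values)
# xx1 repeats x1_values along each row, xx2 repeats x2_values down each column

# Flatten both grids and pair them up: one (x1, x2) row per grid point
grid_points = np.array([xx1.ravel(), xx2.ravel()]).T

# Stand-in for classifier.predict: class 1 where x1 + x2 > 0, else class 0
Z = np.where(grid_points[:, 0] + grid_points[:, 1] > 0, 1, 0)

# Back to the grid layout so Z[i, j] belongs to the point (xx1[i, j], xx2[i, j])
Z = Z.reshape(xx1.shape)

print(xx1)
print(xx2)
print(grid_points.shape)
print(Z)
```

Printing the arrays makes it easy to check that row k of grid_points is the same point as entry k of the flattened grids.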

plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)

plt.xlim(xx1.min(), xx1.max())

plt.ylim(xx2.min(), xx2.max())

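contourf draws filled contour plots. Here Z holds the class predicted at every point of the xx1, xx2 grid, so each region of the plane is filled with the color of the class predicted there, and alpha=0.4 makes the fill semi-transparent. The plot limits are then set to the extent of the grid.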

Finally,

for idx, cl in enumerate(np.unique(y)):
    plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                c=cmap(idx), marker=markers[idx], label=cl)

In this part, the available data points are plotted.

np.unique() returns a sorted array of the unique values. If your model has 2 outputs, i.e. either yes or no, then the whole data set is split into 2 categories.
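A minimal sketch of how np.unique and the boolean mask y == cl pick out the samples of one class at a time. The arrays X and y below are hypothetical data made up for this example:

```python
import numpy as np

# Hypothetical data: four samples with two features each, and their labels
X = np.array([[1.0, 2.0],
              [1.5, 1.8],
              [5.0, 8.0],
              [6.0, 9.0]])
y = np.array([1, 0, 1, 0])

classes = np.unique(y)  # sorted unique labels

for idx, cl in enumerate(classes):
    mask = (y == cl)              # True for the samples of class cl
    x1_of_class = X[mask, 0]      # first feature of those samples
    x2_of_class = X[mask, 1]      # second feature of those samples
    print(idx, cl, x1_of_class, x2_of_class)
```

X[y == cl, 0] in the original code is the same selection written in one expression.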

enumerate() yields a count together with each value. For example:

elements = ('foo', 'bar', 'baz')
for count, elem in enumerate(elements):
    print(count, elem)

0 foo
1 bar
2 baz

So, in the code above, idx is 0 for all points labeled "no" and 1 for all points labeled "yes". ListedColormap builds a colormap from a list of colors, so

cmap = ListedColormap(colors[:len(np.unique(y))])

keeps only as many colors as there are classes. cmap(0) returns the first color of that list, and it is given to all the scattered points of the first category.

When the loop runs, all data points belonging to a particular category are drawn with the same color and marker.
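To see what the colormap holds, it can be built and queried on its own. This is a sketch with a hypothetical two-class label vector; to_rgba is used here only to show the same colors in RGBA form for comparison:

```python
import numpy as np
from matplotlib.colors import ListedColormap, to_rgba

colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
y = np.array([0, 1, 1, 0])  # hypothetical labels with two classes

# Slice the color tuple so the map has one color per class
cmap = ListedColormap(colors[:len(np.unique(y))])

print(cmap.N)    # number of colors in the map
print(cmap(0))   # RGBA tuple of the first color
print(cmap(1))   # RGBA tuple of the second color
print(to_rgba('red'), to_rgba('blue'))
```

Calling the colormap with an integer looks the color up by position, which is why the loop index idx can be passed to it directly.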

label sets the text shown for each class in the legend, which tells us what class a particular color and marker refers to. The legend itself appears once plt.legend() is called.

This is how classifiers are generally visualized.
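For completeness, here is a sketch that runs the whole function end to end. ThresholdClassifier, the random data and the output file name are hypothetical stand-ins, not part of the original code; any object with a predict method, such as a trained perceptron, could be passed instead. The sketch passes color=cmap(idx) rather than c=cmap(idx), an adjustment made here because newer Matplotlib versions may warn when a single RGBA tuple is given as c:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap


class ThresholdClassifier:
    """Hypothetical stand-in for a trained perceptron: class 1 if x1 + x2 > 0."""

    def predict(self, X):
        return np.where(X[:, 0] + X[:, 1] > 0.0, 1, 0)


def plot_decision_regions(X, y, classifier, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    classes = np.unique(y)
    cmap = ListedColormap(colors[:len(classes)])

    # grid covering the data with a margin of 1 on each side
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))

    # predict a class for every grid point and color the regions
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # overlay the samples, one scatter call per class
    for idx, cl in enumerate(classes):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8,
                    color=cmap(idx), marker=markers[idx], label=str(cl))


# Hypothetical data: 40 random points labeled by the stand-in classifier
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
classifier = ThresholdClassifier()
y = classifier.predict(X)

plot_decision_regions(X, y, classifier, resolution=0.05)
plt.legend(loc='upper left')
plt.savefig('decision_regions.png')
```

In an interactive session, plt.show() can replace the Agg backend line and the savefig call.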

Nikhila Munipalli answered Oct 02 '22 14:10
