
Python Matplotlib scatter plot: Specify color points depending on conditions [duplicate]

I have two numpy arrays, x and y, with 7000 elements each. I want to make a scatter plot of them giving each point a different color depending on these conditions:

-BLACK if x[i]<10.

-RED if x[i]>=10 and y[i]<=-0.5

-BLUE if x[i]>=10 and y[i]>-0.5 

I tried creating a list of the same length as the data with the color I want to assign to each point and then plot the data with a loop, but it takes me a long time to run it. Here's my code:

import numpy as np
import matplotlib.pyplot as plt

#color list with same length as the data
col=[]
for i in range(0,len(x)):
    if x[i]<10:
        col.append('k') 
    elif x[i]>=10 and y[i]<=-0.5:
        col.append('r') 
    else:
        col.append('b') 

#scatter plot
for i in range(len(x)):
    plt.scatter(x[i],y[i],c=col[i],s=5, linewidth=0)

#add horizontal line and invert y-axis
plt.gca().invert_yaxis()
plt.axhline(y=-0.5,linewidth=2,c='k')

Before that, I tried creating the same color list in the same way, but plotting the data without the loop:

#scatter plot
plt.scatter(x,y,c=col,s=5, linewidth=0)

Even though this plots the data much, much faster than the for loop, some of the points appear with the wrong color. Why does plotting the data without a loop give some points the wrong color?

I also tried defining three sets of data, one for each color, and adding them to the plot separately. But this is not what I am looking for.

Is there a way to pass the list of per-point colors to scatter through its arguments, so that I do not need the for loop?

PS: This is the plot I get when I don't use the for loop (wrong one):

[Figure: scatter plot made without the for loop]

And this one when I use the for loop (correct):

[Figure: scatter plot made with the for loop]

Argumanez asked Nov 25 '16 11:11




1 Answer

This can be done using numpy.where. Since I do not have your exact x and y values, I will have to use some fake data:

import numpy as np
import matplotlib.pyplot as plt

#generate some fake data
x = np.random.random(10000)*10
y = np.random.random(10000)*10

col = np.where(x<1,'k',np.where(y<5,'b','r'))

plt.scatter(x, y, c=col, s=5, linewidth=0)
plt.show()

This produces the plot below:

[Figure: scatter plot produced by the code above]

The line col = np.where(x<1,'k',np.where(y<5,'b','r')) is the important one. It produces an array of the same size as x and y, filled with 'k', 'b' or 'r' depending on the conditions. Where x is less than 1 the entry is 'k'; otherwise, where y is less than 5 the entry is 'b'; and where neither condition is met the entry is 'r'. This way, you do not have to use a loop to plot your graph.
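To see how the nested call picks a color per point, here is a minimal sketch on three hand-made points. The values are illustrative assumptions, not your data, and the inner call is split out into its own variable (inner, a name introduced only for this example) so each step is visible:

```python
import numpy as np

# Tiny hand-made arrays (illustrative values, not the asker's data)
x = np.array([0.5, 2.0, 3.0])
y = np.array([7.0, 4.0, 6.0])

# Inner call: 'b' where y < 5, otherwise 'r'
inner = np.where(y < 5, 'b', 'r')

# Outer call: 'k' where x < 1, otherwise whatever the inner call chose
col = np.where(x < 1, 'k', inner)

print(col.tolist())  # ['k', 'b', 'r']
```

The first point has x below 1, so it is 'k' regardless of y; the other two fall through to the inner condition on y.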

For your specific data you will have to change the values in the conditions of np.where.
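As a rough sketch, the conditions from the question (black if x < 10, red if x >= 10 and y <= -0.5, blue otherwise) might translate as follows. The helper name point_colors and its parameters x_cut and y_cut are hypothetical, introduced only for this example, and the sample values are made up:

```python
import numpy as np

def point_colors(x, y, x_cut=10, y_cut=-0.5):
    """Hypothetical helper: one color code per point, using the question's rules."""
    x = np.asarray(x)
    y = np.asarray(y)
    # 'k' where x < x_cut; otherwise 'r' where y <= y_cut, else 'b'
    return np.where(x < x_cut, 'k', np.where(y <= y_cut, 'r', 'b'))

# Made-up sample points, including the boundary case x == 10, y == -0.5
x = np.array([5.0, 12.0, 15.0, 10.0])
y = np.array([0.0, -1.0, 0.3, -0.5])

col = point_colors(x, y)
print(col.tolist())  # ['k', 'r', 'b', 'r']
```

The resulting array can then be passed as c=col to a single plt.scatter(x, y, c=col, s=5, linewidth=0) call, as in the code above.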

DavidG answered Oct 08 '22 02:10