I have two numpy arrays, x and y, with 7000 elements each. I want to make a scatter plot of them giving each point a different color depending on these conditions:
-BLACK if x[i]<10.
-RED if x[i]>=10 and y[i]<=-0.5
-BLUE if x[i]>=10 and y[i]>-0.5
I tried creating a list of the same length as the data with the color I want to assign to each point and then plot the data with a loop, but it takes me a long time to run it. Here's my code:
import numpy as np
import matplotlib.pyplot as plt
#color list with same length as the data
col=[]
for i in range(0,len(x)):
    if x[i]<10:
        col.append('k')
    elif x[i]>=10 and y[i]<=-0.5:
        col.append('r')
    else:
        col.append('b')
#scatter plot
for i in range(len(x)):
    plt.scatter(x[i],y[i],c=col[i],s=5, linewidth=0)
#add horizontal line and invert y-axis
plt.gca().invert_yaxis()
plt.axhline(y=-0.5,linewidth=2,c='k')
Before that, I tried creating the same color list in the same way, but plotting the data without the loop:
#scatter plot
plt.scatter(x,y,c=col,s=5, linewidth=0)
Even though this plots the data much, much faster than the for loop, some of the scattered points appear with the wrong color. Why does plotting the data without a loop lead to incorrect colors for some points?
I also tried defining three sets of data, one for each color, and adding them to the plot separately. But this is not what I am looking for.
Is there a way to pass the list of colors I want for each point as an argument to scatter, so that I don't need the for loop?
PS: This is the plot I get when I don't use the for loop (wrong one):
And this one when I use the for loop (correct):
This can be done using numpy.where. Since I do not know your exact x and y values, I will have to use some fake data:
import numpy as np
import matplotlib.pyplot as plt
#generate some fake data
x = np.random.random(10000)*10
y = np.random.random(10000)*10
col = np.where(x<1,'k',np.where(y<5,'b','r'))
plt.scatter(x, y, c=col, s=5, linewidth=0)
plt.show()
This produces the plot below:
The line col = np.where(x<1,'k',np.where(y<5,'b','r')) is the important one. It produces an array of the same size as x and y, filled with 'k', 'b' or 'r' depending on the conditions. Where x is less than 1, the entry is 'k'; otherwise, where y is less than 5, the entry is 'b'; and where neither condition is met, the entry is 'r'. This way, you do not have to use a loop to plot your graph.
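To make the element-wise behaviour concrete, here is a small sketch on four hand-picked points. The values are made up for illustration and are not taken from the question's data.

```python
import numpy as np

# Tiny hand-picked arrays (illustrative values only)
x = np.array([0.5, 3.0, 0.2, 7.0])
y = np.array([9.0, 2.0, 1.0, 8.0])

# Outer condition is checked first; the inner np.where result is
# only used at positions where x < 1 is False
col = np.where(x < 1, 'k', np.where(y < 5, 'b', 'r'))

print(col)        # ['k' 'b' 'k' 'r']
print(type(col))  # a NumPy array, not a Python list
```

The result has one color code per point, in the same order as x and y, so it can be passed directly as the c argument of plt.scatter.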
For your specific data you will have to change the values in the conditions of np.where.
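As a rough sketch of that change, the conditions from the question (black if x < 10, red if x >= 10 and y <= -0.5, blue otherwise) could be written as below. The helper name point_colors and the random stand-in data are assumptions for illustration; the real x and y from the question are not available here.

```python
import numpy as np

def point_colors(x, y):
    """Return one color code per point (hypothetical helper).

    'k' where x < 10, 'r' where x >= 10 and y <= -0.5, 'b' otherwise.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    return np.where(x < 10, 'k', np.where(y <= -0.5, 'r', 'b'))

# Stand-in data with 7000 points (assumed ranges, not the asker's values)
rng = np.random.default_rng(0)
x = rng.uniform(0, 20, 7000)
y = rng.uniform(-2, 1, 7000)

col = point_colors(x, y)

# Cross-check against the explicit loop from the question
expected = []
for xi, yi in zip(x, y):
    if xi < 10:
        expected.append('k')
    elif yi <= -0.5:
        expected.append('r')
    else:
        expected.append('b')

print(col.tolist() == expected)
```

The resulting array can then be used in a single call such as plt.scatter(x, y, c=col, s=5, linewidth=0), with no plotting loop.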