
Python scatter-plot: Conditions for marker styles?

I have a data set that I want to plot as a scatter plot with matplotlib, and a vector of the same size that categorizes and labels the data points (discretely, e.g. from 0 to 3). I want to use different markers for different labels (e.g. 'x' for 0, 'o' for 1, and so on). How can I solve this elegantly? I am quite sure I am just missing something, but I couldn't find it, and my naive approaches have failed so far...

lu_siyah asked Jan 14 '23 07:01



2 Answers

What about iterating over all markers like this:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.rand(100)
y = np.random.rand(100)
category = np.random.randint(0, 4, 100)  # labels 0..3 (upper bound is exclusive; random_integers is deprecated)

markers = ['s', 'o', 'h', '+']  # marker for category 0, 1, 2, 3
for k, m in enumerate(markers):
    mask = (category == k)  # boolean mask selecting the points in category k
    plt.scatter(x[mask], y[mask], marker=m)

plt.show()
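
If the legend should also show which marker belongs to which label, the same loop can be wrapped in a small helper. This is only a sketch: the function name scatter_by_category and the marker_map dictionary are made up for illustration, and the random data stands in for real measurements.

```python
import matplotlib.pyplot as plt
import numpy as np


def scatter_by_category(ax, x, y, category, marker_map):
    """Draw one scatter call per label, since each call takes a single marker."""
    x, y, category = np.asarray(x), np.asarray(y), np.asarray(category)
    artists = {}
    for label, marker in marker_map.items():
        mask = category == label  # boolean mask selecting this label's points
        artists[label] = ax.scatter(x[mask], y[mask], marker=marker, label=str(label))
    ax.legend(title="category")
    return artists


rng = np.random.default_rng(0)
x = rng.random(100)
y = rng.random(100)
category = rng.integers(0, 4, 100)  # labels 0..3

# Hypothetical label -> marker choice, following the question's 'x' for 0, 'o' for 1.
marker_map = {0: "x", 1: "o", 2: "s", 3: "+"}

fig, ax = plt.subplots()
artists = scatter_by_category(ax, x, y, category, marker_map)
```

Call plt.show() afterwards to display the figure. Keeping the mapping in a dictionary means the labels do not have to be consecutive integers starting at 0.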
David Zwicker answered Jan 21 '23 07:01



Matplotlib does not accept different markers within a single scatter call: the marker argument applies to every point drawn by that call.

However, a less verbose and more robust solution for large datasets is to use the pandas and seaborn libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

x = [48.959, 49.758, 49.887, 50.593, 50.683 ]
y = [122.310, 121.29, 120.525, 120.252, 119.509]
z = [136.993, 133.128, 143.710, 129.088, 139.860]
kmean = np.array([0, 1, 0, 2, 2])

df = pd.DataFrame({'x': x, 'y': y, 'z': z, 'km_z': kmean})
# hue and style both follow km_z, so each label gets its own color and marker
sns.scatterplot(data=df, x='x', y='y', hue='km_z', style='km_z')
plt.show()

which produces the following output:

[image: scatter plot with one color and one marker style per km_z value]
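
To decide which marker goes with which label, rather than accepting seaborn's defaults, scatterplot takes a markers argument that can be a dictionary keyed by the levels of the style variable. The sketch below assumes the same data as above; the marker_map name and the particular markers are illustrative choices. As far as I know, seaborn refuses to mix filled markers such as 'o' with line-art markers such as 'x' in one style mapping, so this example sticks to filled ones.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "x": [48.959, 49.758, 49.887, 50.593, 50.683],
    "y": [122.310, 121.29, 120.525, 120.252, 119.509],
    "km_z": [0, 1, 0, 2, 2],
})

# Hypothetical mapping from label to marker; all three are filled markers.
marker_map = {0: "o", 1: "s", 2: "^"}

ax = sns.scatterplot(
    data=df, x="x", y="y",
    hue="km_z", style="km_z",  # color and marker both follow the label
    markers=marker_map,
)
```

Call plt.show() afterwards to display the figure.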

Additionally, you can use the pandas.cut function to plot bins (it's something I regularly need when producing graphs where a third continuous value serves as a parameter). It can be used like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
x = [48.959, 49.758, 49.887, 50.593, 50.683 ]
y = [122.310, 121.29, 120.525, 120.252, 119.509]
z = [136.993, 133.128, 143.710, 129.088, 139.860]

df = pd.DataFrame({'x': x, 'y': y, 'z': z})
# split the continuous z values into 3 equal-width intervals
df['bins'] = pd.cut(df.z, bins=3)
sns.scatterplot(data=df, x='x', y='y', hue='bins', style='bins')
plt.show()

and it produces the following example:

[image: scatter plot with one color and one marker style per z bin]
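
It can help to look at what pandas.cut returns before plotting. By default the bins are interval objects, which end up as rather long legend entries; passing labels gives each bin a readable name. This is a minimal sketch using the z values from above, and the names "low", "mid" and "high" are assumptions chosen for illustration.

```python
import pandas as pd

z = [136.993, 133.128, 143.710, 129.088, 139.860]
df = pd.DataFrame({"z": z})

# Three equal-width bins over the range of z, named instead of shown as intervals.
df["bins"] = pd.cut(df["z"], bins=3, labels=["low", "mid", "high"])

print(df)
print(df["bins"].value_counts(sort=False))  # number of points per bin
```

The resulting bins column can be passed to hue and style exactly as in the snippet above.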


I've used the latter method to produce graphs like the following:

[image: example graph produced with this method]

NMech answered Jan 21 '23 06:01
