
How to add labels to t-SNE in python

I'm using t-SNE to look for relationships in a dataset that has seven features.


I'm using a dictionary to assign colors to the y labels on the plot:

encoding = {'d0': 0, 'd1': 1, 'd2': 2, 'd3': 3, 'd4': 4, 'd5': 5, 'd6': 6, 'd7': 7}

plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y['label'].apply(lambda x: encoding[x]))
plt.show()

The problem is that it is not clear which color corresponds to which label. The dataset actually has over 100 labels, so it's not something I'd like to handle manually.


Luis Ramon Ramirez Rodriguez asked Oct 18 '17 21:10



1 Answer

You can plot each category separately on the same axes, and let Matplotlib generate the colors and legend:

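import matplotlib.pyplot as plt
import pandas as pd

fig, ax = plt.subplots()

# One scatter call per category; y['label'] is the label column used in the question.
groups = pd.DataFrame(X_tsne, columns=['x', 'y']).assign(category=y['label'].to_numpy()).groupby('category')
for name, points in groups:
    ax.scatter(points.x, points.y, label=name)

ax.legend()
plt.show()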

For randomly generated X, this gives a plot in which each category is drawn in its own color and listed in the legend.
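
With over 100 labels, as in the question, the loop above will start reusing colors, because Matplotlib's default color cycle has only 10 entries. Here is a minimal sketch of one way around that: sample one color per category from a continuous colormap and move the legend outside the axes. The random stand-in data, the `nipy_spectral` colormap, the legend layout, and the output filename are all illustrative assumptions, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot in a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data (assumption): 500 embedded points and up to 100 label names.
rng = np.random.default_rng(0)
X_tsne = rng.normal(size=(500, 2))
labels = rng.choice([f"d{i}" for i in range(100)], size=500)

df = pd.DataFrame(X_tsne, columns=["x", "y"]).assign(category=labels)
groups = df.groupby("category")

# Sample one RGBA color per category from a continuous colormap,
# since the default color cycle would repeat after 10 categories.
cmap = plt.get_cmap("nipy_spectral")
colors = cmap(np.linspace(0, 1, groups.ngroups))

fig, ax = plt.subplots(figsize=(12, 6))
for color, (name, points) in zip(colors, groups):
    ax.scatter(points.x, points.y, color=color, label=name, s=12)

# Place the legend to the right of the axes and split it into columns.
legend = ax.legend(ncol=5, fontsize="x-small", loc="center left", bbox_to_anchor=(1.0, 0.5))
fig.savefig("tsne_labels.png", bbox_inches="tight")
```

Even with distinct colors, 100 neighboring hues are hard to tell apart by eye, so treat the legend as a rough guide; annotating a few clusters directly with `ax.annotate` may read better.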

Igor Raush answered Oct 22 '22 03:10
