Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Non-overlapping scatter plot labels using matplotlib

I have a scatter plot with a number of points. Each point has a string associated with it (varying in length) that I'd like to supply a label, but I can't fit them all. So I'd like to iterating through my data points from most to least important, and in each case apply a label only if it would not overlap as existing label. The strings vary in length. One of the commenters mentions solving a knapsack problem to find an optimal solution. In my case the greedy algorithm (always label the most important remaining point that can be labeled without overlap) would be a good start and might suffice.

Here's a toy example. Could I get Python to label only as many points as it can without overlapping?

import matplotlib.pylab as plt, numpy as np

npoints = 100
xs = np.random.rand(npoints)
ys = np.random.rand(npoints)

plt.scatter(xs, ys)

labels = iter(dir(np))
for x, y, in zip(xs, ys):
    # Ideally I'd condition the next line on whether or not the new label would overlap with an existing one
    plt.annotate(labels.next(), xy = (x, y))
plt.show()
like image 395
kuzzooroo Avatar asked Aug 26 '14 22:08

kuzzooroo


1 Answers

You can draw all the annotates first, and then use a mask array to check the overlap and use set_visible() to hide. Here is an example:

import numpy as np
import pylab as pl
import random
import string
import math
random.seed(0)
np.random.seed(0)
n = 100
labels = ["".join(random.sample(string.ascii_letters, random.randint(4, 10))) for _ in range(n)]
x, y = np.random.randn(2, n)

fig, ax = pl.subplots()

ax.scatter(x, y)

ann = []
for i in range(n):
    ann.append(ax.annotate(labels[i], xy = (x[i], y[i])))

mask = np.zeros(fig.canvas.get_width_height(), bool)

fig.canvas.draw()

for a in ann:
    bbox = a.get_window_extent()
    x0 = int(bbox.x0)
    x1 = int(math.ceil(bbox.x1))
    y0 = int(bbox.y0)
    y1 = int(math.ceil(bbox.y1))

    s = np.s_[x0:x1+1, y0:y1+1]
    if np.any(mask[s]):
        a.set_visible(False)
    else:
        mask[s] = True

the output:

enter image description here

like image 78
HYRY Avatar answered Oct 27 '22 06:10

HYRY