Suppose I have the 3x3 matrix below:
[apples 19 3.5]
[oranges 07 2.2]
[grapes 23 7.8]
Only in real life the matrix has dozens of rows, not just three.
I want to create an XY plot where the second column is the X coordinate, the third column is the Y coordinate, and the words themselves (i.e., the first column) are the markers (so no dots, lines, or any other symbols).
I also want the font size of each word to be determined by the second column (in the example above, that means making "grapes" have about three times the size of "oranges", for instance).
Finally, I want to color the words on a red-to-blue scale corresponding to the third column, with 0 = darkest red and 10 = darkest blue.
What's the best way to go about it in Python 2.x? I know I can use matplotlib's "annotate" and "text" to do many (if not all) of those things, but somehow that feels like a workaround. Surely there must be a way of declaring the words to be markers (so I don't have to treat them as "annotations")? Perhaps something outside matplotlib? Has anyone out there ever done something similar?
All of the line properties can be controlled by keyword arguments. For example, you can set the color, marker, linestyle, and markercolor with: plot(x, y, color='green', linestyle='dashed', marker='o', markerfacecolor='blue', markersize=12).
As you did not want to use annotate or text, the next best thing is py.scatter, which will accept a marker of the form ``'$...$'`` (render the string using mathtext).
For example:
import pylab as py
data = [["peach", 1.0, 1.0],
        ["apples", 19, 3.5],
        ["oranges", 7, 2.2],
        ["grapes", 23, 7.8]]
for item in data:
    py.scatter(item[1], item[2], s=700*item[1],
               c=(item[2]/10.0, 0, 1 - item[2]/10.0),
               marker=r"$ {} $".format(item[0]), edgecolors='none')
py.show()
This method has several issues:
Using \textrm{} in the math text so that it is not italic appears to break matplotlib.
It would probably be better to use a colormap rather than simply defining the RGB color value.
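The colormap suggestion can be sketched roughly as follows. This is only an illustration, assuming the third column always lies between 0 and 10 as the question states; `word_color` is a hypothetical helper name, and `RdBu` is one built-in colormap that runs from dark red to dark blue.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

data = [["peach", 1.0, 1.0],
        ["apples", 19, 3.5],
        ["oranges", 7, 2.2],
        ["grapes", 23, 7.8]]

# Assumption: third-column values fall in [0, 10], as in the question.
norm = Normalize(vmin=0, vmax=10)
cmap = plt.get_cmap("RdBu")  # low end is dark red, high end is dark blue


def word_color(value):
    """Map a third-column value to an RGBA tuple (hypothetical helper)."""
    return cmap(norm(value))


fig, ax = plt.subplots()
for word, x, y in data:
    # Same mathtext marker as above, but colored through the colormap.
    ax.scatter(x, y, s=700 * x, color=word_color(y),
               marker=r"$ {} $".format(word), edgecolors="none")
# Call plt.show() to display the figure interactively.
```

Using `Normalize` with fixed limits keeps the color of a given value stable even if the data set changes, which a scale based on the data maximum would not.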
While looking around for a solution to the same problem, I've found one that seems a bit cleaner (or at least more in the spirit of what the original question asked), namely to use TextPath:
from matplotlib import pyplot as plt
from matplotlib.text import TextPath
data = [["peach", 1.0, 1.0],
        ["apples", 19, 3.5],
        ["oranges", 7, 2.2],
        ["grapes", 23, 7.8]]
max_d2 = max([d[2] for d in data]) + 1e-3
max_d1 = max([d[1] for d in data]) + 1e-3
cmap = plt.get_cmap('RdBu')
for d in data:
    path = TextPath((0,0), d[0])
    # These dots are to display the weakness below, remove for the actual question
    plt.plot(d[1],d[2],'.',color='k')
    plt.plot(d[1],d[2],marker=path,markersize=100, color=cmap(d[2]/max_d2))
plt.xlim([0,max_d1+5])
plt.ylim([0,max_d2+0.5])
plt.show()
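The example above uses a fixed marker size, and the black dots seem intended to show where each word sits relative to its data point. Here is a minimal sketch of one way to tie the size to the second column and to center the word on the point. `centered_text_marker` and `points_per_unit` are hypothetical names chosen for illustration, and the 0 to 10 color range is taken from the question.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
from matplotlib.text import TextPath
from matplotlib.transforms import Affine2D

data = [["peach", 1.0, 1.0],
        ["apples", 19, 3.5],
        ["oranges", 7, 2.2],
        ["grapes", 23, 7.8]]


def centered_text_marker(word):
    """Return a path for `word` whose bounding-box center is at (0, 0)."""
    path = TextPath((0, 0), word)
    box = path.get_extents()
    cx = (box.x0 + box.x1) / 2.0
    cy = (box.y0 + box.y1) / 2.0
    # Shift the outline so the marker is anchored at its center.
    return path.transformed(Affine2D().translate(-cx, -cy))


norm = Normalize(vmin=0, vmax=10)   # assumption: third column in [0, 10]
cmap = plt.get_cmap("RdBu")
points_per_unit = 4.0               # hypothetical scale factor, tune by eye

fig, ax = plt.subplots()
for word, x, y in data:
    ax.plot(x, y, linestyle="none", marker=centered_text_marker(word),
            markersize=points_per_unit * x, color=cmap(norm(y)))
ax.set_xlim(0, 30)
ax.set_ylim(0, 10)
# Call plt.show() to display the figure interactively.
```

One caveat, as far as I can tell: `markersize` controls the overall extent of the marker rather than a font size, so a long word and a short word with the same `markersize` will not have the same letter height.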
This solution has some advantages and disadvantages of its own:
Code:
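import numpy as np
from matplotlib import pyplot as plt
from matplotlib.text import TextPath
# Five random walks of 100 steps each, one per column
x = np.cumsum(np.random.randn(100,5), axis=0)
plt.figure(figsize=(15,5))
for i in range(5):
    # Use the column index as the marker glyph
    label = TextPath((0,0), str(i))
    plt.plot(x[:,i], color='k')
    # Place a text marker on every fifth sample only
    plt.plot(np.arange(0,len(x),5), x[::5,i], color='k', marker=label, markersize=15, linewidth=0)
plt.show()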
Doing the above via a naive loop over "text" or "annotate" would be very slow if you had many lines / markers, while this scales better.
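For comparison, the plain "text" approach mentioned in the question looks roughly like the sketch below. It is not a marker-based solution, but for a few dozen words it may be the simplest option, since the font size is set directly in points. The choice of using the second column itself as the font size is an assumption for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

data = [["peach", 1.0, 1.0],
        ["apples", 19, 3.5],
        ["oranges", 7, 2.2],
        ["grapes", 23, 7.8]]

norm = Normalize(vmin=0, vmax=10)   # assumption: third column in [0, 10]
cmap = plt.get_cmap("RdBu")

fig, ax = plt.subplots()
for word, x, y in data:
    # Font size in points is taken straight from the second column.
    ax.text(x, y, word, fontsize=x, color=cmap(norm(y)),
            ha="center", va="center")
# Text objects do not affect autoscaling, so set the limits explicitly.
ax.set_xlim(0, 30)
ax.set_ylim(0, 10)
# Call plt.show() to display the figure interactively.
```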