Plotting the score matrix from a Needleman-Wunsch pairwise sequence alignment in matplotlib

Question

I'm trying to draw the score matrix produced by the global alignment algorithm (the Needleman–Wunsch algorithm) in Python.

I don't know if matplotlib is the best tool for this. I tried Bokeh, but I found it too difficult to lay out the matrix the way I wanted.

I'm using Bio.SeqIO (the standard Sequence Input/Output interface for BioPython) to store two sequences.

I want to get a result similar to this image:

enter image description here

Is that possible in Matplotlib? How can I do that?

UPDATE

Finally I was able to construct the algorithm from the answer given by ImportanceOfBeingErnest. Here is the result:

enter image description here

Here is the gist for this implementation: plot_needleman_wunsch.py

And here is the whole project (Work in progress): bmc-sequence-alignment

ImportanceOfBeingErnest · Accepted Answer

The question gives no clear description of the algorithm used to place the arrows, so this answer focuses on how to produce a similar plot in matplotlib.

enter image description here

The idea here is to place the numbers at integer positions in the plot and draw minor gridlines at n+0.5 to obtain the table-like appearance. The arrows are drawn as annotations between positions defined in a 4-column array (first two columns: x and y of the arrow's start; third and fourth columns: x and y of the arrow's end).

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

np.random.seed(5)  # fixed seed so the random demo data is reproducible
plt.rcParams["figure.figsize"] = 4,5
param = {"grid.linewidth" : 1.6,
         "grid.color"     : "lightgray",
         "axes.linewidth" : 1.6,
         "axes.edgecolor" : "lightgray"}
plt.rcParams.update(param)

#Data
headh = list("GATCCA")
headv = list("GTGCCT")

v = np.zeros((7,7), dtype=int)
v[1:,1:] = np.random.randint(-2,7, size=(6,6))

arrows = np.random.randint(0,v.shape[1], size=(14,4))
opt = np.array([(0,1),(1,0),(1,1)])
arrows[:,2:] = arrows[:,:2] + opt[np.random.randint(0,3,size=14 )]

arrowsb = np.random.randint(0,v.shape[1], size=(7,4))
optb = np.array([(0,1),(1,0),(1,1)])
arrowsb[:,2:] = arrowsb[:,:2] + optb[np.random.randint(0,3,size=7 )]

#Plot
fig, ax=plt.subplots()
ax.set_xlim(-1.5, v.shape[1]-.5 )
ax.set_ylim(-1.5, v.shape[0]-.5 )
ax.invert_yaxis()
for i in range(v.shape[0]):
    for j in range(v.shape[1]):
        ax.text(j,i,v[i,j], ha="center", va="center")
for i, l in enumerate(headh):
    ax.text(i+1,-1,l, ha="center", va="center", fontweight="semibold")
for i, l in enumerate(headv):
    ax.text(-1,i+1,l, ha="center", va="center", fontweight="semibold")

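# Minor ticks at n+0.5 carry the gridlines; the y locator follows the row count.
ax.xaxis.set_minor_locator(ticker.FixedLocator(np.arange(-1.5, v.shape[1]-.5,1)))
ax.yaxis.set_minor_locator(ticker.FixedLocator(np.arange(-1.5, v.shape[0]-.5,1)))
# Use booleans here: the strings 'off'/'on' are no longer accepted as False/True.
plt.tick_params(axis='both', which='both', bottom=False, top=False,
                left=False, right=False, labelbottom=False, labelleft=False)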
ax.grid(True, which='minor')


arrowprops=dict(facecolor='crimson',alpha=0.5, lw=0, 
                shrink=0.2,width=2, headwidth=7,headlength=7)
for i in range(arrows.shape[0]):
    ax.annotate("", xy=arrows[i,2:], xytext=arrows[i,:2], arrowprops=arrowprops)
arrowprops.update(facecolor='blue')
for i in range(arrowsb.shape[0]):
    ax.annotate("", xy=arrowsb[i,2:], xytext=arrowsb[i,:2], arrowprops=arrowprops)
plt.show()
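
The script above uses random numbers for both the scores and the arrows. As a rough sketch of how real data could be produced in the same layout, the following fills a Needleman-Wunsch score matrix and records, for every cell, an arrow from each predecessor that yields its best score. The function names `needleman_wunsch` and `optimal_path` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; they are not taken from the question or from the linked gist. Arrow rows follow the same 4-column convention as above (x, y of the start, then x, y of the end, with x the column and y the row).

```python
import numpy as np


def needleman_wunsch(seq_h, seq_v, match=1, mismatch=-1, gap=-1):
    """Fill a global-alignment score matrix and collect predecessor arrows.

    seq_h runs along the columns (top header), seq_v along the rows.
    The default scores are illustrative assumptions.
    """
    n_rows, n_cols = len(seq_v) + 1, len(seq_h) + 1
    v = np.zeros((n_rows, n_cols), dtype=int)
    v[0, :] = gap * np.arange(n_cols)  # first row: only gaps
    v[:, 0] = gap * np.arange(n_rows)  # first column: only gaps

    # Border cells have exactly one predecessor each.
    arrows = [(j - 1, 0, j, 0) for j in range(1, n_cols)]
    arrows += [(0, i - 1, 0, i) for i in range(1, n_rows)]

    for i in range(1, n_rows):
        for j in range(1, n_cols):
            sub = match if seq_v[i - 1] == seq_h[j - 1] else mismatch
            # Keys are the predecessor position as (x, y) = (column, row).
            candidates = {
                (j - 1, i - 1): v[i - 1, j - 1] + sub,  # diagonal
                (j, i - 1): v[i - 1, j] + gap,          # from above
                (j - 1, i): v[i, j - 1] + gap,          # from the left
            }
            best = max(candidates.values())
            v[i, j] = best
            # Ties produce several arrows into the same cell.
            for (x0, y0), score in candidates.items():
                if score == best:
                    arrows.append((x0, y0, j, i))
    return v, np.array(arrows)


def optimal_path(arrows, end):
    """Follow one predecessor per cell from `end` (x, y) back to (0, 0)."""
    preds = {}
    for x0, y0, x1, y1 in arrows.tolist():
        # Keep the first predecessor recorded for each cell.
        preds.setdefault((x1, y1), (x0, y0))
    path = []
    x, y = end
    while (x, y) != (0, 0):
        x0, y0 = preds[(x, y)]
        path.append((x0, y0, x, y))
        x, y = x0, y0
    return np.array(path[::-1])


v, arrows = needleman_wunsch("GATCCA", "GTGCCT")
arrowsb = optimal_path(arrows, end=(v.shape[1] - 1, v.shape[0] - 1))
```

With these three arrays in place of the random `v`, `arrows` and `arrowsb`, the plotting part of the script can stay unchanged: the red arrows would then show every optimal predecessor and the blue arrows one traceback path. When several paths tie, this sketch picks only one of them.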

Tags: python, matplotlib, bioinformatics, biopython

Kevin Hernández

1 Answer

ImportanceOfBeingErnest