I am trying to build a tree with Biopython's Phylo module.
What I've done so far is shown in this image:
Each name has a four-digit number followed by - and a number; this second number is the number of times that sequence is represented. For example, for 1578 - 22, the node should represent 22 sequences.
The file with the aligned sequences: file
The file with the distances used to build the tree: file
I already know how to change the size of the nodes. Each node has a different size, which is easy to do by building an array of the different values:
    fh = open(MEDIA_ROOT + "groupsnp.txt")
    list_size = {}
    for line in fh:
        if '>' in line:
            labels = line.split('>')
            label = labels[-1]
            label = label.split()
            num = line.split('-')
            size = num[-1]
            size = size.split()
            for lab in label:
                for number in size:
                    list_size[lab] = int(number)
    a = array(list_size.values())
But the order of the array is arbitrary. I would like to assign the correct size to the right node, so I tried this:
    for elem in list_size.keys():
        if labels == elem:
            Phylo.draw_graphviz(tree_xml, prog="neato", node_size=a)
but nothing appears when I use the if statement.
Is there any way of doing this?
I would really appreciate any help!
Thanks, everybody.
I finally got this working. The basic premise is that you're going to use the labels/nodelist to build your node_sizes. This way they correlate properly. I'm sure I'm missing some important options to make the tree look 100%, but it appears the node sizes are showing up properly.
    # basically a stripped down rewrite of Phylo.draw_graphviz
    import networkx, pylab
    from Bio import Phylo

    # taken from draw_graphviz
    def get_label_mapping(G, selection):
        for node in G.nodes():
            if (selection is None) or (node in selection):
                try:
                    label = str(node)
                    if label not in (None, node.__class__.__name__):
                        yield (node, label)
                except (LookupError, AttributeError, ValueError):
                    pass

    kwargs = {}
    tree = Phylo.read('tree.dnd', 'newick')
    G = Phylo.to_networkx(tree)
    Gi = networkx.convert_node_labels_to_integers(G, discard_old_labels=False)

    node_sizes = []
    labels = dict(get_label_mapping(G, None))
    kwargs['nodelist'] = labels.keys()

    # create our node sizes based on our labels because the labels are used for the node_list
    # this way they should be correct
    for label in labels.keys():
        if str(label) != "Clade":
            num = label.name.split('-')
            # the times 50 is just a guess on what would look best
            size = int(num[-1]) * 50
            node_sizes.append(size)

    kwargs['node_size'] = node_sizes
    posi = networkx.pygraphviz_layout(Gi, 'neato', args='')
    posn = dict((n, posi[Gi.node_labels[n]]) for n in G)
    networkx.draw(G, posn, labels=labels, node_color='#c0deff', **kwargs)
    pylab.show()
Resulting Tree
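The key point of the answer, that the size list has to be built in the same order as the node list, can be shown without any plotting library. This is only a minimal sketch: the helper names parse_count and sizes_in_node_order are made up for illustration, the labels are assumed to look like 1578-22, and the factor of 50 is the same guess used in the answer's code.

```python
def parse_count(name, default=1):
    """Return the integer after the last '-' in a label such as '1578-22'.

    Labels without a dash (or with a non-numeric suffix) fall back to
    `default`; that fallback is an assumption made for this sketch.
    """
    if not name or '-' not in name:
        return default
    try:
        return int(name.rsplit('-', 1)[1].strip())
    except ValueError:
        return default


def sizes_in_node_order(names, scale=50):
    """Build a size list whose i-th entry belongs to the i-th name."""
    return [parse_count(name) * scale for name in names]


# Hypothetical leaf labels, in the order they would be passed as nodelist
names = ['1578-22', '0042-3', '0907-1']
print(sizes_in_node_order(names))  # [1100, 150, 50]
```

Because the sizes are derived by iterating over the names themselves, position i in the size list always belongs to position i in the node list, which is what an arbitrary dictionary ordering did not guarantee in the question.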