I have a list of lists of lists like this:
matches = [[['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', 'Firmicutes'], ['class', 'Clostridia'], ['order', 'Clostridiales'], ['family', 'Lachnospiraceae'], ['genus', 'Lachnospira']],
[['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', '"Proteobacteria"'], ['class', 'Gammaproteobacteria'], ['order', '"Vibrionales"'], ['family', 'Vibrionaceae'], ['genus', 'Catenococcus']],
[['rootrank', 'Root'], ['domain', 'Archaea'], ['phylum', '"Euryarchaeota"'], ['class', '"Methanomicrobia"'], ['order', 'Methanomicrobiales'], ['family', 'Methanomicrobiaceae'], ['genus', 'Methanoplanus']]]
And I want to construct a phylogenetic tree from them. I wrote a node class like so (based partially on this code):
    class Node(object):
        """Generic n-ary tree node object
        Children are additive; no provision for deleting them."""
        def __init__(self, parent, category=None, name=None):
            self.parent = parent
            self.category = category
            self.name = name
            self.childList = []
            if parent is None:
                self.birthOrder = 0
            else:
                self.birthOrder = len(parent.childList)
                parent.childList.append(self)
        def fullPath(self):
            """Returns a list of children from root to self"""
            result = []
            parent = self.parent
            kid = self
            while parent:
                result.insert(0, kid)
                parent, kid = parent.parent, parent
            return result
        def ID(self):
            return '{0}|{1}'.format(self.category, self.name)
And then I try to construct my tree like this:
    node = None
    for match in matches:
        for branch in match:
            category, name = branch
            node = Node(node, category, name)
        print [n.ID() for n in node.fullPath()]
This works for the first match, but the second match is appended to the end of the first branch instead of starting again at the top of the tree. How can I make each match start from the root and reuse the nodes that already exist? I tried some variations on searching for the ID, but I can't get it to work.
I would highly recommend using a phylogenetics library like DendroPy.
The standard way of writing phylogenetic trees is the Newick format (parenthetical statements like ((A,B),C)). If you use DendroPy, reading that tree would be as simple as
>>> import dendropy
>>> tree1 = dendropy.Tree.get_from_string("((A,B),(C,D))", schema="newick")
or to read from a stream
>>> tree1 = dendropy.Tree(stream=open("mle.tre"), schema="newick")
The creator of the library maintains a nice tutorial too.
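If you would rather keep a hand-rolled node class, one possible approach is to restart at the root for every match and only create a child when no child with the same category and name exists yet. The following is a minimal sketch, not a drop-in fix: the `children` dict, the `child()` method, `full_path()`, `build_tree()` and the shortened `sample` data are all names made up for this illustration, and it assumes every match begins with the same root entry.

```python
class Node(object):
    """Tree node that keeps its children in a dict keyed by (category, name)."""

    def __init__(self, parent=None, category=None, name=None):
        self.parent = parent
        self.category = category
        self.name = name
        self.children = {}  # (category, name) -> Node

    def child(self, category, name):
        """Return the matching child, creating it only if it does not exist yet."""
        key = (category, name)
        if key not in self.children:
            self.children[key] = Node(self, category, name)
        return self.children[key]

    def full_path(self):
        """Return the nodes from the root down to this node, root included."""
        path = []
        node = self
        while node is not None:
            path.insert(0, node)
            node = node.parent
        return path

    def ID(self):
        return '{0}|{1}'.format(self.category, self.name)


def build_tree(matches):
    """Build one shared tree; return the root and the leaf reached by each match.

    Assumption: every match starts with the same root entry.
    """
    root = None
    leaves = []
    for match in matches:
        root_category, root_name = match[0]
        if root is None:
            root = Node(None, root_category, root_name)
        node = root  # restart at the top of the tree for every match
        for category, name in match[1:]:
            node = node.child(category, name)  # reuse or create
        leaves.append(node)
    return root, leaves


# Shortened, hypothetical sample in the same shape as `matches`.
sample = [
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', 'Firmicutes']],
    [['rootrank', 'Root'], ['domain', 'Bacteria'], ['phylum', '"Proteobacteria"']],
    [['rootrank', 'Root'], ['domain', 'Archaea'], ['phylum', '"Euryarchaeota"']],
]

root, leaves = build_tree(sample)
for leaf in leaves:
    print([n.ID() for n in leaf.full_path()])
```

Because the lookup is keyed on both category and name, the two bacterial lineages in the sample share a single `domain|Bacteria` node, while `domain|Archaea` becomes a second child of the root.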