
Creating and graphing Hierarchical Trees in Python with pandas

So I have hierarchical information stored within a pandas DataFrame and I would like to construct and visualize a hierarchical tree based on this information.

For example, my DataFrame has the column headings ['Phylum','Class','Order','Family','Genus','Species','Subspecies']

and I want to build a tree from the rows, where all 'Subspecies' values are unique strings and should be the leaves of the tree. Can someone point me to the best method or package for doing this? Ideally the output will be a matplotlib object. Thank you in advance!

Wes Field asked Oct 21 '22 00:10


1 Answer

You can easily get them into a hierarchical index with groupby:

taxons = ['Phylum','Class','Order','Family','Genus','Species','Subspecies']
hierarchical_df = my_dataframe.groupby(taxons).sum()  # sum, or whichever aggregation is most appropriate for your data
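
If you need an actual tree rather than an index, here is a minimal sketch that nests each row into a dict of dicts. It is plain Python, not a pandas feature: build_tree and render are hypothetical helper names, and the rows are made-up sample data standing in for the real DataFrame.

```python
def build_tree(rows):
    """Nest each row (ordered from broadest to narrowest rank) into a dict of dicts."""
    tree = {}
    for row in rows:
        node = tree
        for name in row:
            # Reuse the child if an earlier row already created it.
            node = node.setdefault(name, {})
    return tree


def render(tree, indent=0):
    """Return the tree as a list of indented text lines, one per node."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * indent + str(name))
        lines.extend(render(children, indent + 1))
    return lines


# Illustrative rows only, in the column order given in the question.
rows = [
    ("Chordata", "Mammalia", "Carnivora", "Canidae", "Canis", "lupus", "familiaris"),
    ("Chordata", "Mammalia", "Carnivora", "Canidae", "Canis", "lupus", "dingo"),
    ("Chordata", "Mammalia", "Carnivora", "Felidae", "Felis", "silvestris", "catus"),
]
tree = build_tree(rows)
print("\n".join(render(tree)))
```

With a real DataFrame, the rows could come from my_dataframe[taxons].itertuples(index=False, name=None), which yields one plain tuple per row. Leaves end up as keys that map to an empty dict.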

Starting from that hierarchical index, I'm also trying to make a plot that shows the hierarchy in a meaningful way (see "Hierarchic pie/donut chart from Pandas DataFrame using bokeh or matplotlib?").
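
As one possible way to get a matplotlib object, here is a rough sketch that lays out a nested dict (each name mapping to a dict of its children) as a left-to-right tree. The layout rule is my own assumption, not an established package: leaves get consecutive y positions and each parent sits at the mean y of its children. layout and draw are hypothetical helper names and the sample tree is made up.

```python
def layout(tree, path=(), pos=None, next_leaf=None):
    """Map each node (keyed by its full path) to (x, y): x is the depth,
    leaves get consecutive y values, parents get the mean y of their children."""
    if pos is None:
        pos, next_leaf = {}, [0]
    for name, children in tree.items():
        node = path + (name,)
        if children:
            layout(children, node, pos, next_leaf)
            y = sum(pos[node + (child,)][1] for child in children) / len(children)
        else:
            y = next_leaf[0]
            next_leaf[0] += 1
        pos[node] = (len(path), y)
    return pos


def draw(tree, ax=None):
    """Draw the tree on a matplotlib Axes and return that Axes."""
    # Imported here so that layout() can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    pos = layout(tree)
    if ax is None:
        _, ax = plt.subplots()
    for node, (x, y) in pos.items():
        if len(node) > 1:
            # Connect the node to its parent, whose key is the path minus the last name.
            px, py = pos[node[:-1]]
            ax.plot([px, x], [py, y], color="gray")
        ax.text(x, y, node[-1], ha="center", va="center",
                bbox=dict(facecolor="white", edgecolor="none"))
    ax.set_axis_off()
    return ax


# Illustrative tree only: Family -> Genus -> Species.
tree = {"Canidae": {"Canis": {"lupus": {}, "latrans": {}}, "Vulpes": {"vulpes": {}}}}
pos = layout(tree)
```

Calling ax = draw(tree) and then ax.figure.savefig("tree.png") should write the figure to disk. Nodes are keyed by their full path so that the same name under two different parents does not collide. For large trees the labels will overlap, so the spacing would need tuning.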

Adrià Cereto i Massagué answered Oct 27 '22 09:10