
Plotting dendrogram in Scipy error for large dataset

I am using Scipy for hierarchical clustering. I can get flat clusters at a threshold using fcluster, but I also need to visualize the resulting dendrogram. The dendrogram method works fine for 5-6k user vectors, but my dataset consists of 16k user vectors. When I run it for 16k users, the dendrogram function throws the following error:

File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2333, in _dendrogram_calculate_info
leaf_label_func, i, labels)
File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2205, in _append_singleton_leaf_node
ivl.append(str(int(i)))
RuntimeError: maximum recursion depth exceeded while getting the str of an object

Any ideas on visualizing a dendrogram for a larger dataset?

Maxwell asked Apr 18 '12 06:04


2 Answers

This may be a bit late, but if you are comfortable raising Python's recursion limit to get around the "maximum recursion depth exceeded" error, you can do so. It's not recommended, and definitely not 'pythonic', but it will likely get you the results you want.

import sys
sys.setrecursionlimit(10000)
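If changing the limit for the whole process feels too blunt, one option is to raise it only around the call that needs it. The following is a minimal sketch using only the standard library; `recursion_limit` and `depth` are hypothetical names introduced here for illustration, and `depth` merely stands in for a deeply recursive call such as the dendrogram function.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit (hypothetical helper)."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        # Restore the original limit even if the body raised.
        sys.setrecursionlimit(previous)


def depth(n):
    # Stand-in for a deeply recursive call; not part of scipy.
    return 0 if n == 0 else 1 + depth(n - 1)


with recursion_limit(20000):
    result = depth(3000)  # deeper than the usual default limit of 1000
```

Keep in mind that a very high limit only removes Python's own safety check; if the recursion really goes that deep, the interpreter can still crash with a stack overflow.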
VedTopkar answered Sep 21 '22 17:09


Using sys.setrecursionlimit(1000000), I was able to process a large matrix and get a seaborn.clustermap call to return successfully. I imagine this error could possibly also be resolved by upgrading scipy, or by supplying additional arguments and building the clustermap more deliberately with scipy.
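As one example of "additional arguments", scipy's dendrogram accepts truncate_mode and p, which collapse the tree to its last p merged clusters. This is a sketch under assumptions: the random matrix is a placeholder for the real user vectors, the linkage method is chosen arbitrarily, and whether truncation alone avoids the recursion error at 16k rows has not been verified here.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Placeholder data standing in for the user vectors (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Build the linkage matrix; "average" is an arbitrary choice here.
Z = linkage(X, method="average")

# Show only the last p merged clusters instead of every leaf.
# no_plot=True returns the layout data without drawing anything.
info = dendrogram(Z, truncate_mode="lastp", p=12, no_plot=True)
n_leaves = len(info["ivl"])  # number of leaf labels in the truncated tree
```

Dropping no_plot=True draws the truncated tree with matplotlib, which is also far more readable than thousands of individual leaves.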

user12288120 answered Sep 18 '22 17:09