I am using SciPy's KDTree implementation on a large (300 MB) file of points. Is there a way I can just save the data structure to disk and load it again, or am I stuck reading the raw points from the file and constructing the data structure each time I start my program? I am constructing the KDTree as follows:
def buildKDTree(self):
    self.kdpoints = numpy.fromfile("All", sep=' ')
    self.kdpoints.shape = self.kdpoints.size // self.NDIM, self.NDIM
    self.kdtree = KDTree(self.kdpoints, leafsize=self.kdpoints.shape[0] + 1)
    print "Preparing KDTree... Ready!"
Any suggestions please?
KDTree uses nested classes to define its node types (innernode, leafnode). Pickle only works on module-level class definitions, so a nested class trips it up:
import cPickle

class Foo(object):
    class Bar(object):
        pass

obj = Foo.Bar()
print obj.__class__
cPickle.dumps(obj)

Output:

<class '__main__.Bar'>
cPickle.PicklingError: Can't pickle <class '__main__.Bar'>: attribute lookup __main__.Bar failed
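As an aside (not part of the original answer): on Python 3.4+, pickle protocol 4 records a class's __qualname__, so a nested class like this pickles without any workaround. A minimal sketch:

```python
import pickle

class Foo(object):
    class Bar(object):
        pass

obj = Foo.Bar()
# Protocol 4 (the default since Python 3.8) looks classes up by
# __qualname__, so the nested definition Foo.Bar is found.
data = pickle.dumps(obj, protocol=4)
restored = pickle.loads(data)
print(restored.__class__)  # <class '__main__.Foo.Bar'>
```

This only helps if you can move to Python 3; under Python 2's cPickle the monkey-patch below is still needed.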
However, there is a (hacky) workaround: monkey-patch the nested class definitions into the scipy.spatial.kdtree module at module scope so the pickler can find them. As long as all of the code that reads and writes pickled KDTree objects installs these patches first, this hack works fine:
import cPickle
import numpy
from scipy.spatial import kdtree
# patch module-level attribute to enable pickle to work
kdtree.node = kdtree.KDTree.node
kdtree.leafnode = kdtree.KDTree.leafnode
kdtree.innernode = kdtree.KDTree.innernode
x, y = numpy.mgrid[0:5, 2:8]
t1 = kdtree.KDTree(zip(x.ravel(), y.ravel()))
r1 = t1.query([3.4, 4.1])
raw = cPickle.dumps(t1)
# unpickle the tree (the patches above must already be installed here too)
t2 = cPickle.loads(raw)
r2 = t2.query([3.4, 4.1])
print t1.tree.__class__
print repr(raw)[:70]
print t1.data[r1[1]], t2.data[r2[1]]
Output:
<class 'scipy.spatial.kdtree.innernode'>
"ccopy_reg\n_reconstructor\np1\n(cscipy.spatial.kdtree\nKDTree\np2\nc_
[3 4] [3 4]
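If you would rather avoid the monkey-patch entirely, a simpler alternative (my sketch, not from the answer above) is to persist only the points array in NumPy's binary .npy format and rebuild the tree at startup; np.load is far faster than re-parsing a 300 MB text file with fromfile(..., sep=' '), and scipy's cKDTree constructs trees quickly:

```python
import numpy as np
from scipy.spatial import cKDTree  # C implementation; builds much faster than KDTree

points = np.random.rand(1000, 3)  # stand-in for your parsed point data

# One-time conversion: store the points in binary form.
np.save("points.npy", points)

# Every startup: binary load + tree construction, no text parsing, no pickling.
loaded = np.load("points.npy")
tree = cKDTree(loaded)

dist, idx = tree.query([0.5, 0.5, 0.5])
```

Whether this beats pickling the whole tree depends on your data, but for many point sets the tree build is cheap next to the text parse you are doing now.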