
How to dump a boolean matrix in numpy?

I have a graph represented as a numpy boolean array (G.adj.dtype == bool). This is homework in writing my own graph library, so I can't use networkx. I want to dump it to a file so that I can fiddle with it, but for the life of me I can't work out how to make numpy dump it in a recoverable fashion.

I've tried G.adj.tofile, which wrote the graph correctly (ish) as one long line of True/False. But fromfile barfs on reading this, giving a 1x1 array, and loadtxt raises a ValueError: invalid literal for int. np.savetxt works but saves the matrix as a list of 0/1 floats, and loadtxt(..., dtype=bool) fails with the same ValueError.

Finally, I've tried networkx.from_numpy_matrix with networkx.write_dot, but that gave each edge [weight=True] in the dot source, which broke networkx.read_dot.

Katriel asked Dec 23 '10 02:12


3 Answers

To save:

numpy.savetxt('arr.txt', G.adj, fmt='%s')

To recover:

G.adj = numpy.genfromtxt('arr.txt', dtype=bool)
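As a quick check of the save/recover pair above, here is a minimal round-trip sketch. The small `adj` array and the temporary file path are stand-ins for `G.adj` and `arr.txt`, not part of the original answer.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for G.adj: a small boolean adjacency matrix.
adj = np.array([[False, True, False],
                [True, False, True],
                [False, True, False]], dtype=bool)

# Write to a throwaway file instead of 'arr.txt' in the working directory.
path = os.path.join(tempfile.mkdtemp(), "arr.txt")

# fmt='%s' writes each entry as the text "True" or "False".
np.savetxt(path, adj, fmt="%s")

# genfromtxt with dtype=bool parses "True"/"False" back into booleans.
restored = np.genfromtxt(path, dtype=bool)

# Confirm that the values, shape and dtype all survived the round trip.
round_trip_ok = (restored.dtype == adj.dtype
                 and restored.shape == adj.shape
                 and np.array_equal(restored, adj))
```

The file stays human-readable, one matrix row per line, which is handy when editing the graph by hand.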

HTH!

Hugh Bothwell answered Oct 24 '22 00:10


This is my test case:

obj = numpy.random.rand(100, 100) > 0.5

Space efficiency

numpy.savetxt('arr.txt', obj, fmt='%s') creates a 54 kB file.

numpy.savetxt('arr.txt', obj, fmt='%d') creates a much smaller file (20 kB).

cPickle.dump(obj, open('arr.dump', 'w')) creates a 40 kB file.

Time efficiency

numpy.savetxt('arr.txt', obj, fmt='%s') takes 45 ms.

numpy.savetxt('arr.txt', obj, fmt='%d') takes 10 ms.

cPickle.dump(obj, open('arr.dump', 'w')) takes 2.3 ms.

Conclusion

Use savetxt with the text format (%s) if human readability is needed, savetxt with the numeric format (%d) if file size is a concern, and cPickle if speed is a concern.
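The numeric format writes 0/1 rather than True/False, so reading it back needs one extra step. A minimal sketch, assuming it is acceptable to load the file as integers and convert afterwards (the seeded generator and the temporary path are illustrative choices, not part of the benchmark above):

```python
import os
import tempfile

import numpy as np

# Illustrative test matrix: 100x100 random booleans from a seeded generator.
rng = np.random.default_rng(0)
obj = rng.random((100, 100)) > 0.5

# Throwaway file path standing in for 'arr.txt'.
path = os.path.join(tempfile.mkdtemp(), "arr.txt")

# fmt='%d' writes each boolean as the digit 0 or 1.
np.savetxt(path, obj, fmt="%d")

# Read the digits back as integers, then convert them to booleans.
loaded = np.loadtxt(path, dtype=int).astype(bool)
```

Loading as int first sidesteps any question of how a text parser interprets the strings "0" and "1" as booleans.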

Boris Gorelik answered Oct 24 '22 00:10


The easiest way to save your array including metadata (dtype, dimensions) is to use numpy.save() and numpy.load():

import numpy

a = numpy.array([[False,  True, False],
                 [ True, False,  True],
                 [False,  True, False],
                 [ True, False,  True],
                 [False,  True, False]], dtype=bool)
numpy.save("data.npy", a)
numpy.load("data.npy")
# array([[False,  True, False],
#        [ True, False,  True],
#        [False,  True, False],
#        [ True, False,  True],
#        [False,  True, False]], dtype=bool)

a.tofile() and numpy.fromfile() would work as well, but don't save any metadata. You need to pass dtype=bool to fromfile() and will get a one-dimensional array that must be reshape()d to its original shape.
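The tofile()/fromfile() route described above might look like the following sketch. The temporary path is an illustrative choice, and the shape is taken from the in-memory array here; in practice it would have to be stored or known separately, since the file does not record it.

```python
import os
import tempfile

import numpy as np

a = np.array([[False, True, False],
              [True, False, True],
              [False, True, False],
              [True, False, True],
              [False, True, False]], dtype=bool)

# Throwaway path for the raw binary dump.
path = os.path.join(tempfile.mkdtemp(), "data.bin")

# Raw dump: one byte per boolean, no dtype or shape information.
a.tofile(path)

# Without dtype=bool the bytes would be read as float64 by default.
flat = np.fromfile(path, dtype=bool)

# fromfile returns a flat array, so restore the original dimensions.
restored = flat.reshape(a.shape)
```

Compared with numpy.save(), this trades the self-describing header for a bare byte stream, so the reader has to supply both the dtype and the shape.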

Sven Marnach answered Oct 23 '22 22:10