Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

matrix operations with gigantic matrices in python [duplicate]

Does anybody know how to work with gigantic matrices in python? I have to work with adjacency matrices of shape (10^6,10^6) and perform operations including addition, scaling and dot product. Using numpy arrays I had problems with the ram.

like image 901
AlwaysLookBright Avatar asked Mar 25 '13 23:03

AlwaysLookBright


1 Answers

How about something like this...

import numpy as np

# Create large arrays x and y.
# Note they are 1e4 not 1e6 b/c of memory issues creating random numpy matrices (CookieOfFortune) 
# However, the same principles apply to larger arrays
x = np.random.randn(10000, 10000)
y = np.random.randn(10000, 10000)

# Create memory maps for x and y arrays
xmap = np.memmap('xfile.dat', dtype='float32', mode='w+', shape=x.shape)
ymap = np.memmap('yfile.dat', dtype='float32', mode='w+', shape=y.shape)

# Fill memory maps with data
xmap[:] = x[:]
ymap[:] = y[:]

# Create memory map for out of core dot product result
prodmap = np.memmap('prodfile.dat', dtype='float32', mode='w+', shape=x.shape)

# Due out of core dot product and write data
prodmap[:] = np.memmap.dot(xmap, ymap)

# Create memory map for out of core addition result
addmap = np.memmap('addfile.dat', dtype='float32', mode='w+', shape=x.shape)

# Due out of core addition and write data
addmap[:] = xmap + ymap

# Create memory map for out of core scaling result
scalemap = np.memmap('scalefile.dat', dtype='float32', mode='w+', shape=x.shape)

# Define scaling constant
scale = 1.3

# Do out of core  scaling and write data
scalemap[:] = scale * xmap

This code will create files xfile.dat, yfile.dat, ect that contain the arrays in binary format. To access them later you simply need to do np.memmap(filename). Other arguments to np.memmap are optional, but reccomended (arguments like dtype, shape, ect.).

like image 93
spencerlyon2 Avatar answered Sep 23 '22 23:09

spencerlyon2