On my 64-bit Debian/Lenny system (4GByte RAM + 4GByte swap partition) I can successfully do:
v=array(10000*random([512,512,512]),dtype=np.int16)
f=fftn(v)
but with f being a np.complex128
the memory consumption is shocking, and I can't do much more with the result (e.g modulate the coefficients and then f=ifftn(f)
) without a MemoryError
traceback.
Rather than installing some more RAM and/or expanding my swap partitions, is there some way of controlling the scipy/numpy "default precision" and have it compute a complex64 array instead ?
I know I can just reduce it afterwards with f=array(f,dtype=np.complex64)
; I'm looking to have it actually do the FFT work in 32-bit precision and half the memory.
Less memory usage: The Python NumPy array consumes less memory than lists. Less execution time: The NumPy array is pretty fast in terms of execution, as compared to lists in Python.
Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects. In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.
NumPy is written in C and so has a faster computational speed. SciPy is written in Python and so has a slower execution speed but vast functionality.
It doesn't look like there's any function to do this in scipy's fft functions ( see http://www.astro.rug.nl/efidad/scipy.fftpack.basic.html ).
Unless you're able to find a fixed point FFT library for python, it's unlikely that the function you want exists, since your native hardware floating point format is 128 bits. It does look like you could use the rfft method to get just the real-valued components (no phase) of the FFT, and that would save half your RAM.
I ran the following in interactive python:
>>> from numpy import *
>>> v = array(10000*random.random([512,512,512]),dtype=int16)
>>> shape(v)
(512, 512, 512)
>>> type(v[0,0,0])
<type 'numpy.int16'>
At this point the RSS (Resident Set Size) of python was 265MB.
f = fft.fft(v)
And at this point the RSS of python 2.3GB.
>>> type(f)
<type 'numpy.ndarray'>
>>> type(f[0,0,0])
<type 'numpy.complex128'>
>>> v = []
And at this point the RSS goes down to 2.0GB, since I've free'd up v.
Using "fft.rfft(v)" to compute real-values only results in a 1.3GB RSS. (almost half, as expected)
Doing:
>>> f = complex64(fft.fft(v))
Is the worst of both worlds, since it first computes the complex128 version (2.3GB) and then copies that into the complex64 version (1.3GB) which means the peak RSS on my machine was 3.6GB, and then it settled down to 1.3GB again.
I think that if you've got 4GB RAM, this should all work just fine (as it does for me). What's the issue?
Scipy 0.8 will have single precision support for almost all the fft code (The code is already in the trunk, so you can install scipy from svn if you need the feature now).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With