I tried various methods of compressing some numpy arrays when saving them to disk.
These 1D arrays contain data sampled at a certain sampling rate (sound recorded with a microphone, or any other measurement from any sensor): the data is essentially continuous (in the mathematical sense; of course, after sampling it is discrete data).
I tried with HDF5 (h5py):
f.create_dataset("myarray1", data=myarray, compression="gzip", compression_opts=9)
but this is quite slow, and the compression ratio is not the best we can expect.
I also tried numpy.savez_compressed(), but once again it may not be the best compression algorithm for the kind of data described above.
What would you choose for a better compression ratio on a numpy array with such data?
(I thought about something like lossless FLAC (initially designed for audio), but is there an easy way to apply such an algorithm to numpy data?)
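To make the FLAC idea a bit more concrete: FLAC predicts each sample from the previous ones and only stores the (small) prediction error. The sketch below is not FLAC, just a much cruder stand-in for the same idea, assuming the samples are integers (for example int16 from an ADC): it stores the difference between neighbouring samples and hands the result to the standard-library lzma compressor. The function names compress_delta and decompress_delta are made up for this illustration.

```python
import lzma

import numpy as np


def compress_delta(samples: np.ndarray) -> bytes:
    """Delta-encode a 1D integer array, then compress the residuals with LZMA."""
    if samples.size == 0:
        return lzma.compress(b"")
    # Keep the first sample, then store differences between neighbours.
    # For a smooth signal these differences are small numbers.
    # Integer overflow wraps around, and cumsum with the same dtype undoes it exactly.
    residuals = np.empty_like(samples)
    residuals[0] = samples[0]
    residuals[1:] = samples[1:] - samples[:-1]
    return lzma.compress(residuals.tobytes())


def decompress_delta(blob: bytes, dtype) -> np.ndarray:
    """Inverse of compress_delta; the caller must remember the dtype."""
    residuals = np.frombuffer(lzma.decompress(blob), dtype=dtype)
    # A running sum of the differences rebuilds the original samples.
    return np.cumsum(residuals, dtype=dtype)


# Hypothetical test signal: one second of a 440 Hz tone sampled at 48 kHz.
t = np.arange(48000)
signal = (10000 * np.sin(2 * np.pi * 440 * t / 48000)).astype(np.int16)

blob = compress_delta(signal)
restored = decompress_delta(blob, np.int16)
```

Whether this beats plain gzip or lzma on the raw bytes depends on the data, so it is worth comparing len(blob) against the alternatives on a real recording. It would not be lossless for float arrays, since float differences do not sum back exactly.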
If you're running into memory issues because your NumPy arrays are too large, one of the basic approaches to reducing memory usage is compression. By changing how you represent your data, you can reduce memory usage and shrink your array's footprint, often without changing the bulk of your code.
numpy.savez_compressed() saves several arrays into a single file in compressed .npz format. Provide arrays as keyword arguments to store them under the corresponding name in the output file: savez_compressed(fn, x=x, y=y). If arrays are specified as positional arguments, i.e., savez_compressed(fn, x, y), their names will be arr_0, arr_1, etc.
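A small sketch of the naming behaviour described above, using numpy.savez_compressed() and numpy.load(). The arrays x and y and the file names are placeholders chosen for this example, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

# Placeholder arrays for the illustration.
x = np.arange(10, dtype=np.int16)
y = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    # Keyword arguments: arrays are stored under the given names.
    named_path = os.path.join(tmp, "named.npz")
    np.savez_compressed(named_path, x=x, y=y)
    with np.load(named_path) as data:
        named_keys = sorted(data.files)
        x_loaded = data["x"]

    # Positional arguments: arrays get the automatic names arr_0, arr_1, ...
    positional_path = os.path.join(tmp, "positional.npz")
    np.savez_compressed(positional_path, x, y)
    with np.load(positional_path) as data:
        positional_keys = sorted(data.files)

print(named_keys)
print(positional_keys)
```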
What I do now:
import gzip
import numpy
f = gzip.GzipFile("my_array.npy.gz", "w")
numpy.save(file=f, arr=my_array)
f.close()
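For completeness, here is the same gzip approach as a round trip, written with context managers so the file is closed even if saving fails. This is only a sketch: my_array is a placeholder array, and the file is written to a temporary directory instead of the working directory.

```python
import gzip
import os
import tempfile

import numpy as np

# Placeholder data standing in for the real samples.
my_array = np.arange(1000, dtype=np.int16)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_array.npy.gz")

    # Write the .npy stream through gzip.
    with gzip.open(path, "wb") as f:
        np.save(f, my_array)

    # Read it back the same way.
    with gzip.open(path, "rb") as f:
        restored = np.load(f)

    compressed_size = os.path.getsize(path)
```

Comparing compressed_size with my_array.nbytes gives a quick feel for the ratio gzip achieves on a given array.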