I am using numpy.loadtxt to generate a structured NumPy array from a CSV data file, and I would like to save it to a MAT file for colleagues who are more familiar with MATLAB than Python.
Sample case:
import numpy as np
import scipy.io
mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])
scipy.io.savemat('test.mat', mydata)
When I attempt to use scipy.io.savemat on this array, the following error is thrown:
Traceback (most recent call last):
  File "C:/Project Data/General Python/test.py", line 6, in <module>
    scipy.io.savemat('test.mat', mydata)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio.py", line 210, in savemat
    MW.put_variables(mdict)
  File "C:\python35\lib\site-packages\scipy\io\matlab\mio5.py", line 831, in put_variables
    for name, var in mdict.items():
AttributeError: 'numpy.ndarray' object has no attribute 'items'
I'm a Python novice (at best), but I'm assuming this is because savemat is set up to handle dicts and the structure of NumPy's structured arrays is not compatible.
I can get around this error by pulling my data into a dict:
tmp = {}
for varname in mydata.dtype.names:
    tmp[varname] = mydata[varname]
scipy.io.savemat('test.mat', tmp)
Which loads into MATLAB fine:
>> mydata = load('test.mat')
mydata =
foo: [1 2]
bar: [1 2]
But this seems like a very inefficient method since I'm duplicating the data in memory. Is there a smarter way to accomplish this?
You can do scipy.io.savemat('test.mat', {'mydata': mydata}). This creates a struct mydata with fields foo and bar in the file.
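As a rough check that does not need MATLAB, the file can be read back with scipy.io.loadmat. This is only a sketch: the temporary directory and the field_names variable are my own additions for illustration, and how MATLAB displays the struct may differ from what loadmat returns.

```python
import os
import tempfile

import numpy as np
import scipy.io

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])

# Write to a throwaway directory so the example leaves no file behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.mat')
    # The dict key becomes the variable name inside the MAT file.
    scipy.io.savemat(path, {'mydata': mydata})
    loaded = scipy.io.loadmat(path)

# The loaded variable is itself a structured array with the original field names.
field_names = loaded['mydata'].dtype.names
print(field_names)
```

The dict only tells savemat what to call the variable; the structured array itself is passed through unchanged.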
Alternatively, you can pack your loop in a dict comprehension:
tmp = {varname: mydata[varname] for varname in mydata.dtype.names}
I don't think creating a temporary dictionary duplicates the data in memory, because Python generally stores only references, and NumPy in particular tries to create views into the original data whenever possible.
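If you want to check the view behaviour yourself, something like the following should work. It is a small sketch using np.shares_memory; the shares variable and the value 99 are just for illustration.

```python
import numpy as np

mydata = np.array([(1, 1.0), (2, 2.0)], dtype=[('foo', 'i'), ('bar', 'f')])
tmp = {varname: mydata[varname] for varname in mydata.dtype.names}

# Each dict value should report shared memory with the structured array.
shares = {name: np.shares_memory(col, mydata) for name, col in tmp.items()}
print(shares)

# Writing through the dict entry changes the original array, which is
# what you would expect from a view rather than a copy.
tmp['foo'][0] = 99
print(mydata['foo'][0])
```

This only shows that building the dict does not copy the data; it says nothing about what savemat does internally when it writes the file.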