I was trying to make another point and accidentally saved a dict using np.save(). To my surprise, there seems to be no problem at all with that approach. I tried the same with another object that is not an np.array, such as a list, and it seems to work fine.
For example, the following code saves and loads an object using np.save() and np.load():
import numpy as np
list_file = 'random_list.npy'
random_list = [x*2 for x in range(20)]
np.save(list_file, random_list)
# load numpy array
random_list2 = np.load(list_file)
set(random_list) == set(random_list2)
True
So, my question is:
I know there are some limitations regarding pickle that could affect the kinds of objects that can be handled, but a lot of points are still unclear.
Edit:
I thought that np.save() was just trying to convert the object passed as a parameter to a numpy array, but that does not make sense in some cases, such as a dict.
For example, the array produced by passing a dict to np.array() does not seem to be functional at all:
a = {1: 0, 2: 1, 3: 2}
b = np.array(a)
type(b)
numpy.ndarray
b.shape
()
numpy.save() documents its argument as "array-like".
As per the question "numpy: formal definition of "array_like" objects?", the underlying numpy/core/src/multiarray/ctors.c:PyArray_FromAny() accepts:
/* op is an array */
/* op is a NumPy scalar */
/* op is a Python scalar */
/* op supports the PEP 3118 buffer interface */
/* op supports the __array_struct__ or __array_interface__ interface */
/* op supplies the __array__ function. */
/* Try to treat op as a list of lists */
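Most of these categories can be checked from Python with np.asarray(). The following is only a rough sketch: the HasArrayMethod class and the examples dict are hypothetical names made up for illustration, and the __array_struct__ / __array_interface__ case is left out.

```python
import array
import numpy as np


class HasArrayMethod:
    """Hypothetical class that supplies the __array__ method."""

    def __array__(self, dtype=None, copy=None):
        return np.arange(3, dtype=dtype)


# One sample object per category accepted by PyArray_FromAny.
examples = {
    "ndarray": np.arange(3),
    "numpy scalar": np.float64(1.5),
    "python scalar": 7,
    "buffer (PEP 3118)": array.array("i", [1, 2, 3]),
    "__array__ method": HasArrayMethod(),
    "list of lists": [[1, 2], [3, 4]],
}

# Convert each object and report the shape and dtype NumPy chose.
results = {name: np.asarray(obj) for name, obj in examples.items()}
for name, arr in results.items():
    print(f"{name:20s} shape={arr.shape} dtype={arr.dtype}")
```

Scalars come back as 0-d arrays, while the buffer, the __array__ object and the nested list come back with real dimensions.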
Specifically for dict, the execution path goes like this:
numpy/lib/npyio.py ->
numpy/core/numeric.py:asanyarray() ->
numpy/core/src/multiarray/multiarraymodule.c:_array_fromobject() ->
numpy/core/src/multiarray/ctors.c:PyArray_CheckFromAny() ->
the aforementioned PyArray_FromAny. There:
<...>
PyArray_GetArrayParamsFromObject(op, newtype,
0, &dtype,
&ndim, dims, &arr, context)
<...>
else {
if (newtype == NULL) {
newtype = dtype; /* object dtype */
<...>
ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, newtype,
ndim, dims,
NULL, NULL,
flags&NPY_ARRAY_F_CONTIGUOUS, NULL);
return (PyObject *)ret;
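The effect of that fallback can be observed from Python. This is a small sketch that calls np.asanyarray() directly, the same conversion named in the path above; the variable names are arbitrary.

```python
import numpy as np

a = {1: 0, 2: 1, 3: 2}

# A dict matches none of the array-like cases, so NumPy falls back to
# wrapping it as a single element of a 0-d array with object dtype.
b = np.asanyarray(a)

print(b.dtype)  # object
print(b.ndim)   # 0

# The original dict is stored by reference and can be taken back out.
recovered = b.item()
print(recovered is a)  # True
```

So the dict is not converted into numeric data at all; the array is just a container holding one Python object.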
A demo:
In [507]: np.savez('test', a = [x*2 for x in range(3)], b=dict(a=1,b=np.arange(3)))
In [510]: d = np.load('test.npz')
In [511]: d['a']
Out[511]: array([0, 2, 4])
This list was converted to an array and saved.
In [512]: d['b']
Out[512]: array({'a': 1, 'b': array([0, 1, 2])}, dtype=object)
In [513]: d['b'].shape
Out[513]: ()
In [514]: d['b'].item() # or d['b'][()]
Out[514]: {'a': 1, 'b': array([0, 1, 2])}
The dictionary was wrapped in a 0d object dtype array, and saved with pickle. The array within the dictionary was pickled with save.
np.save uses pickle where needed to handle non-array objects, and pickle uses save to handle array objects.
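One caveat when reproducing this on a recent NumPy: since release 1.16.3, np.load() defaults to allow_pickle=False, so loading an object array has to opt in explicitly. A minimal sketch of the round trip, assuming such a release; the file name dict_demo.npy and the variable names are made up for illustration.

```python
import os
import tempfile

import numpy as np

data = {"a": 1, "b": np.arange(3)}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dict_demo.npy")

    # np.save still pickles object arrays by default (allow_pickle=True).
    np.save(path, data)

    # With default arguments, recent NumPy refuses to unpickle on load.
    try:
        np.load(path)
        default_load_failed = False
    except ValueError:
        default_load_failed = True

    # Opting in returns the 0-d object array; .item() unwraps the dict.
    loaded = np.load(path, allow_pickle=True).item()

print(default_load_failed)
print(loaded["a"], loaded["b"])
```

Unpickling can run arbitrary code, so allow_pickle=True is only appropriate for files from a trusted source.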