I have huge JSON objects containing 2D lists of coordinates that I need to transform into NumPy arrays for processing. However, using json.loads followed by np.array() is too slow. Is there a way to speed up the creation of NumPy arrays from JSON?
import json
import numpy as np

json_input = '{"rings" : [[[-8081441.0, 5685214.0], [-8081446.0, 5685216.0], [-8081442.0, 5685219.0], [-8081440.0, 5685211.0], [-8081441.0, 5685214.0]]]}'
data = json.loads(json_input)  # renamed from "dict" to avoid shadowing the built-in
numpy_2d_arrays = [np.array(ring) for ring in data["rings"]]
I would take any solution whatsoever!
The elements of a NumPy array are usually numbers, but can also be booleans, strings, or other objects.
Because a NumPy array is densely packed in memory due to its homogeneous type, it can also be allocated and freed faster. Overall, a task executed with NumPy is often around 5 to 100 times faster than the same task over a standard Python list, which is a significant leap in speed.
NumPy arrays are faster than Python lists for the following reason: an array is a collection of homogeneous data types stored in contiguous memory, while a Python list is a collection of heterogeneous objects that are boxed and scattered across non-contiguous memory locations.
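As a rough illustration of that gap (the exact numbers are machine-dependent; this only assumes NumPy is installed):

import numpy as np
from timeit import timeit

values = list(range(1_000_000))
arr = np.arange(1_000_000)

# sum() loops in Python over boxed ints; arr.sum() runs a single
# C loop over one contiguous buffer.
print(timeit(lambda: sum(values), number=10))
print(timeit(lambda: arr.sum(), number=10))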
The simplest answer would just be:
numpy_2d_arrays = np.array(data["rings"])
As this avoids explicitly looping over your array in Python, you would probably see a modest speedup. If you have control over the creation of json_input, it would be better to write each ring out as a serial (flat) array; a sketch of that approach follows.
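A minimal sketch of the flat layout. The [x0, y0, x1, y1, ...] format here is an assumption about how the producer could serialize the data, not part of the original input:

import json
import numpy as np

# Assumed flat layout: each ring is one list of alternating x/y values
json_input = '{"rings": [[-8081441.0, 5685214.0, -8081446.0, 5685216.0, -8081442.0, 5685219.0, -8081440.0, 5685211.0, -8081441.0, 5685214.0]]}'
data = json.loads(json_input)

# One array construction plus a cheap reshape per ring, instead of
# building nested Python lists of coordinate pairs first
numpy_2d_arrays = [np.asarray(ring).reshape(-1, 2) for ring in data["rings"]]
print(numpy_2d_arrays[0])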