Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to parse JSON strings into numpy arrays

I have huge json objects containing 2D lists of coordinates that I need to transform into numpy arrays for processing.

However using json.loads followed with np.array() is too slow.

Is there a way to increase the speed of creation of numpy arrays from json?

import json
import numpy as np

json_input = '{"rings" : [[[-8081441.0, 5685214.0], [-8081446.0, 5685216.0], [-8081442.0, 5685219.0], [-8081440.0, 5685211.0], [-8081441.0, 5685214.0]]]}'

dict = json.loads(json_input)
numpy_2d_arrays = [np.array(ring) for ring in dict["rings"]]

I would take any solution whatsoever!

like image 266
Below the Radar Avatar asked Dec 09 '16 21:12

Below the Radar


People also ask

Can NumPy array take strings?

The elements of a NumPy array, or simply an array, are usually numbers, but can also be boolians, strings, or other objects.

Is a NumPy Ndarray is faster than a built in list?

Because the Numpy array is densely packed in memory due to its homogeneous type, it also frees the memory faster. So overall a task executed in Numpy is around 5 to 100 times faster than the standard python list, which is a significant leap in terms of speed.

Which is faster NumPy or list?

NumPy Arrays are faster than Python Lists because of the following reasons: An array is a collection of homogeneous data-types that are stored in contiguous memory locations. On the other hand, a list in Python is a collection of heterogeneous data types stored in non-contiguous memory locations.


1 Answers

The simplest answer would just be:

numpy_2d_arrays = np.array(dict["rings"])

As this avoids explicitly looping over your array in python you would probably see a modest speedup. If you have control over the creation of json_input it would be better to write out as a serial array. A version is here.

like image 57
Daniel Avatar answered Sep 23 '22 14:09

Daniel