I'm working with 3D pointcloud of Lidar. The points are given by numpy array that looks like this:
points = np.array([[61651921, 416326074, 39805], [61605255, 416360555, 41124], [61664810, 416313743, 39900], [61664837, 416313749, 39910], [61674456, 416316663, 39503], [61651933, 416326074, 39802], [61679969, 416318049, 39500], [61674494, 416316677, 39508], [61651908, 416326079, 39800], [61651908, 416326087, 39802], [61664845, 416313738, 39913], [61674480, 416316668, 39503], [61679996, 416318047, 39510], [61605290, 416360572, 41118], [61605270, 416360565, 41122], [61683939, 416313004, 41052], [61683936, 416313033, 41060], [61679976, 416318044, 39509], [61605279, 416360555, 41109], [61664837, 416313739, 39915], [61674487, 416316666, 39505], [61679961, 416318035, 39503], [61683943, 416313004, 41054], [61683930, 416313042, 41059]])
I'd like to keep my data grouped into cubes of size 50*50*50
so that every cube preserves some hashable index and numpy indices of my points
it contains. In order to get splitting, I assign cubes = points \\ 50
which outputs to:
cubes = np.array([[1233038, 8326521, 796], [1232105, 8327211, 822], [1233296, 8326274, 798], [1233296, 8326274, 798], [1233489, 8326333, 790], [1233038, 8326521, 796], [1233599, 8326360, 790], [1233489, 8326333, 790], [1233038, 8326521, 796], [1233038, 8326521, 796], [1233296, 8326274, 798], [1233489, 8326333, 790], [1233599, 8326360, 790], [1232105, 8327211, 822], [1232105, 8327211, 822], [1233678, 8326260, 821], [1233678, 8326260, 821], [1233599, 8326360, 790], [1232105, 8327211, 822], [1233296, 8326274, 798], [1233489, 8326333, 790], [1233599, 8326360, 790], [1233678, 8326260, 821], [1233678, 8326260, 821]])
{(1232105, 8327211, 822): [1, 13, 14, 18]),
(1233038, 8326521, 796): [0, 5, 8, 9],
(1233296, 8326274, 798): [2, 3, 10, 19],
(1233489, 8326333, 790): [4, 7, 11, 20],
(1233599, 8326360, 790): [6, 12, 17, 21],
(1233678, 8326260, 821): [15, 16, 22, 23]}
My real pointcloud contains up to few hundreds of millions of 3D points. What is the fastest way to do this kind of grouping?
I've tried a majority of various solutions. Here is comparison of time compsumption assuming size of points is arround 20 millions and size of distinct cubes is arround 1 million:
import pandas as pd
print(pd.DataFrame(cubes).groupby([0,1,2]).indices)
#takes 9sec
#thanks @abc:
result = defaultdict(list)
for idx, elem in enumerate(cubes):
result[elem.tobytes()].append(idx) # takes 20.5sec
# result[elem[0], elem[1], elem[2]].append(idx) #takes 27sec
# result[tuple(elem)].append(idx) # takes 50sec
# thanks @Eelco Hoogendoorn for his library
values = npi.group_by(cubes).split(np.arange(len(cubes)))
result = dict(enumerate(values))
# takes 9.8sec
# thanks @Divakar for showing numexpr library:
import numexpr as ne
def dimensionality_reduction(cubes):
#cubes = cubes - np.min(cubes, axis=0) #in case some coords are negative
cubes = cubes.astype(np.int64)
s0, s1 = cubes[:,0].max()+1, cubes[:,1].max()+1
d = {'s0':s0,'s1':s1,'c0':cubes[:,0],'c1':cubes[:,1],'c2':cubes[:,2]}
c1D = ne.evaluate('c0+c1*s0+c2*s0*s1',d)
return c1D
cubes = dimensionality_reduction(cubes)
result = pd.DataFrame(cubes).groupby([0]).indices
# takes 2.5 seconds
It's possible to download cubes.npz
file here and use a command
cubes = np.load('cubes.npz')['array']
to check performance time.
It's faster to append list first and convert to array than appending NumPy arrays.
Indexing in NumPy is a reasonably fast operation.
Numpy is fast with operating on vectors (matrices), when performed on the whole structure at once. Such single element-by-element operations are slow.
We can perform dimensionality-reduction
to reduce cubes
to a 1D array. This is based on a mapping of the given cubes data onto a n-dim grid to compute the linear-index equivalents, discussed in detail here
. Then, based on the uniqueness of those linear indices, we can segregate unique groups and their corresponding indices. Hence, following those strategies, we would have one solution, like so -
N = 4 # number of indices per group
c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
sidx = c1D.argsort()
indices = sidx.reshape(-1,N)
unq_groups = cubes[indices[:,0]]
# If you need in a zipped dictionary format
out = dict(zip(map(tuple,unq_groups), indices))
Alternative #1 : If the integer values in cubes
are too large, we might want to do the dimensionality-reduction
such that the dimensions with shorter extent are choosen as the primary axes. Hence, for those cases, we can modify the reduction step to get c1D
, like so -
s1,s2 = cubes[:,:2].max(0)+1
s = np.r_[s2,1,s1*s2]
c1D = cubes.dot(s)
Next up, we can use Cython-powered kd-tree
for quick nearest-neighbor lookup to get nearest neighbouring indices and hence solve our case like so -
from scipy.spatial import cKDTree
idx = cKDTree(cubes).query(cubes, k=N)[1] # N = 4 as discussed earlier
I = idx[:,0].argsort().reshape(-1,N)[:,0]
unq_groups,indices = cubes[I],idx[I]
We will extend the argsort based method with some splitting to get our desired output, like so -
c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
sidx = c1D.argsort()
c1Ds = c1D[sidx]
split_idx = np.flatnonzero(np.r_[True,c1Ds[:-1]!=c1Ds[1:],True])
grps = cubes[sidx[split_idx[:-1]]]
indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
# If needed as dict o/p
out = dict(zip(map(tuple,grps), indices))
Using 1D versions of groups of cubes
as keys
We will extend earlier listed method with the groups of cubes
as keys to simplify the process of dictionary creating and also make it efficient with it, like so -
def numpy1(cubes):
c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
sidx = c1D.argsort()
c1Ds = c1D[sidx]
mask = np.r_[True,c1Ds[:-1]!=c1Ds[1:],True]
split_idx = np.flatnonzero(mask)
indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
out = dict(zip(c1Ds[mask[:-1]],indices))
return out
Next up, we will make use of numba
package to iterate and get to the final hashable dictionary output. Going with it, there would be two solutions - One that gets the keys and values separately using numba
and the main calling will zip and convert to dict, while the other one will create a numba-supported
dict type and hence no extra work required by the main calling function.
Thus, we would have first numba
solution :
from numba import njit
@njit
def _numba1(sidx, c1D):
out = []
n = len(sidx)
start = 0
grpID = []
for i in range(1,n):
if c1D[sidx[i]]!=c1D[sidx[i-1]]:
out.append(sidx[start:i])
grpID.append(c1D[sidx[start]])
start = i
out.append(sidx[start:])
grpID.append(c1D[sidx[start]])
return grpID,out
def numba1(cubes):
c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
sidx = c1D.argsort()
out = dict(zip(*_numba1(sidx, c1D)))
return out
And second numba
solution as :
from numba import types
from numba.typed import Dict
int_array = types.int64[:]
@njit
def _numba2(sidx, c1D):
n = len(sidx)
start = 0
outt = Dict.empty(
key_type=types.int64,
value_type=int_array,
)
for i in range(1,n):
if c1D[sidx[i]]!=c1D[sidx[i-1]]:
outt[c1D[sidx[start]]] = sidx[start:i]
start = i
outt[c1D[sidx[start]]] = sidx[start:]
return outt
def numba2(cubes):
c1D = np.ravel_multi_index(cubes.T, cubes.max(0)+1)
sidx = c1D.argsort()
out = _numba2(sidx, c1D)
return out
Timings with cubes.npz
data -
In [4]: cubes = np.load('cubes.npz')['array']
In [5]: %timeit numpy1(cubes)
...: %timeit numba1(cubes)
...: %timeit numba2(cubes)
2.38 s ± 14.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.13 s ± 25.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.8 s ± 5.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Alternative #1 : We can achieve further speedup with numexpr
for large arrays to compute c1D
, like so -
import numexpr as ne
s0,s1 = cubes[:,0].max()+1,cubes[:,1].max()+1
d = {'s0':s0,'s1':s1,'c0':cubes[:,0],'c1':cubes[:,1],'c2':cubes[:,2]}
c1D = ne.evaluate('c0+c1*s0+c2*s0*s1',d)
This would be applicable at all places that require c1D
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With