Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to do scatter/gather operations in numpy

Tags:

python

numpy

lets say i have arrays:

a = array((1,2,3,4,5))
indices = array((1,1,1,1))

and i perform operation:

a[indices] += 1

the result is

array([1, 3, 3, 4, 5])

in other words, the duplicates in indices are ignored

if I wanted the duplicates not to be ignored, resulting in:

array([1, 6, 3, 4, 5])

how would I go about this?

the example above is somewhat trivial, what follows is exactly what I am trying to do:

def inflate(self,pressure):
    faceforces = pressure * cross(self.verts[self.faces[:,1]]-self.verts[self.faces[:,0]], self.verts[self.faces[:,2]]-self.verts[self.faces[:,0]])
    self.verts[self.faces[:,0]] += faceforces
    self.verts[self.faces[:,1]] += faceforces
    self.verts[self.faces[:,2]] += faceforces

def constrain_lengths(self):
    vectors = self.verts[self.constraints[:,1]] - self.verts[self.constraints[:,0]]
    lengths = sqrt(sum(square(vectors), axis=1))
    correction = 0.5 * (vectors.T * (1 - (self.restlengths / lengths))).T
    self.verts[self.constraints[:,0]] += correction
    self.verts[self.constraints[:,1]] -= correction

def compute_normals(self):
    facenormals = cross(self.verts[self.faces[:,1]]-self.verts[self.faces[:,0]], self.verts[self.faces[:,2]]-self.verts[self.faces[:,0]])
    self.normals.fill(0)
    self.normals[self.faces[:,0]] += facenormals
    self.normals[self.faces[:,1]] += facenormals
    self.normals[self.faces[:,2]] += facenormals
    lengths = sqrt(sum(square(self.normals), axis=1))
    self.normals = (self.normals.T / lengths).T

Ive been getting some very buggy results as a result of duplicates being ignored in my indexed assignment operations.

like image 993
damien Avatar asked Sep 29 '10 03:09

damien


2 Answers

numpy's histogram function is a scatter operation.

a += histogram(indices, bins=a.size, range=(0, a.size))[0]

You may need to take some care because if indices contains integers, small rounding errors could result in values ending up in the wrong bucket. In which case use:

a += histogram(indices, bins=a.size, range=(-0.5, a.size-0.5))[0]

to get each index into the centre of each bin.

Update: this works. But I recommend using @Eelco Hoogendoorn's answer based on numpy.add.at.

like image 150
sigfpe Avatar answered Sep 20 '22 11:09

sigfpe


Slightly late to the party, but seeing how commonly this operation is required, and the fact that it still does not seem to be a part of standard numpy, ill put my solution here for reference:

def scatter(rowidx, vals, target):
    """compute target[rowidx] += vals, allowing for repeated values in rowidx"""
    rowidx = np.ravel(rowidx)
    vals   = np.ravel(vals)
    cols   = len(vals)
    data   = np.ones(cols)
    colidx = np.arange(cols)
    rows   = len(target)
    from scipy.sparse import coo_matrix
    M = coo_matrix((data,(rowidx,colidx)), shape=(rows, cols))
    target += M*vals
def gather(idx, vals):
    """for symmetry with scatter"""
    return vals[idx]

A custom C routine in numpy could easily be twice as fast still, eliminating the superfluous allocation of and multiplication with ones, for starters, but it makes a world of difference in performance versus a loop in python.

Aside from performance considerations, it is stylistically much more in line with other numpy-vectorized code to use a scatter operation, rather than mash some for loops in your code.

Edit:

Ok, forget about the above. As of the lastest 1.8 release, doing scatter operations is now directly supported in numpy at optimal efficiency.

def scatter(idx, vals, target):
    """target[idx] += vals, but allowing for repeats in idx"""
    np.add.at(target, idx.ravel(), vals.ravel())
like image 33
Eelco Hoogendoorn Avatar answered Sep 18 '22 11:09

Eelco Hoogendoorn