`numpy.mean` used with a tuple as `axis` argument: not working with a masked array

Tags: python, numpy

I have one simple 3D array a1, and its masked analog a2:

import numpy

a1 = numpy.array([[[ 0.00,  0.00,  0.00],
                   [ 0.88,  0.80,  0.78],
                   [ 0.75,  0.78,  0.77]],

                  [[ 0.00,  0.00,  0.00],
                   [ 3.29,  3.29,  3.30],
                   [ 3.27,  3.27,  3.26]],

                  [[ 0.00,  0.00,  0.00],
                   [ 0.41,  0.42,  0.40],
                   [ 0.42,  0.43,  0.41]]])


a2 = numpy.ma.masked_equal(a1, 0.)

I want to take the mean of this array along several axes at once (this is a peculiar, undocumented use of the axis argument in numpy.mean, see e.g. here for an example):

numpy.mean(a1, axis=(0, 1))

This works fine with a1, but I get the following error with the masked array a2:

TypeError: tuple indices must be integers, not tuple

And I get the same error with the masked version numpy.ma.mean(a2, axis=(0, 1)), or if I unmask the array through a2[a2.mask]=0.

I am using a tuple for the axis argument in numpy.mean because it is not hardcoded: this command is applied to arrays with potentially different numbers of dimensions, and the tuple is adapted accordingly.

Problem encountered with numpy versions 1.9.1 and 1.9.2.

asked May 13 '15 08:05

ztl

1 Answer

For a MaskedArray argument, numpy.mean calls MaskedArray.mean, which doesn't support a tuple axis argument. You can get the correct behavior by reimplementing MaskedArray.mean in terms of operations that do support tuples for axis:

def mean(a, axis=None):
    if a.mask is numpy.ma.nomask:
        return super(numpy.ma.MaskedArray, a).mean(axis=axis)

    counts = numpy.logical_not(a.mask).sum(axis=axis)
    if counts.shape:
        sums = a.filled(0).sum(axis=axis)
        mask = (counts == 0)
        return numpy.ma.MaskedArray(data=sums * 1. / counts, mask=mask, copy=False)
    elif counts:
        # Return scalar, not array
        return a.filled(0).sum(axis=axis) * 1. / counts
    else:
        # Masked scalar
        return numpy.ma.masked

or, if you're willing to rely on MaskedArray.sum working with a tuple axis (which you likely are, given that you're using undocumented behavior of numpy.mean),

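def mean(a, axis=None):
    if a.mask is numpy.ma.nomask:
        return super(numpy.ma.MaskedArray, a).mean(axis=axis)

    # MaskedArray.sum leaves masked values out of the sum
    sums = a.sum(axis=axis)
    # Number of unmasked values behind each output element
    counts = numpy.logical_not(a.mask).sum(axis=axis)
    return sums * 1. / counts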

where we're relying on MaskedArray.sum to handle the mask.
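
As a quick illustration, here is a sketch that applies the first version to the arrays from the question. The name masked_mean is only a placeholder chosen for this example so that it does not shadow anything; the body is the first function unchanged. The approximate values in the comments come from summing the six unmasked entries of each column by hand.

```python
import numpy


def masked_mean(a, axis=None):
    # Same logic as the first `mean` above; `masked_mean` is a
    # placeholder name used only in this example.
    if a.mask is numpy.ma.nomask:
        return super(numpy.ma.MaskedArray, a).mean(axis=axis)

    # Count the unmasked values contributing to each output element
    counts = numpy.logical_not(a.mask).sum(axis=axis)
    if counts.shape:
        sums = a.filled(0).sum(axis=axis)
        mask = (counts == 0)
        return numpy.ma.MaskedArray(data=sums * 1. / counts, mask=mask, copy=False)
    elif counts:
        # Return scalar, not array
        return a.filled(0).sum(axis=axis) * 1. / counts
    else:
        # Masked scalar
        return numpy.ma.masked


a1 = numpy.array([[[0.00, 0.00, 0.00],
                   [0.88, 0.80, 0.78],
                   [0.75, 0.78, 0.77]],

                  [[0.00, 0.00, 0.00],
                   [3.29, 3.29, 3.30],
                   [3.27, 3.27, 3.26]],

                  [[0.00, 0.00, 0.00],
                   [0.41, 0.42, 0.40],
                   [0.42, 0.43, 0.41]]])
a2 = numpy.ma.masked_equal(a1, 0.)

masked_result = masked_mean(a2, axis=(0, 1))
plain_result = numpy.mean(a1, axis=(0, 1))

print(masked_result)  # roughly [1.503 1.498 1.487]: the zeros are excluded
print(plain_result)   # roughly [1.002 0.999 0.991]: the zeros are included
```

The two results differ because the masked version divides each column sum by 6 (the unmasked entries) instead of 9.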

I have only lightly tested these functions; before using them, make sure they actually work, and write some tests. For example, if the output is 0-dimensional and there are no masked values, whether the output is a 0D MaskedArray or a scalar depends on whether the input mask is nomask or an array of all False. This is the same as the default MaskedArray.mean behavior, but it may not be what you want; I suspect the default behavior is a bug.
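
One possible way to write such tests, sketched under some assumptions: build a reference result that only ever passes an integer axis, by moving the reduced axes to the end and flattening them into a single axis, then compare a candidate function against it. The helpers mean_by_reshape, check_mean and sum_over_count are hypothetical names made up for this sketch; sum_over_count is just a stand-in candidate and should be swapped for whichever mean function is being tested.

```python
import numpy


def mean_by_reshape(a, axes):
    # Hypothetical reference: collapse the reduced axes into one trailing
    # axis, so that only an integer `axis` is ever passed to mean().
    axes = tuple(ax % a.ndim for ax in axes)  # normalize negative axes
    kept = tuple(ax for ax in range(a.ndim) if ax not in axes)
    moved = a.transpose(kept + axes)
    flat = moved.reshape(tuple(a.shape[ax] for ax in kept) + (-1,))
    return flat.mean(axis=-1)


def check_mean(candidate, a, axes):
    # True if candidate(a, axis=axes) matches the reference in both
    # mask and unmasked values.
    got = candidate(a, axis=axes)
    want = mean_by_reshape(a, axes)
    if not numpy.array_equal(numpy.ma.getmaskarray(got),
                             numpy.ma.getmaskarray(want)):
        return False
    return numpy.allclose(numpy.ma.filled(got, 0.), numpy.ma.filled(want, 0.))


def sum_over_count(a, axis=None):
    # Stand-in candidate (an assumption of this sketch): replace it with
    # the `mean` function under test.
    counts = numpy.logical_not(numpy.ma.getmaskarray(a)).sum(axis=axis)
    sums = a.filled(0).sum(axis=axis)
    return numpy.ma.masked_array(sums, mask=(counts == 0)) / counts


# Test data: one fully masked slice plus one isolated masked value
a = numpy.ma.masked_array(numpy.arange(1., 25.).reshape(2, 3, 4))
a[:, 0, :] = numpy.ma.masked
a[1, 2, 3] = numpy.ma.masked

results = {axes: check_mean(sum_over_count, a, axes)
           for axes in [(0, 1), (0, 2), (1, 2)]}
print(results)
```

Cases worth adding include an input whose mask is nomask, an input whose mask is an array of all False, and reductions over every axis, since those are the situations where the return type can change.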

answered Sep 30 '22 18:09

user2357112 supports Monica
