
Softmax function of a numpy array by row

I am trying to apply a softmax function to a numpy array. But I am not getting the desired results. This is the code I have tried:

 import numpy as np
 x = np.array([[1001,1002],[3,4]])
 softmax = np.exp(x - np.max(x))/(np.sum(np.exp(x - np.max(x))))
 print(softmax)

I think the x - np.max(x) code is not subtracting the max of each row. The max needs to be subtracted from x to prevent very large numbers.

This is supposed to output

 np.array([
    [0.26894142, 0.73105858],
    [0.26894142, 0.73105858]])

But I am getting:

np.array([
    [0.26894142, 0.73105858],
    [0, 0]])
Pranay Aryal asked Apr 08 '17 04:04



4 Answers

A convenient way to keep the axes that are consumed by "reduce" operations such as max or sum is the keepdims keyword:

mx = np.max(x, axis=-1, keepdims=True)
mx
# array([[1002],
#        [   4]])
x - mx
# array([[-1,  0],
#        [-1,  0]])
numerator = np.exp(x - mx)
denominator = np.sum(numerator, axis=-1, keepdims=True)
denominator
# array([[ 1.36787944],
#        [ 1.36787944]])
numerator/denominator
# array([[ 0.26894142,  0.73105858],
#        [ 0.26894142,  0.73105858]])
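
The steps above can be collected into a small helper. This is only a minimal sketch: the name `softmax_rows` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def softmax_rows(x):
    """Softmax along the last axis (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    # keepdims=True leaves the reduced axis in place with length 1,
    # so the maxima broadcast against x row by row
    shifted = x - np.max(x, axis=-1, keepdims=True)
    numerator = np.exp(shifted)
    # same trick for the row sums in the denominator
    return numerator / np.sum(numerator, axis=-1, keepdims=True)

x = np.array([[1001, 1002], [3, 4]])
result = softmax_rows(x)
print(result)
```

Both rows come out identical here because softmax only depends on the differences between entries within a row, and both rows differ by 1.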
Paul Panzer answered Oct 02 '22 20:10


My 5-liner (which uses scipy logsumexp for the tricky bits):

def softmax(a, axis=None):
    """
    Computes exp(a)/sumexp(a); relies on scipy logsumexp implementation.
    :param a: ndarray/tensor
    :param axis: axis to sum over; default (None) sums over everything
    """
    from scipy.special import logsumexp
    lse = logsumexp(a, axis=axis)  # this reduces along axis
    if axis is not None:
        lse = np.expand_dims(lse, axis)  # restore that axis for subtraction
    return np.exp(a - lse)

You may have to use from scipy.misc import logsumexp if you have an older scipy version.
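
For anyone curious about what the "tricky bits" are, the log-sum-exp idea can be sketched in plain NumPy. This is not scipy's actual implementation, only an illustration of the identity log(sum(exp(a))) = m + log(sum(exp(a - m))) with m the maximum along the axis. The names `logsumexp_sketch` and `softmax_via_lse` are made up for this example.

```python
import numpy as np

def logsumexp_sketch(a, axis=None):
    """Stable log(sum(exp(a))) along `axis`, keeping the reduced axis."""
    a = np.asarray(a, dtype=float)
    # shifting by the maximum keeps every exponent <= 0, so exp cannot overflow
    m = np.max(a, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))

def softmax_via_lse(a, axis=None):
    # softmax(a) = exp(a) / sum(exp(a)) = exp(a - logsumexp(a))
    return np.exp(a - logsumexp_sketch(a, axis=axis))

x = np.array([[1001, 1002], [3, 4]])
rows = softmax_via_lse(x, axis=1)   # each row sums to 1
whole = softmax_via_lse(x)          # axis=None: the whole array sums to 1
print(rows)
```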

Yibo Yang answered Oct 02 '22 19:10


EDIT. As of version 1.2.0, scipy includes softmax as a special function:

https://scipy.github.io/devdocs/generated/scipy.special.softmax.html
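
Assuming scipy 1.2.0 or later is installed, the row-wise case from the question would look roughly like this:

```python
import numpy as np
from scipy.special import softmax

x = np.array([[1001, 1002], [3, 4]])
# axis=1 normalizes each row separately; the default axis=None
# normalizes over the whole array instead
result = softmax(x, axis=1)
print(result)
```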

I wrote a very general softmax function operating over an arbitrary axis, including the tricky max subtraction bit. The function is below, and I wrote a blog post about it here.

def softmax(X, theta = 1.0, axis = None):
    """
    Compute the softmax of each element along an axis of X.

    Parameters
    ----------
    X: ND-Array. Probably should be floats. 
    theta (optional): float parameter, used as a multiplier
        prior to exponentiation. Default = 1.0
    axis (optional): axis to compute values along. Default is the 
        first non-singleton axis.

    Returns an array the same size as X. The result will sum to 1
    along the specified axis.
    """

    # make X at least 2d
    y = np.atleast_2d(X)

    # find axis
    if axis is None:
        axis = next(j[0] for j in enumerate(y.shape) if j[1] > 1)

    # multiply y against the theta parameter, 
    y = y * float(theta)

    # subtract the max for numerical stability
    y = y - np.expand_dims(np.max(y, axis = axis), axis)

    # exponentiate y
    y = np.exp(y)

    # take the sum along the specified axis
    ax_sum = np.expand_dims(np.sum(y, axis = axis), axis)

    # finally: divide elementwise
    p = y / ax_sum

    # flatten if X was 1D
    if len(X.shape) == 1: p = p.flatten()

    return p
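
To illustrate what the theta multiplier does, here is a compact stand-in, not the function above: `softmax_theta` is a made-up name and it uses keepdims in place of expand_dims. Larger theta values push the output towards the largest entry, smaller ones flatten it.

```python
import numpy as np

def softmax_theta(x, theta=1.0, axis=-1):
    """Softmax of theta * x along `axis` (illustrative stand-in)."""
    y = np.asarray(x, dtype=float) * float(theta)
    y = y - np.max(y, axis=axis, keepdims=True)  # shift for numerical stability
    e = np.exp(y)
    return e / np.sum(e, axis=axis, keepdims=True)

scores = np.array([1.0, 2.0, 3.0])
flat = softmax_theta(scores, theta=0.1)      # close to uniform
default = softmax_theta(scores, theta=1.0)
sharp = softmax_theta(scores, theta=10.0)    # almost all mass on the last entry
print(flat, default, sharp)
```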
Nolan Conaway answered Oct 02 '22 19:10


The x - np.max(x) code does not subtract row-wise: np.max(x) with no axis argument returns a single scalar, the maximum of the whole array. Let's do it step by step. First, build a 'maxes' array by tiling the vector of row maxima:

maxes = np.tile(np.max(x, 1), (x.shape[1], 1)).T

This creates a matrix with the same shape as x (2x2 here) in which every entry of a row holds that row's maximum. After this you can do:

 x = np.exp(x - maxes)/(np.sum(np.exp(x - maxes), axis = 1, keepdims = True))

You should get your result with this. axis = 1 gives the row-wise softmax mentioned in the heading of the question, and keepdims = True keeps the row sums as a column so that each row is divided by its own sum. Hope this helps.
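
The tiled copy is not strictly needed, since NumPy broadcasting can do the same job. A sketch, using a non-square array so that a shape mismatch would surface (the values are made up for illustration):

```python
import numpy as np

x = np.array([[1001.0, 1002.0, 1003.0], [3.0, 4.0, 6.0]])

# np.max(x, axis=1) has shape (2,); indexing with [:, None] turns it into
# a (2, 1) column that broadcasts across each row, like the tiled matrix
maxes = np.max(x, axis=1)[:, None]
exps = np.exp(x - maxes)

# the row sums need the same (2, 1) shape before dividing
result = exps / np.sum(exps, axis=1)[:, None]
print(result.sum(axis=1))
```

Dividing by the plain shape (2,) sums would either raise a broadcasting error, as it would for this 2x3 array, or silently divide columns instead of rows when the array is square.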

Pranay Aryal answered Oct 02 '22 21:10