The function torch.nn.functional.softmax takes two parameters: input and dim. According to its documentation, the softmax operation is applied to all slices of input along the specified dim, and will rescale them so that the elements lie in the range (0, 1) and sum to 1.
Let input be:
input = torch.randn((3, 4, 5, 6))
Suppose I want the following, so that every entry in that array is 1:
sum = torch.sum(input, dim = 3) # sum's size is (3, 4, 5)
How should I apply softmax?
softmax(input, dim = 0) # Way Number 0
softmax(input, dim = 1) # Way Number 1
softmax(input, dim = 2) # Way Number 2
softmax(input, dim = 3) # Way Number 3
My intuition tells me it is the last one, but I am not sure, since English is not my first language and the use of the word "along" seemed confusing to me.
I am not very clear on what "along" means, so I will use an example that could clarify things: suppose we have a tensor of size (s1, s2, s3, s4), and I want the entries to sum to 1 along the last dimension (the one of size s4) after applying softmax.
softmax(x, dim=-1)
The dim argument is required unless your input tensor is a vector. It specifies the axis along which to apply the softmax activation. Passing dim=-1 applies softmax to the last dimension, so after this the elements of the last dimension will sum to 1.
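For instance, a quick sanity check on the question's tensor (this snippet is only an illustrative sketch, not part of the answer):
import torch
import torch.nn.functional as F

input = torch.randn((3, 4, 5, 6))
out = F.softmax(input, dim=-1)  # for a 4-D tensor, dim=-1 is the same as dim=3
print(out.sum(dim=-1))          # shape (3, 4, 5); every entry is (numerically) 1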
The softmax function is used as the activation function in the output layer of neural network models that predict a multinomial probability distribution. That is, softmax is used as the activation function for multi-class classification problems where class membership is required on more than two class labels.
The main purpose of the softmax function is to transform the (unnormalised) output of K units of a fully-connected layer (e.g. represented as a vector of K elements) into a probability distribution (a normalised output), often represented as a vector of K elements, each of which lies between 0 and 1 and which together sum to 1.
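As a rough illustration of that use (a minimal sketch with made-up layer sizes, not taken from any answer above):
import torch
import torch.nn as nn
import torch.nn.functional as F

fc = nn.Linear(10, 3)            # hypothetical output layer: 10 features in, K = 3 classes
x = torch.randn(5, 10)           # a batch of 5 examples
logits = fc(x)                   # unnormalised outputs, shape (5, 3)
probs = F.softmax(logits, dim=1) # each row becomes a probability distribution
print(probs.sum(dim=1))          # every row sums to (numerically) 1
Note that for training, nn.CrossEntropyLoss expects the raw logits and applies log-softmax internally, so the explicit softmax is usually only needed at inference time.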
Steven's answer is not correct. See the snapshot below; it is actually the other way around.
Image transcribed as code:
>>> x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
>>> F.softmax(x, dim=0)
tensor([[0.1192, 0.1192],
        [0.8808, 0.8808]])
>>> F.softmax(x, dim=1)
tensor([[0.2689, 0.7311],
        [0.2689, 0.7311]])
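In other words, dim=0 normalises down each column and dim=1 normalises across each row. A quick check of that (added here for illustration, not part of the transcribed image):
import torch
import torch.nn.functional as F

x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
print(F.softmax(x, dim=0).sum(dim=0))  # each column sums to 1
print(F.softmax(x, dim=1).sum(dim=1))  # each row sums to 1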