 

Is there a method in PyTorch to count the number of unique values in a way that can be backpropagated?

Given the following tensor (which is the result of a network [note the grad_fn]):

tensor([121., 241., 125.,   1., 108., 238., 125., 121.,  13., 117., 121., 229.,
        161.,  13.,   0., 202., 161., 121., 121.,   0., 121., 121., 242., 125.],
       grad_fn=<MvBackward>)

Which we will define as:

xx = torch.tensor([121., 241., 125.,   1., 108., 238., 125., 121.,  13., 117., 121., 229.,
        161.,  13.,   0., 202., 161., 121., 121.,   0., 121., 121., 242., 125.]).requires_grad_(True)

I would like to define an operation which counts the number of occurrences of each value in such a way that the operation will output the following tensor:

tensor([2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
        0, 7, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
        0, 1, 1])

i.e. there are 2 zeros, 1 one, 2 thirteens, etc. The total number of possible values is set upstream, but in this example it is 243.
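For reference, one non-differentiable way to produce exactly this tensor (a sketch only; it breaks the graph, which is the whole problem):

torch.bincount(xx.long(), minlength=243)  # exact counts, but .long() and bincount do not propagate gradients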

So far I have tried the following approaches, which successfully produce the desired tensor, but do not do so in a way that allows computing gradients back through the network:

Attempt 1

tt = []
for i in range(243):
    # (xx == i) is a bool tensor with no grad_fn, so the result
    # is detached from the computation graph
    tt.append((xx == i).unsqueeze(0))
torch.cat(tt, dim=0).sum(dim=1)

Attempt 2

tvtensor = torch.tensor([i for i in range(243)]).unsqueeze(1).repeat(1, xx.shape[0]).float().requires_grad_(True)
# the comparison itself is non-differentiable, regardless of requires_grad
(xx == tvtensor).sum(dim=1)

EDIT: Added Attempt

Attempt 3

-- I didn't really expect this to backprop, but figured I would give it a try anyway

ll = torch.zeros((1, 243))
for x in xx:
    # integer indexing with .long() severs the connection to x's graph
    ll[0, x.long()] += 1

Any help is appreciated.

EDIT: As requested, the end goal of this is the following:

I am using a technique for calculating structural similarity between two time sequences. One is real and the other is generated. The technique is outlined in this paper (https://link.springer.com/chapter/10.1007/978-3-642-02279-1_33), where a time series is converted to a sequence of code words and the distribution of code words (similar to the way that Bag of Words is used in NLP) is used to represent the time series. Two series are considered similar when their code-word distributions are similar. This is what the counting-statistics tensor is for.

What is desired is to be able to construct a loss function which consumes this tensor and measures the distance between the real and generated signals (the Euclidean norm on the time-domain data directly does not work well, and this approach claimed better results), so that it can update the generator appropriately.

asked Oct 28 '19 by jwelch1324



2 Answers

I would do it with the unique method (but only to count occurrences):

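For example, a minimal sketch using the xx tensor from the question (torch.unique sorts the distinct values by default):

torch.unique(xx)
# tensor([  0.,   1.,  13., 108., 117., 121., 125., 161., 202., 229., 238., 241., 242.])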

If you want to count the occurrences, you have to add the parameter return_counts=True:

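Again a sketch with the same xx (the counts line up with the sorted values):

vals, counts = torch.unique(xx, return_counts=True)
# vals   -> tensor([  0.,   1.,  13., 108., 117., 121., 125., 161., 202., 229., 238., 241., 242.])
# counts -> tensor([2, 1, 2, 1, 1, 7, 3, 2, 1, 1, 1, 1, 1])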

I did this in version 1.3.1.


This is the fast way to count occurrences; however, it is a non-differentiable operation, so it will not backpropagate (I have described it anyway for completeness). To achieve what you want, I think you should turn the input into a distribution by means of a differentiable function (softmax is the most used) and then use some measure of the distance between the output and target distributions, such as cross-entropy, KL (Kullback-Leibler) divergence, JS (Jensen-Shannon) divergence, or the Wasserstein distance.
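For illustration, here is a minimal sketch of that idea, using the xx tensor from the question (the soft_histogram helper, its temperature knob, and the random stand-in target are purely illustrative, not a standard API): each element becomes a soft one-hot row over the 243 candidate values, the rows sum to a differentiable approximation of the count vector, and the two normalized count vectors are compared with KL divergence.

import torch
import torch.nn.functional as F

def soft_histogram(x, num_values=243, temperature=0.1):
    # negative distance from each element to each candidate value 0..num_values-1
    centers = torch.arange(num_values, dtype=x.dtype, device=x.device)
    logits = -torch.abs(x.unsqueeze(1) - centers) / temperature  # shape (N, num_values)
    # softmax turns each row into a soft one-hot; summing the rows gives a
    # differentiable approximation of the count vector (lower temperature ->
    # closer to the hard counts, but weaker gradients)
    return F.softmax(logits, dim=1).sum(dim=0)

real_codewords = torch.randint(0, 243, (24,)).float()  # stand-in for the real sequence
fake_counts = soft_histogram(xx)
real_counts = soft_histogram(real_codewords)
p = fake_counts / fake_counts.sum()  # normalize counts into distributions
q = real_counts / real_counts.sum()
loss = F.kl_div(p.log(), q, reduction='sum')  # KL(q || p) in PyTorch's convention
loss.backward()  # gradients now flow back into xx (and the network that produced it)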

answered Sep 24 '22 by Julio CamPlaz


You will not be able to do that, as unique is simply a non-differentiable operation.

Furthermore, only floating-point tensors can have gradients, as the gradient is defined only over the real numbers, not over the integers.
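For example (the exact wording of the error varies across versions):

torch.tensor([1, 2, 3]).requires_grad_(True)
# RuntimeError: only Tensors of floating point dtype can require gradients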

Still, there might be another, differentiable way to achieve what you want, but that's a different question.

answered Sep 21 '22 by Szymon Maszke