Can you reverse a PyTorch neural network and activate the inputs from the outputs?

Tags: python, neural-network, pytorch

Can we activate the outputs of a NN to gain insight into how the neurons are connected to input features?

Take this basic NN example from the PyTorch tutorials, which trains a model on (x, y) pairs:

import torch

N, D_in, H, D_out = 64, 1000, 100, 10

x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
for t in range(500):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

After I've finished training the network to predict y from x inputs, is it possible to reverse the trained NN so that it can predict x from y inputs?

I don't expect the reconstructed x to match the original inputs that produced the y outputs. Instead, I expect to see which features the model activates on to relate x and y.

If it is possible, then how do I rearrange the Sequential model without breaking all the weights and connections?

948

asked Jan 23 '20 12:01

Reactgular

2 Answers

It is possible but only for very special cases. For a feed-forward network (Sequential) each of the layers needs to be reversible; that means the following arguments apply to each layer separately. The transformation associated with one layer is y = activation(W*x + b) where W is the weight matrix and b the bias vector. In order to solve for x we need to perform the following steps:

1. Reverse the activation; not all activation functions have an inverse though. For example, ReLU maps every negative input to 0, so it has no inverse on (-inf, 0). If we used tanh, on the other hand, we could use its inverse, which is 0.5 * log((1 + x) / (1 - x)).
2. Solve W*x = inverse_activation(y) - b for x; for a unique solution to exist, W must be square (the same number of rows and columns) and det(W) must be non-zero. We can control the former by choosing a specific network architecture, while the latter depends on the training process.

So for a neural network to be reversible it must have a very specific architecture: all layers must have the same number of input and output neurons (i.e. square weight matrices) and the activation functions all need to be invertible.
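As a quick illustration of the activation requirement, here is a minimal sketch in plain Python (no PyTorch needed). The helper names `relu` and `tanh_inverse` are made up for this example; the tanh inverse is the formula quoted above.

```python
import math

def relu(v):
    # ReLU keeps positive values and maps everything else to 0.
    return max(0.0, v)

def tanh_inverse(v):
    # Inverse of tanh for -1 < v < 1, using the formula from the answer.
    return 0.5 * math.log((1 + v) / (1 - v))

# Two different negative inputs give the same ReLU output,
# so the original input cannot be recovered from the output.
print(relu(-2.0), relu(-0.5))

# tanh is one-to-one, so applying the inverse recovers the input
# (up to floating-point error).
for v in (-1.5, 0.0, 0.7):
    print(v, tanh_inverse(math.tanh(v)))
```

The same reasoning applies per element when the activation is applied to a whole layer output.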

Code: Using PyTorch we will have to do the inversion of the network manually, both in terms of solving the system of linear equations as well as finding the inverse activation function. Consider the following example of a 1-layer neural network (since the steps apply to each layer separately extending this to more than 1 layer is trivial):

import torch

N = 10  # number of samples
n = 3   # number of neurons per layer

x = torch.randn(N, n)

model = torch.nn.Sequential(
    torch.nn.Linear(n, n), torch.nn.Tanh()
)

with torch.no_grad():  # no gradients are needed for the inversion
    y = model(x)

    z = y  # use 'z' for the reverse result, start with the model's output 'y'.
    for step in reversed(list(model.children())):
        if isinstance(step, torch.nn.Linear):
            # Linear computes x @ W.T + b, so solve W @ X = (z - b).T for X and transpose back.
            # 'torch.linalg.solve' replaces 'torch.solve', which was deprecated and later removed.
            z = torch.linalg.solve(step.weight, (z - step.bias).T).T
        elif isinstance(step, torch.nn.Tanh):
            z = 0.5 * torch.log((1 + z) / (1 - z))  # inverse of tanh, same as torch.atanh(z)

print('Agreement between x and z: ', torch.dist(x, z))
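For more than one layer, the same loop can be wrapped in a function. The following is only a sketch under a few assumptions that are not part of the answer: the helper name `invert_sequential` is made up, the model is cast to float64, and the weights are re-initialised with `torch.nn.init.orthogonal_` so that every weight matrix is well-conditioned. A trained network gives no such guarantee, and the inversion can become numerically unstable when det(W) is close to zero or tanh saturates.

```python
import torch

def invert_sequential(model, y):
    """Undo a Sequential of square Linear and Tanh layers, last layer first."""
    z = y
    with torch.no_grad():
        for step in reversed(list(model.children())):
            if isinstance(step, torch.nn.Linear):
                # Linear computes x @ W.T + b, so solve W @ X = (z - b).T.
                z = torch.linalg.solve(step.weight, (z - step.bias).T).T
            elif isinstance(step, torch.nn.Tanh):
                z = torch.atanh(z)
            else:
                raise ValueError(f"no inverse implemented for {type(step).__name__}")
    return z

torch.manual_seed(0)
n = 3  # every layer must map n inputs to n outputs

model = torch.nn.Sequential(
    torch.nn.Linear(n, n), torch.nn.Tanh(),
    torch.nn.Linear(n, n), torch.nn.Tanh(),
).double()

# Assumption for this sketch only: orthogonal weights keep the solve stable.
for layer in model:
    if isinstance(layer, torch.nn.Linear):
        torch.nn.init.orthogonal_(layer.weight)

x = torch.randn(10, n, dtype=torch.float64)
with torch.no_grad():
    y = model(x)

z = invert_sequential(model, y)
print('Agreement between x and z: ', torch.dist(x, z).item())
```

Any layer type without a known inverse (ReLU, non-square Linear, pooling) raises an error here instead of silently producing a wrong result.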

122

answered Oct 02 '22 16:10

a_guest

If I've understood correctly, there are two questions here:

1. Is it possible to determine what features in the input have activated neurons?
2. If so, is it possible to use this information to generate samples from p(x|y)?

Regarding 1, a basic way to determine whether a neuron is sensitive to an input feature x_i is to compute the gradient of this neuron's output w.r.t. x_i. A gradient with a large magnitude indicates sensitivity to that particular input element. There is a rich literature on the subject; for example, you can have a look at guided backpropagation or at Grad-CAM (the latter is about classification with convnets, but it does contain useful ideas).
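A minimal sketch of this plain gradient approach (not guided backpropagation or Grad-CAM) could look as follows. The helper name `input_sensitivity` is made up for illustration, and the model is the untrained architecture from the question, so the ranking printed here carries no meaning until the network has been trained.

```python
import torch

def input_sensitivity(model, x, output_index):
    """Absolute gradient of one output neuron w.r.t. every input feature."""
    x = x.detach().clone().requires_grad_(True)
    out = model(x)  # shape (N, D_out)
    # Each sample's output depends only on its own input row, so the gradient
    # of the sum gives the per-sample gradients in a single call.
    (grad,) = torch.autograd.grad(out[:, output_index].sum(), x)
    return grad.abs()

torch.manual_seed(0)
N, D_in, H, D_out = 64, 1000, 100, 10

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

x = torch.randn(N, D_in)
sensitivity = input_sensitivity(model, x, output_index=0)  # shape (N, D_in)

# Average over the samples and list the five most influential input features.
top_features = sensitivity.mean(dim=0).topk(5).indices
print(top_features)
```

Unlike the exact inversion, this works with ReLU and non-square layers, because it only needs the forward pass to be differentiable.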

As for 2, I don't think that your approach to "reversing the problem" is correct. The problem is that your network is discriminative and what it outputs can be seen as argmax_y p(y|x). Note that this is a point-wise estimation, not a full modeling of the distribution. However, the inverse problem that you're interested in seems to be sampling from

p(x|y)=constant*p(y|x)p(x).

You don't know how to sample from p(y|x) and you don't know anything about p(x). Even if you use a method to discover correlations between the neurons and specific input features, you have only discovered which features were more important to the network's prediction, but depending on the nature of y this might be insufficient. Consider a toy example where your inputs x are 2d points distributed according to some distribution in R^2 and where the output y is binary, such that any (a,b) in R^2 is classified as 1 if a<1 and as 0 if a>1. Then a discriminative network could learn the vertical line a=1 as its decision boundary. Inspecting correlations between neurons and input features will reveal that only the first coordinate was useful in this prediction, but this information is not sufficient for sampling from the full 2d distribution of inputs.
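The toy example can be made concrete with a short simulation. This is only an illustrative sketch: the two input distributions (`prior_one`, `prior_two`) and the helper `mean_b_given_label` are hypothetical choices, not something taken from the question. Both distributions lead to exactly the same classifier, yet the inputs that correspond to the label y=1 look very different.

```python
import random

def classify(a, b):
    # The decision rule from the toy example: only the first coordinate matters.
    return 1 if a < 1 else 0

def mean_b_given_label(sample_point, label, n_samples=10_000, seed=0):
    """Estimate the mean of the second coordinate among points with a given label."""
    rng = random.Random(seed)
    points = (sample_point(rng) for _ in range(n_samples))
    kept = [b for a, b in points if classify(a, b) == label]
    return sum(kept) / len(kept)

def prior_one(rng):
    # Hypothetical input distribution: b is centred at 0.
    return rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)

def prior_two(rng):
    # Same distribution for a, but b is centred at 10.
    return rng.gauss(0.0, 1.0), rng.gauss(10.0, 1.0)

mean_one = mean_b_given_label(prior_one, label=1)
mean_two = mean_b_given_label(prior_two, label=1)
print(mean_one, mean_two)
```

The classifier is identical in both cases, so nothing inside it can tell you which of the two values of b to generate; that information lives in p(x) only.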

I think that variational autoencoders (VAEs) could be what you're looking for.

answered Oct 02 '22 16:10

Ash
