Implementing Adversarial Training in TensorFlow

I would like to implement the following cost function for my neural network:

J̃(θ, x, y) = α J(θ, x, y) + (1 − α) J(θ, x + ε sign(∇_x J(θ, x, y)), y)

This makes use of adversarial inputs to the neural network to improve generalization [ref].

Specifically, I am having trouble with the J(θ, x + ε sign(∇_x J(θ, x, y)), y) term. In my TensorFlow graph, I have defined J(θ, x, y) as an operation, with x as a placeholder. How can I feed J with an argument other than x?

The only way I have found to do this so far is to define a parallel network that shares weights with my original network, then feed the perturbed inputs to that copy through its feed_dict argument. If possible, I would like to avoid redefining my entire network. How can I do this?


My TensorFlow model is written as:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 32, 32])
... # A simple neural network producing hidden activations h; t is the target placeholder
y = tf.add(tf.matmul(h, W1), b1)  # output logits
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=t))

Possibly relevant:

tf.stop_gradient(input, name=None)

Stops gradient computation.

...lots more stuff...

  • Adversarial training, where no backprop should happen through the adversarial example generation process.

https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#stop_gradient
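
To make the problem concrete, here is a rough sketch of what I am imagining (epsilon and alpha would be hyperparameters I pick myself; the final comment marks exactly the piece I do not know how to write without a parallel network):

grad = tf.gradients(cross_entropy, x)[0]  # gradient of J with respect to the inputs
x_adv = x + epsilon * tf.sign(grad)       # fast gradient sign perturbation
x_adv = tf.stop_gradient(x_adv)           # no backprop through the generation step
# How do I now evaluate the network's cost at x_adv without rebuilding the graph?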

asked Mar 29 '16 by Shadowen

1 Answer

You need to write your model in such a way that it supports calls like

output = model.fprop(input_tensor)

or

output = model.fprop(input_tensor, params)

The fprop method builds the same forward propagation expression twice, but with a different input tensor on each call:

raw_output = model.fprop(clean_examples)
adv_examples = ...
adv_output = model.fprop(adv_examples)
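
As a minimal sketch of that pattern (not the code of any particular library; the layer sizes and variable names here are made up), create the variables once in the constructor and reuse them on every fprop call, so all outputs share weights automatically:

import tensorflow as tf

class Model(object):
    def __init__(self, n_in, n_hidden, n_out):
        # Variables are created exactly once; every fprop call reuses them.
        self.W0 = tf.Variable(tf.truncated_normal([n_in, n_hidden], stddev=0.1))
        self.b0 = tf.Variable(tf.zeros([n_hidden]))
        self.W1 = tf.Variable(tf.truncated_normal([n_hidden, n_out], stddev=0.1))
        self.b1 = tf.Variable(tf.zeros([n_out]))

    def fprop(self, input_tensor):
        # Builds a fresh forward-propagation expression for whatever tensor is
        # passed in, while sharing the same underlying variables.
        h = tf.nn.relu(tf.matmul(input_tensor, self.W0) + self.b0)
        return tf.matmul(h, self.W1) + self.b1

Both losses then live in a single graph, and no feed_dict tricks are needed:

model = Model(1024, 256, 10)
clean_output = model.fprop(clean_examples)
adv_output = model.fprop(adv_examples)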

If you want to apply this to one of our open source models and it doesn't support this interface, file an issue on GitHub.

answered Oct 08 '22 by Ian Goodfellow