In the blog post Breaking Linear Classifiers on ImageNet, the author proposes the following way to create adversarial images that fool ConvNets:
In short, to create a fooling image we start from whatever image we want (an actual image, or even a noise pattern), and then use backpropagation to compute the gradient of the image pixels on any class score, and nudge it along. We may, but do not have to, repeat the process a few times. You can interpret backpropagation in this setting as using dynamic programming to compute the most damaging local perturbation to the input. Note that this process is very efficient and takes negligible time if you have access to the parameters of the ConvNet (backprop is fast), but it is possible to do this even if you do not have access to the parameters but only to the class scores at the end. In this case, it is possible to compute the data gradient numerically, or to use other local stochastic search strategies, etc. Note that due to the latter approach, even non-differentiable classifiers (e.g. Random Forests) are not safe (but I haven't seen anyone empirically confirm this yet).
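To make the quoted recipe concrete, here is a minimal NumPy sketch with a hypothetical linear classifier (random W, b) standing in for the ConvNet; for a linear model the gradient of a class score with respect to the pixels is just the corresponding weight row, which is the analogue of what backprop computes for a deep net. All names below (W, b, target, eps) are made up for illustration:

```python
import numpy as np

# Toy sketch of the quoted recipe, with a hypothetical linear classifier
# scores = W @ x + b standing in for the ConvNet. W, b and target are made up.
rng = np.random.default_rng(0)
n_pixels, n_classes = 784, 10
W = rng.normal(size=(n_classes, n_pixels))
b = rng.normal(size=n_classes)
x = rng.uniform(size=n_pixels)           # the image we start from

def class_score(img, c):
    return float((W @ img + b)[c])

target = 6                               # class we want the model to predict

# For a linear model, the gradient of the target class score w.r.t. the
# pixels is just the corresponding weight row; backprop computes the
# analogous quantity for a deep net.
grad = W[target]

# Nudge the pixels a small step along the (sign of the) gradient.
eps = 0.1
x_adv = np.clip(x + eps * np.sign(grad), 0.0, 1.0)
```

Repeating the nudge (recomputing grad each time for a nonlinear model) gives the iterated variant the quote mentions.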
I know I can calculate the gradient of an image like this:
np.gradient(img)
But how do I compute the gradient of an image relative to another image class using TensorFlow or NumPy? I probably need to do something similar to the process in this tutorial, such as:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y_conv, labels=y_))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
But I'm not sure exactly how. Specifically, I have an image of the digit 2, shown below:
array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       ...,
       [ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ]], dtype=float32)

(a 28x28 float32 array of pixel intensities in [0, 1]; the mostly-zero rows are elided here)
How do I compute the gradient of this image relative to the digit 6 image class (with an example shown below)? (I guess I need to compute the gradient for all digit 6 images using backpropagation.)
array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       ...,
       [ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ]], dtype=float32)

(a 28x28 float32 array of pixel intensities in [0, 1]; the mostly-zero rows are elided here)
Thanks in advance for any help!
Here are two related questions that I asked:
How to use image and weight matrix to create adversarial images in TensorFlow?
How to create adversarial images for ConvNet?
And here's my script.
If you only have access to class scores for any image, there's not much fancy you can do to truly compute a gradient.

If what is returned can be seen as a relative score for each category, it is a vector v that is the result of some function f acting on a vector A that contains all the information in the image. The true gradient of the function is given by the matrix D(A), which depends on A, such that D(A)*B = (f(A + epsilon*B) - f(A))/epsilon in the limit of small epsilon, for any B. You could approximate this numerically using some small value of epsilon and a number of test vectors B (one for each element of A should be enough), but this is likely to be needlessly expensive.
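That finite-difference estimate can be sketched as follows; `f` here is a made-up stand-in for the black-box class score (in practice it would be a query to the classifier), and the cost is one probe per element of A:

```python
import numpy as np

# Sketch of the finite-difference estimate: probe a black-box score f one
# coordinate at a time, so D(A)*e_i ~= (f(A + eps*e_i) - f(A)) / eps.
def numerical_gradient(f, A, eps=1e-6):
    grad = np.zeros_like(A)
    fA = f(A)
    for i in range(A.size):              # one probe B = e_i per element of A
        B = np.zeros_like(A)
        B.flat[i] = 1.0
        grad.flat[i] = (f(A + eps * B) - fA) / eps
    return grad

# Sanity check against a function whose gradient is known: f(A) = w . A
# has gradient w. In the adversarial setting f would instead be the
# classifier's score for the class of interest.
w = np.array([0.5, -2.0, 3.0])
f = lambda A: float(w @ A)
A = np.array([1.0, 2.0, 3.0])
g = numerical_gradient(f, A)
```

For a 28x28 image this already means 784 queries per gradient, which is why the approach is called needlessly expensive above.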
What you are trying to do is maximize the difficulty the algorithm has in recognizing the image. That is, for a given algorithm f you want to maximize some appropriate measure of how poorly the algorithm recognizes each of your images A. There is a plethora of methods for this. I'm not too familiar with them, but a talk I saw recently had some interesting material on it (https://wsc.project.cwi.nl/woudschoten-conferences/2016-woudschoten-conference/PRtalk1.pdf, see page 24 and onwards). Computing the whole gradient is usually far too expensive when the input is high dimensional. Instead you modify a randomly chosen coordinate and take many (many) small, cheap steps, each more or less in the right direction, rather than going for somehow-optimal but large, expensive steps.
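A minimal sketch of that random-coordinate idea, with a hidden linear scorer standing in for the black-box classifier (the hidden parameters w are made up; the "attacker" only ever calls score):

```python
import numpy as np

# Sketch of the random-coordinate idea: perturb one randomly chosen pixel at
# a time and keep the change only if the black-box score improves.
rng = np.random.default_rng(1)
w = rng.normal(size=100)                 # hidden model parameters
score = lambda x: float(w @ x)           # the only thing the attacker sees

x0 = rng.uniform(size=100)               # starting "image"
s0 = score(x0)
x, best = x0.copy(), s0
step = 0.05
for _ in range(2000):                    # many small, cheap steps
    i = rng.integers(x.size)             # pick a random coordinate
    for delta in (step, -step):          # try nudging it both ways
        trial = x.copy()
        trial[i] = np.clip(trial[i] + delta, 0.0, 1.0)
        s = score(trial)
        if s > best:                     # keep only improving moves
            x, best = trial, s
            break
```

Each iteration costs at most two queries, as opposed to one query per pixel for a full numerical gradient.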
If you know the model in full and it is possible to write it explicitly as v = f(A), then you can compute the gradient of the function f. This would be the case if the algorithm you're trying to beat is a linear regression, possibly with multiple layers. The form of the gradient should be easier for you to figure out than for me to write down here.

With this gradient available, and fairly cheap to evaluate for different images A, you can proceed with, for example, a steepest-descent (or ascent) approach to making the image less recognizable to the algorithm.
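For instance, if the model known in full is a linear softmax classifier v = softmax(W @ A + b), the gradient of the target class's log-probability with respect to the pixels is W[t] - sum_c p_c * W[c], so each ascent step is cheap. A sketch (W, b and the target class t are made up here):

```python
import numpy as np

# Sketch of the explicit-gradient case: a linear softmax model known in full.
rng = np.random.default_rng(2)
W = rng.normal(size=(10, 64))
b = rng.normal(size=10)
t = 6                                    # class we push the "image" toward

def probs(A):
    z = W @ A + b
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

A = rng.uniform(size=64)                 # starting "image"
p_start = probs(A)[t]

# Steepest ascent on log p_t, using grad = d log p_t / dA = W[t] - p @ W.
lr = 0.01
for _ in range(500):
    p = probs(A)
    grad = W[t] - p @ W
    A = np.clip(A + lr * grad, 0.0, 1.0) # stay inside the valid pixel range
p_end = probs(A)[t]
```

The clip keeps the perturbed image a valid image; without it the ascent would happily produce out-of-range pixel values.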
It's probably best not to forget that your approach should not render the image illegible to humans as well; that would make the whole exercise rather pointless.