
Can I use `tf.nn.dropout` to implement DropConnect?

I think I grasp the basics of Dropout and how to implement it with the TensorFlow API. But the normalization tied to the dropout probability in tf.nn.dropout does not seem to be part of DropConnect. Is that correct? If so, does the normalization do any "harm", or can I simply apply tf.nn.dropout to my weights to implement DropConnect?

asked Jun 04 '17 14:06 by orome



1 Answer

Answer

Yes, you can use tf.nn.dropout to implement DropConnect: wrap your weight matrix in tf.nn.dropout instead of wrapping the result of the matrix multiplication. tf.nn.dropout scales the elements it keeps by 1/keep_prob, so you can undo that scaling by multiplying the result by keep_prob, like so:

dropConnect = tf.nn.dropout( m1, keep_prob ) * keep_prob
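
To see why the multiplication by keep_prob restores the original weights, here is a minimal plain-Python sketch. It does not call TensorFlow; `inverted_dropout` is a hypothetical stand-in that assumes the documented behavior of tf.nn.dropout (keep each element with probability keep_prob, scale survivors by 1/keep_prob, zero the rest), and the weight values are made up for illustration.

```python
import random

def inverted_dropout(values, keep_prob, rng):
    """Hypothetical stand-in for tf.nn.dropout on a flat list of floats.

    Each element is kept with probability keep_prob and scaled by
    1/keep_prob; every other element is set to 0.
    """
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)            # seeded so repeated runs give the same mask
weights = [0.2, 0.4, 0.6, 0.8]    # illustrative weights, not from the answer
keep_prob = 0.5

dropped = inverted_dropout(weights, keep_prob, rng)  # survivors are scaled up
drop_connect = [d * keep_prob for d in dropped]      # scaling undone

for w, d, c in zip(weights, dropped, drop_connect):
    print(w, d, c)
```

Each surviving entry of `drop_connect` equals the original weight and each dropped entry is 0, which is the plain 0/1 masking that DropConnect applies to the weights.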

Code Example

Here is a code example that learns the XOR function using DropConnect. I've also included, commented out, the code for regular dropout, which you can swap in to compare the output.

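### imports
# Written for TensorFlow 1.x (uses tf.placeholder and tf.Session).
# Under TensorFlow 2.x, use "import tensorflow.compat.v1 as tf"
# followed by "tf.disable_v2_behavior()" instead.
import tensorflow as tf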

### constant data
x  = [[0.,0.],[1.,1.],[1.,0.],[0.,1.]]
y_ = [[1.,0.],[1.,0.],[0.,1.],[0.,1.]]

### induction

# Layer 0 = the x2 inputs
x0 = tf.constant( x  , dtype=tf.float32 )
y0 = tf.constant( y_ , dtype=tf.float32 )

keep_prob = tf.placeholder( dtype=tf.float32 )

# Layer 1 = the 2x12 hidden sigmoid
m1 = tf.Variable( tf.random_uniform( [2,12] , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
b1 = tf.Variable( tf.random_uniform( [12]   , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))


########## DROP CONNECT
# - use this to perform the "DropConnect" flavor of dropout
dropConnect = tf.nn.dropout( m1, keep_prob ) * keep_prob
h1 = tf.sigmoid( tf.matmul( x0, dropConnect ) + b1 ) 

########## DROP OUT
# - uncomment this to use "regular" dropout
#h1 = tf.nn.dropout( tf.sigmoid( tf.matmul( x0,m1 ) + b1 ) , keep_prob )


# Layer 2 = the 12x2 softmax output
m2 = tf.Variable( tf.random_uniform( [12,2] , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
b2 = tf.Variable( tf.random_uniform( [2]   , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
y_out = tf.nn.softmax( tf.matmul( h1,m2 ) + b2 )


# loss : sum of the squares of y0 - y_out
loss = tf.reduce_sum( tf.square( y0 - y_out ) )

# training step : discovered learning rate of 1e-2 through experimentation
train = tf.train.AdamOptimizer(1e-2).minimize(loss)

### training
# run 5000 times using all the X and Y
# print out the loss and any other interesting info
with tf.Session() as sess:
  # tf.initialize_all_variables() is deprecated; this is its replacement
  sess.run( tf.global_variables_initializer() )
  print("\nloss")
  for step in range(5000) :
    # train with keep_prob 0.5, so about half the weights are dropped
    sess.run(train,feed_dict={keep_prob:0.5})
    if (step + 1) % 100 == 0 :
      # evaluate with nothing dropped
      print(sess.run(loss,feed_dict={keep_prob:1.}))


  results = sess.run([m1,b1,m2,b2,y_out,loss],feed_dict={keep_prob:1.})
  labels  = "m1,b1,m2,b2,y_out,loss".split(",")
  for label,result in zip(labels,results) :
    print("")
    print(label)
    print(result)

print("")

Output

Both flavors are able to separate the inputs into the correct outputs.

y_out
[[  7.05891490e-01   2.94108540e-01]
 [  9.99605477e-01   3.94574134e-04]
 [  4.99370173e-02   9.50062990e-01]
 [  4.39682379e-02   9.56031740e-01]]

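Here you can see that the DropConnect output matches the targets in y_: the first two rows put most of their probability in the first column and the last two rows in the second, so reading the first column gives true, true, false, false.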
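
As a quick check of that reading, the sketch below takes the y_out values printed above and the y_ targets from the code example and compares the index of the largest entry in each row. The names `predicted` and `expected` are introduced here for illustration only.

```python
# y_out as printed in the output above
y_out = [[7.05891490e-01, 2.94108540e-01],
         [9.99605477e-01, 3.94574134e-04],
         [4.99370173e-02, 9.50062990e-01],
         [4.39682379e-02, 9.56031740e-01]]

# targets from the code example
y_ = [[1., 0.], [1., 0.], [0., 1.], [0., 1.]]

# index of the largest entry in each row (0 = first column, 1 = second column)
predicted = [row.index(max(row)) for row in y_out]
expected  = [row.index(max(row)) for row in y_]

print(predicted)
print(predicted == expected)
```

The first row is the least confident of the four (about 0.71 for the first column), but it still falls on the correct side.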

answered Sep 21 '22 06:09 by Anton Codes
