
What is the numpy equivalent of TensorFlow Xavier initializer for CNN?

I would like to re-create the Xavier initialization in NumPy (using basic functions) in the same way that TensorFlow2 does for CNN. Here is how I learned to do Xavier initialization in NumPy:

# weights.shape = (2,2)
np.random.seed(0)
nodes_in = 2*2
weights = np.random.rand(2,2) * np.sqrt(1/nodes_in)

>>>array([[0.27440675, 0.35759468],
          [0.30138169, 0.27244159]])

This is the way I learned Xavier initialization for the logistic regression model. It seems that for a Convolutional Neural Network it should be different, but I don't know how.

initializer = tf.initializers.GlorotUniform(seed=0)
tf.Variable(initializer(shape=[2,2],dtype=tf.float32))

>>><tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
   array([[-0.7078647 ,  0.50461936],
          [ 0.73500216,  0.6633029 ]], dtype=float32)>

I'm confused by the TensorFlow documentation where it explains "fan_in" and "fan_out". I'm guessing this is where the problem is. Can somebody dumb it down for me, please?

Much appreciated!

[UPDATE]:

When I follow the tf.keras.initializers.GlorotUniform documentation I still don't come to the same results:

# weights.shape = (2,2)
np.random.seed(0)
fan_in = 2*2
fan_out = 2*2
limit = np.sqrt(6/(fan_in + fan_out))
np.random.uniform(-limit,limit,size=(2,2))
>>>array([[0.08454747, 0.37271892],
          [0.17799139, 0.07773995]])
asked Dec 31 '22 by Jek Denys


1 Answer

Using TensorFlow

initializer = tf.initializers.GlorotUniform(seed=0)
tf.Variable(initializer(shape=[2,2],dtype=tf.float32))
<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-0.7078647 ,  0.50461936],
       [ 0.73500216,  0.6633029 ]], dtype=float32)>

Same logic in NumPy

import math
import numpy as np

np.random.seed(0)
# For a (2, 2) weight matrix: fan_in = fan_out = 2, so fan_avg = (2 + 2) / 2
scale = 1. / max(1., (2. + 2.) / 2.)
limit = math.sqrt(3.0 * scale)  # sqrt(3 * 0.5) ≈ 1.2247
weights = np.random.uniform(-limit, limit, size=(2, 2))
print(weights)
array([[0.11956818, 0.52710415],
       [0.25171784, 0.1099409 ]])

If you observe, the above two results are not the same, because the random number generators differ. Internally, TensorFlow uses a stateless random generator, as shown below, and with it we get the same output:

tf.random.stateless_uniform(shape=(2,2),seed=[0, 0], minval=-limit, maxval=limit)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-0.7078647 ,  0.50461936],
       [ 0.73500216,  0.6633029 ]], dtype=float32)>

If you need to know more about the internal implementation, you can check https://github.com/tensorflow/tensorflow/blob/2b96f3662bd776e277f86997659e61046b56c315/tensorflow/python/ops/init_ops_v2.py#L525
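
To address the CNN part of the question: for a convolutional kernel, TF computes the fans from the receptive field, so fan_in = in_channels * kernel_height * kernel_width and fan_out = out_channels * kernel_height * kernel_width. Here is a minimal NumPy sketch (the helper name is mine, and as explained above, NumPy's generator will not reproduce TF's exact values):

```python
import numpy as np

def glorot_uniform_conv(kh, kw, in_ch, out_ch, seed=0):
    """Glorot-uniform init for a Conv2D kernel of shape (kh, kw, in_ch, out_ch).

    Hypothetical helper; mirrors how TF derives fan_in/fan_out for conv
    kernels, but uses NumPy's RNG, so the values differ from TF's.
    """
    receptive_field_size = kh * kw
    fan_in = in_ch * receptive_field_size    # inputs feeding each output unit
    fan_out = out_ch * receptive_field_size  # outputs fed by each input unit
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.RandomState(seed)
    return rng.uniform(-limit, limit, size=(kh, kw, in_ch, out_ch))

# A 3x3 kernel mapping 16 channels to 32:
# fan_in = 3*3*16 = 144, fan_out = 3*3*32 = 288
w = glorot_uniform_conv(3, 3, 16, 32)
print(w.shape)  # (3, 3, 16, 32)
```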

answered May 16 '23 by Uday