I would like to re-create Xavier initialization in NumPy (using basic functions) the same way TensorFlow 2 does it for a CNN. Here is how I learned to do Xavier initialization in NumPy:
import numpy as np
# weights.shape = (2,2)
np.random.seed(0)
nodes_in = 2*2
weights = np.random.rand(2,2) * np.sqrt(1/nodes_in)
>>>array([[0.27440675, 0.35759468],
[0.30138169, 0.27244159]])
This is the way I learned Xavier initialization for a logistic regression model. It seems it should be different for a convolutional neural network, but I don't know how. This is what TensorFlow gives me:
import tensorflow as tf
initializer = tf.initializers.GlorotUniform(seed=0)
tf.Variable(initializer(shape=[2,2],dtype=tf.float32))
>>><tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-0.7078647 , 0.50461936],
[ 0.73500216, 0.6633029 ]], dtype=float32)>
I'm confused by how the TensorFlow documentation explains "fan_in" and "fan_out". I'm guessing this is where the problem is. Can somebody dumb it down for me, please?
Much appreciated!
[UPDATE]:
When I follow the tf.keras.initializers.GlorotUniform documentation, I still don't get the same results:
# weights.shape = (2,2)
np.random.seed(0)
fan_in = 2*2
fan_out = 2*2
limit = np.sqrt(6/(fan_in + fan_out))
np.random.uniform(-limit,limit,size=(2,2))
>>>array([[0.08454747, 0.37271892],
[0.17799139, 0.07773995]])
Using TensorFlow:
initializer = tf.initializers.GlorotUniform(seed=0)
tf.Variable(initializer(shape=[2,2],dtype=tf.float32))
<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[-0.7078647 , 0.50461936],
[ 0.73500216, 0.6633029 ]], dtype=float32)>
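The same logic in NumPy:
import math
import numpy as np
np.random.seed(0)
# shape (2, 2) gives fan_in = 2 and fan_out = 2
scale = 1/max(1., (2+2)/2.)
# Glorot uniform samples from [-limit, limit] with limit = sqrt(3 * scale)
limit = math.sqrt(3.0 * scale)
weights = np.random.uniform(-limit, limit, size=(2,2))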
print(weights)
array([[0.11956818, 0.52710415],
[0.25171784, 0.1099409 ]])
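As you can see, the two results differ even though the limit is the same, because NumPy and TensorFlow use different random number generators. Internally, TensorFlow uses a stateless random generator. Calling it directly with the same seed and the limit computed above gives the same output as the initializer: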
tf.random.stateless_uniform(shape=(2,2),seed=[0, 0], minval=-limit, maxval=limit)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-0.7078647 , 0.50461936],
[ 0.73500216, 0.6633029 ]], dtype=float32)>
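Because the exact values depend on the generator, a comparison between NumPy and TensorFlow has to be statistical rather than element by element. The sketch below uses NumPy only; the (256, 128) shape and the use of `default_rng` are my own illustrative choices, not something from the question. It checks that the samples stay inside ±limit and that their variance is close to limit²/3 = 2/(fan_in + fan_out), which is the variance of a uniform distribution on that interval.

```python
import numpy as np

# Illustrative dense-layer shape (an assumption, chosen only to get enough samples)
fan_in, fan_out = 256, 128
limit = np.sqrt(6.0 / (fan_in + fan_out))

# Any generator works here; the values will not match TensorFlow's
rng = np.random.default_rng(0)
weights = rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Variance of U(-limit, limit) is limit**2 / 3, i.e. 2 / (fan_in + fan_out)
expected_var = 2.0 / (fan_in + fan_out)

print("all within limit:", bool(np.abs(weights).max() <= limit))
print("sample variance:", weights.var(), "expected:", expected_var)
```

If both frameworks pass this kind of check for the same shape, they implement the same initializer even though the individual numbers differ.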
If you want to know more about the internal implementation, see https://github.com/tensorflow/tensorflow/blob/2b96f3662bd776e277f86997659e61046b56c315/tensorflow/python/ops/init_ops_v2.py#L525
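For the CNN case in the question, the part that changes is how fan_in and fan_out are derived from the kernel shape. Here is a minimal NumPy sketch, assuming the Keras kernel layout (kernel_height, kernel_width, in_channels, out_channels) and the fan rule used by TensorFlow's variance-scaling initializers, where the receptive field size is multiplied by the input or output channel count. `compute_fans` and `glorot_uniform` are hypothetical helper names for this example, and the sampled values will not match TensorFlow's because the generator is different.

```python
import numpy as np

def compute_fans(shape):
    """Return (fan_in, fan_out) for a weight shape (channels-last layout assumed)."""
    if len(shape) < 1:
        return 1, 1
    if len(shape) == 1:
        return shape[0], shape[0]
    if len(shape) == 2:
        # Dense layer: (inputs, outputs)
        return shape[0], shape[1]
    # Conv kernel: all leading dimensions form the receptive field
    receptive_field = int(np.prod(shape[:-2]))
    return shape[-2] * receptive_field, shape[-1] * receptive_field

def glorot_uniform(shape, rng):
    """Sample from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    fan_in, fan_out = compute_fans(shape)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

rng = np.random.default_rng(0)
print(compute_fans((2, 2)))          # (2, 2)
print(compute_fans((3, 3, 16, 32)))  # (144, 288)
kernel = glorot_uniform((3, 3, 16, 32), rng)
print(kernel.shape)                  # (3, 3, 16, 32)
```

For the (2, 2) shape this gives fan_in = fan_out = 2, so the limit is sqrt(6/4) ≈ 1.2247, which matches the `limit` computed in the answer above rather than the 2*2 used in the update to the question.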