I can create a model using pre-built high-level functions like FullyConnected. For example:
X = mx.sym.Variable('data')
P = mx.sym.FullyConnected(data = X, name = 'fc1', num_hidden = 2)
This gives me a symbolic variable P that depends on the symbolic variable X. In other words, I have a computational graph that can be used to define a model and to run operations such as fit and predict.
Now I would like to express P through X in a different way. Specifically, instead of using high-level functionality (like FullyConnected), I would like to specify the relation between P and X explicitly, using low-level tensor operations (like matrix multiplication) and symbolic variables representing model parameters (like a weight matrix).
For example, to achieve the same result as above, I tried the following:
W = mx.sym.Variable('W')
B = mx.sym.Variable('B')
P = mx.sym.broadcast_plus(mx.sym.dot(X, W), B)
However, P obtained this way is not equivalent to the P obtained earlier, and I cannot use it the same way. In particular, as far as I understand, MXNet complains that W and B do not have values (which makes sense).
I have also tried declaring W and B in another way (so that they do have values):
w = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([7.0, 8.0])
W = mx.nd.array(w)
B = mx.nd.array(b)
This does not work either. I guess MXNet complains because it expects symbolic variables but gets NDArrays instead.
So, my question is how to build a model using low-level tensor operations (like matrix multiplication) and explicit objects representing model parameters (like weight matrices).
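To make the intended computation concrete, here is a minimal sketch in plain Python of what dot(X, W) + B evaluates to with the W and B values above. It uses nested lists as a stand-in for tensors, so it is not MXNet code. The input batch X and the helper name dense_forward are made up for illustration. Note that W here is laid out as (inputs, outputs), whereas FullyConnected stores its weight as (num_hidden, inputs) and multiplies by its transpose.

```python
# Plain-Python stand-in for dot(X, W) + B; no MXNet required.
# W and B are the values from the question; X is a hypothetical batch.

def dense_forward(X, W, B):
    """Return X @ W + B, with B broadcast across the rows of X."""
    n_in, n_out = len(W), len(W[0])
    out = []
    for row in X:
        assert len(row) == n_in, "each input row needs one value per row of W"
        out.append([sum(row[k] * W[k][j] for k in range(n_in)) + B[j]
                    for j in range(n_out)])
    return out

W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # shape (3, 2): 3 inputs, 2 outputs
B = [7.0, 8.0]                             # shape (2,)
X = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]     # hypothetical batch, shape (2, 3)

P = dense_forward(X, W, B)
print(P)  # [[8.0, 10.0], [16.0, 20.0]]
```

Any symbolic or NDArray version has to reproduce this arithmetic once W and B are supplied as parameters of shape (3, 2) and (2,).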
You might want to take a look at the Gluon API. For example, here is a guide to building an MLP from scratch, including allocating the parameters:
#######################
# Allocate parameters for the first hidden layer
#######################
W1 = nd.random_normal(shape=(num_inputs, num_hidden), scale=weight_scale, ctx=model_ctx)
b1 = nd.random_normal(shape=num_hidden, scale=weight_scale, ctx=model_ctx)
params = [W1, b1, ...]
Attach gradient buffers to them so that autograd can compute their gradients:
for param in params:
    param.attach_grad()
Define the model:
def net(X):
    #######################
    # Compute the first hidden layer
    #######################
    h1_linear = nd.dot(X, W1) + b1
    ...
Then run the training loop:
epochs = 10
learning_rate = .001
smoothing_constant = .01
for e in range(epochs):
    ...
    for i, (data, label) in enumerate(train_data):
        data = data.as_in_context(model_ctx).reshape((-1, 784))
        label = label.as_in_context(model_ctx)
        ...
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label_one_hot)
        loss.backward()
        SGD(params, learning_rate)
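The SGD helper called in the loop is not defined in this excerpt. Assuming it applies the usual update param <- param - lr * grad to every parameter, the idea can be sketched in plain Python without MXNet. The sketch below is an illustration under that assumption, not the guide's code. It swaps in a single linear layer, a squared-error loss, hand-derived gradients in place of autograd, and full-batch updates rather than minibatches. All names and data in it are hypothetical.

```python
# Hand-rolled gradient descent on explicit parameters; plain Python, no MXNet.
# The data, learning rate, and squared-error loss are made up for illustration.

def predict(x, w, b):
    return sum(xi * wi for xi, wi in zip(x, w)) + b

def loss_and_grads(data, w, b):
    """Mean squared error and its gradients with respect to w and b."""
    n = len(data)
    loss, grad_w, grad_b = 0.0, [0.0] * len(w), 0.0
    for x, y in data:
        err = predict(x, w, b) - y
        loss += err * err / n
        for k, xk in enumerate(x):
            grad_w[k] += 2.0 * err * xk / n
        grad_b += 2.0 * err / n
    return loss, grad_w, grad_b

def sgd_step(w, b, grad_w, grad_b, lr):
    """Apply param <- param - lr * grad to every parameter."""
    return [wk - lr * gk for wk, gk in zip(w, grad_w)], b - lr * grad_b

# Targets follow y = 2*x0 - x1 + 0.5 (hypothetical ground truth).
data = [([0.0, 0.0], 0.5), ([1.0, 0.0], 2.5),
        ([0.0, 1.0], -0.5), ([1.0, 1.0], 1.5)]
w, b = [0.0, 0.0], 0.0
first_loss, _, _ = loss_and_grads(data, w, b)
for _ in range(500):
    _, grad_w, grad_b = loss_and_grads(data, w, b)
    w, b = sgd_step(w, b, grad_w, grad_b, lr=0.1)
final_loss, _, _ = loss_and_grads(data, w, b)
print(first_loss, final_loss)
```

With NDArrays, autograd.record() and loss.backward() take the place of loss_and_grads, and the update would presumably be applied in place to each entry of params.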
You can see the full example in The Straight Dope:
http://gluon.mxnet.io/chapter03_deep-neural-networks/mlp-scratch.html