I can create a model using pre-built high-level functions such as FullyConnected. For example:
X = mx.sym.Variable('data')
P = mx.sym.FullyConnected(data = X, name = 'fc1', num_hidden = 2)
This gives me a symbolic variable P that depends on the symbolic variable X. In other words, I have a computational graph that can be used to define a model and to run operations such as fit and predict.
Now I would like to express P through X in a different way. Instead of using high-level functionality (like FullyConnected), I would like to specify the relation between P and X "explicitly", using low-level tensor operations (like matrix multiplication) and symbolic variables representing model parameters (like a weight matrix).
For example, to achieve the same as above, I have tried the following:
W = mx.sym.Variable('W')
B = mx.sym.Variable('B')
P = mx.sym.broadcast_plus(mx.sym.dot(X, W), B)
However, P obtained this way is not equivalent to the P obtained earlier, and I cannot use it the same way. In particular, as far as I understand, MXNet complains that W and B do not have values (which makes sense).
I have also tried to declare W and B in another way (so that they do have values):
w = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([7.0, 8.0])
W = mx.nd.array(w)
B = mx.nd.array(b)
This does not work either. I guess MXNet complains because it expects symbolic variables but gets NDArrays instead.
So my question is: how do I build a model using low-level tensor operations (like matrix multiplication) and explicit objects representing model parameters (like weight matrices)?
You might want to take a look at the Gluon API. For example, here is a guide to building an MLP from scratch, including allocating the parameters:
from mxnet import nd, autograd

#######################
# Allocate parameters for the first hidden layer
#######################
W1 = nd.random_normal(shape=(num_inputs, num_hidden), scale=weight_scale, ctx=model_ctx)
b1 = nd.random_normal(shape=num_hidden, scale=weight_scale, ctx=model_ctx)
params = [W1, b1, ...]
Attach gradient buffers to the parameters so that autograd can track them:
for param in params:
    param.attach_grad()
Define the model:
def net(X):
    #######################
    # Compute the first hidden layer
    #######################
    h1_linear = nd.dot(X, W1) + b1
    ...
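To make the shapes in that first layer concrete, here is a minimal sketch of the same affine map written in plain NumPy rather than MXNet. It reuses the weight and bias values from the question; the input batch `X` and the helper name `linear` are made up for illustration. Note the layout assumption: the weight here has shape (num_inputs, num_hidden), whereas, as far as I know, FullyConnected stores its weight as (num_hidden, num_inputs) and multiplies by the transpose.

```python
import numpy as np

# Parameter values taken from the question: 3 inputs, 2 hidden units.
W = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (num_inputs, num_hidden)
b = np.array([7.0, 8.0])                            # shape (num_hidden,)

def linear(X, W, b):
    """Affine map X @ W + b; b is broadcast across the batch dimension."""
    return np.dot(X, W) + b

# Hypothetical batch of two samples, used only for illustration.
X = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 0.0]])

out = linear(X, W, b)
print(out.shape)  # (2, 2)
print(out)        # rows: [16, 20] and [10, 12]
```

With NDArrays the expression has the same form, `nd.dot(X, W1) + b1`, and the bias is broadcast over the batch in the same way.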
Then run the training loop:
epochs = 10
learning_rate = .001
smoothing_constant = .01
for e in range(epochs):
    ...
    for i, (data, label) in enumerate(train_data):
        data = data.as_in_context(model_ctx).reshape((-1, 784))
        label = label.as_in_context(model_ctx)
        ...
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label_one_hot)
        loss.backward()
        SGD(params, learning_rate)
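The loop above relies on `loss.backward()` to fill in gradients and on an `SGD` helper that is not shown in this excerpt. As a rough illustration of what those two steps amount to, here is a NumPy-only sketch for a single linear layer. It is not the guide's code: NumPy has no autograd, so the gradients are derived by hand for a mean-squared-error loss, each step uses the full batch rather than minibatches, and the names `sgd` and `linear_mse_grads` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def sgd(params, grads, lr):
    """In-place vanilla SGD update: param <- param - lr * grad."""
    for p, g in zip(params, grads):
        p -= lr * g

def linear_mse_grads(X, y, W, b):
    """Loss and hand-derived gradients for out = X @ W + b under squared error."""
    err = X @ W + b - y
    n = X.shape[0]
    loss = (err ** 2).sum() / n           # averaged over samples
    grad_W = 2.0 * X.T @ err / n          # d loss / d W
    grad_b = 2.0 * err.sum(axis=0) / n    # d loss / d b
    return loss, grad_W, grad_b

# Synthetic regression data generated from known parameters (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
true_W = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
true_b = np.array([7.0, 8.0])
y = X @ true_W + true_b

# Explicit parameter objects, updated by the training loop.
W = np.zeros((3, 2))
b = np.zeros(2)
losses = []
for epoch in range(200):
    loss, grad_W, grad_b = linear_mse_grads(X, y, W, b)
    losses.append(loss)
    sgd([W, b], [grad_W, grad_b], lr=0.1)

print(losses[0], losses[-1])  # the loss should shrink substantially
```

In the MXNet version, `attach_grad()` plus `autograd.record()` and `loss.backward()` take the place of `linear_mse_grads`, and the gradients are read from each parameter's `.grad` attribute.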
You can see the full example in The Straight Dope:
http://gluon.mxnet.io/chapter03_deep-neural-networks/mlp-scratch.html