
keras: extracting weights using get_weights function


I would like to extract the weights of a 1D CNN layer and understand exactly how the prediction values are computed. I am not able to reproduce the prediction values using the weights returned by the get_weights() function.

In order to explain my understanding, here is a small data set.

import numpy as np

sample_size = 1000  # number of samples; matches the y_pred shape reported below
n_filter = 64
kernel_size = 10
len_timeseries = 123
n_feature = 3
X = np.random.random((sample_size, len_timeseries, n_feature))
# the target has the output shape of an unpadded Conv1D layer
y = np.random.random((sample_size, len_timeseries - kernel_size + 1, n_filter))

Now, create a simple 1d CNN model as:

from keras.models import Sequential
from keras.layers import Conv1D

model = Sequential()
model.add(Conv1D(n_filter, kernel_size,
                 input_shape=(len_timeseries, n_feature)))
model.compile(loss="mse", optimizer="adam")

Fit the model and predict the values of X as:

model.fit(X,y,nb_epoch=1)
y_pred = model.predict(X)

The shape of y_pred is (1000, 114, 64), as expected.

Now, I want to reproduce the value of y_pred[irow,0,ifilter] using the weights stored in model.layers. As there is only a single layer, len(model.layers)=1. So I extract the weights from the first and only layer as:

weight = model.layers[0].get_weights()
print(len(weight))
> 2 
weight0 = np.array(weight[0])
print(weight0.shape)
> (10, 1, 3, 64)
weight1 = np.array(weight[1])
print(weight1.shape)
> (64,)

The weight list has length 2, and I assume that position 0 contains the weights for the features and position 1 contains the bias. As weight0.shape=(kernel_size,1,n_feature,n_filter), I thought that I could obtain the value of y_pred[irow,0,ifilter] by:

ifilter = 0
irow = 0
y_pred_by_hand = weight1[ifilter] + np.sum( weight0[:,0,:,ifilter] * X[irow,:kernel_size,:])
y_pred_by_hand
> 0.5124888777

However, this value is quite different from y_pred[irow,0,ifilter] as:

 y_pred[irow,0,ifilter]
 >0.408206
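For reference, here is a minimal NumPy sketch of the computation the hand calculation above assumes: an unpadded, stride-1 cross-correlation with a bias and no activation. That is an assumption about what the layer does, not something confirmed for any particular Keras backend; a backend that applies a true convolution would instead use the kernel reversed along its first axis. The name `conv1d_valid` and the random inputs are made up for illustration.

```python
import numpy as np

def conv1d_valid(x, kernel, bias):
    """Unpadded, stride-1 cross-correlation of a single sample.

    x:      (len_timeseries, n_feature)
    kernel: (kernel_size, n_feature, n_filter)
    bias:   (n_filter,)
    returns (len_timeseries - kernel_size + 1, n_filter)
    """
    kernel_size = kernel.shape[0]
    n_out = x.shape[0] - kernel_size + 1
    out = np.empty((n_out, kernel.shape[2]))
    for t in range(n_out):
        window = x[t:t + kernel_size, :]  # (kernel_size, n_feature)
        # sum over kernel position and feature, one value per filter
        out[t] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + bias
    return out

# Illustrative random data with the same sizes as in the question
rng = np.random.default_rng(0)
x = rng.random((123, 3))
kernel = rng.random((10, 3, 64))
bias = rng.random(64)

out = conv1d_valid(x, kernel, bias)
print(out.shape)  # (114, 64)
```

With the weights from the question, the kernel argument would be weight0[:,0,:,:] and the bias weight1; out[0, ifilter] is then the same quantity as y_pred_by_hand.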

Please let me know where I went wrong.

FairyOnIce asked Jul 14 '17 02:07



1 Answer

You have misunderstood the weights attribute here. What you are looking for is the layer's output, which is what model.predict returns; the corresponding tensor is available as layer.output. A layer takes an input tensor and applies its weights to it in a way that depends on the type of layer. The result of that computation is the output tensor, and that is what you are looking for.

For example, consider a simple Dense layer with a sigmoid activation that takes an input tensor A of shape (1,3) and emits an output tensor B of shape (1,1), using a weight matrix W. The shape of W is determined by the input and output shapes. In this case the Dense layer computes A matmul W, and the result (after the sigmoid) is the prediction B. W must have shape (3,1), since that is the only shape that yields an output of shape (1,1). So what you are looking for is B, but you are trying to access W.
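The shapes in this example can be checked with a few lines of NumPy. This is only a sketch: the numeric values of A and W are made up, and the zero bias term is an added assumption (the description above mentions only A matmul W).

```python
import numpy as np

def sigmoid(z):
    # logistic function, applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

A = np.array([[0.2, 0.5, 0.1]])        # input tensor, shape (1, 3)
W = np.array([[0.4], [-0.3], [0.8]])   # weight matrix, shape (3, 1)
b = np.zeros(1)                        # assumed bias, shape (1,)

B = sigmoid(A @ W + b)                 # prediction, shape (1, 1)
print(A.shape, W.shape, B.shape)
```

Here W plays the role of what get_weights() returns, while B plays the role of what model.predict returns.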

scarecrow answered Oct 01 '22 03:10
