Input dimensions to a one-dimensional convolutional network in Keras

I'm really finding it hard to understand the input dimensions of the convolutional 1D layer in Keras. The docs say:

Input shape

3D tensor with shape: (samples, steps, input_dim).

Output shape

3D tensor with shape: (samples, new_steps, nb_filter). steps value might have changed due to padding.

I want my network to take in a time series of prices (101, in order) and output 4 probabilities. My current non-convolutional network which does this fairly well (with a training set of 28000) looks like this:

standardModel = Sequential()
standardModel.add(Dense(input_dim=101, output_dim=100, W_regularizer=l2(0.5), activation='sigmoid'))
standardModel.add(Dense(4, W_regularizer=l2(0.7), activation='softmax'))

To improve this, I want to make a feature map from the input layer which has a local receptive field of length 10 (and therefore has 10 shared weights and 1 shared bias). I then want to use max pooling, feed this into a hidden layer of 40 or so neurons, and then output with 4 softmax neurons in the outer layer.

(picture omitted; it's quite awful, sorry!)

So ideally, the convolutional layer would take a 2D tensor of dimensions:

(minibatch_size, 101)

and output a 3D tensor of dimensions

(minibatch_size, 91, no_of_featuremaps)

However, the Keras layer seems to require a dimension in the input called steps. I've tried to understand this and still don't quite get it. In my case, should steps be 1, since each step in the vector is an increase in time by 1? Also, what is new_steps?

In addition, how do you turn the output of the pooling layers (a 3D tensor) into input suitable for the standard hidden layer (i.e. a Dense Keras layer) in the form of a 2D tensor?

Update: After the very helpful suggestions given, I tried making a convolutional network like so:

conv = Sequential()
conv.add(Convolution1D(64, 10, input_shape=(1, 101)))
conv.add(Activation('relu'))
conv.add(MaxPooling1D(2))
conv.add(Flatten())
conv.add(Dense(10))
conv.add(Activation('tanh'))
conv.add(Dense(4))
conv.add(Activation('softmax'))

The line conv.add(Flatten()) throws a "range exceeds valid bounds" error. Interestingly, this error is not thrown for just this code:

conv = Sequential()
conv.add(Convolution1D(64, 10, input_shape=(1, 101)))
conv.add(Activation('relu'))
conv.add(MaxPooling1D(2))
conv.add(Flatten())

doing

print conv.input_shape
print conv.output_shape

results in

(None, 1, 101)
(None, -256)

being returned

Update 2:

Changed

conv.add(Convolution1D(64, 10, input_shape=(1,101))) 

to

conv.add(Convolution1D(10, 10, input_shape=(101, 1)))

and it started working. However, is there any important difference between inputting (None, 101, 1) to a 1D conv layer rather than (None, 1, 101) that I should be aware of? Why does (None, 1, 101) not work?

asked Jul 29 '16 by Nick


1 Answer

The reason it looks like this is that the Keras designers intended the 1-dimensional convolutional framework to be interpreted as a framework for dealing with sequences. To fully understand the difference, try to imagine that you have a sequence of multiple feature vectors. Then your data will be at least two-dimensional, where the first dimension is connected with time and the other dimensions are connected with features. The 1-dimensional convolutional framework was designed to, in a sense, emphasize this time dimension and try to find recurring patterns in the data, rather than performing a classical multidimensional convolutional transformation.
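For intuition, here is a minimal sketch (a toy NumPy array, not from the original post) of what such a batch of sequences looks like: the middle axis is time, and the last axis holds the features observed at each timestep:

import numpy as np

# a toy batch: 32 sequences, each with 101 timesteps and 1 feature per step
batch = np.random.rand(32, 101, 1)

# axis 0 -> samples, axis 1 -> time (steps), axis 2 -> features (input_dim)
print(batch.shape)  # (32, 101, 1)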

In your case you simply need to reshape your data to have shape (dataset_size, 101, 1), because you have only one feature. This is easily done using the numpy.reshape function. To understand what new_steps means, you must understand that you are doing the convolution over time, so you change the temporal structure of your data, which leads to a new time-connected structure. In order to get your data into a format suitable for dense / static layers, use the keras.layers.Flatten layer, the same as in the classic convolutional case.
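Putting the pieces together, here is a minimal sketch of the whole pipeline, assuming the same Keras 1.x API used in the question; the prices array and its reshape are illustrative, with layer sizes taken from the description in the question:

import numpy as np
from keras.models import Sequential
from keras.layers import Convolution1D, Activation, MaxPooling1D, Flatten, Dense

# prices: a hypothetical (28000, 101) array of raw series
# prices = prices.reshape(-1, 101, 1)   # add the trailing feature axis

conv = Sequential()
conv.add(Convolution1D(10, 10, input_shape=(101, 1)))  # 10 filters, length-10 receptive field
conv.add(Activation('relu'))
conv.add(MaxPooling1D(2))
conv.add(Flatten())               # (None, 46, 10) -> (None, 460), ready for Dense
conv.add(Dense(40))
conv.add(Activation('tanh'))
conv.add(Dense(4))
conv.add(Activation('softmax'))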

UPDATE: As I mentioned before, the first dimension of the input (after the batch dimension) is connected with time. So the difference between (1, 101) and (101, 1) is that in the first case you have one timestep with 101 features, and in the second, 101 timesteps with 1 feature. The problem you mentioned after your first change has its origin in applying pooling with size 2 to such an input: with only one timestep, you cannot pool over a time window of size 2, simply because there are not enough timesteps to do it.
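For concreteness, here is a back-of-the-envelope check of the shape arithmetic (using the 'valid'-padding formula; the numbers match the shapes printed in the question):

# 'valid' 1D convolution produces new_steps = steps - filter_length + 1
steps, filter_length, pool_length = 101, 10, 2
new_steps = steps - filter_length + 1    # 101 - 10 + 1 = 92
pooled_steps = new_steps // pool_length  # 92 // 2 = 46
print(new_steps)     # 92
print(pooled_steps)  # 46

# With input_shape=(1, 101) the same arithmetic gives 1 - 10 + 1 = -8 "steps";
# pooling halves that to -4, and flattening across 64 filters gives -4 * 64 = -256,
# which is exactly the (None, -256) output_shape printed in the question.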

answered Oct 06 '22 by Marcin Możejko