The first layer of my neural network looks like this:

```python
model.add(Conv1D(filters=40,
                 kernel_size=25,
                 input_shape=x_train.shape[1:],
                 activation='relu',
                 kernel_regularizer=regularizers.l2(5e-6),
                 strides=1))
```
If my input shape is `(600, 10)`, I get `(None, 576, 40)` as the output shape. If my input shape is `(6000, 1)`, I get `(None, 5976, 40)` as the output shape.

So my question is: what exactly is happening here? Is the first example simply ignoring 90% of the input?
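For reference, here is a minimal script that reproduces both shapes (assuming the standard tf.keras imports; I've replaced `x_train.shape[1:]` with the literal shapes):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D
from tensorflow.keras import regularizers

for shape in [(600, 10), (6000, 1)]:
    model = Sequential()
    model.add(Conv1D(filters=40,
                     kernel_size=25,
                     input_shape=shape,
                     activation='relu',
                     kernel_regularizer=regularizers.l2(5e-6),
                     strides=1))
    print(shape, '->', model.output_shape)
# (600, 10) -> (None, 576, 40)
# (6000, 1) -> (None, 5976, 40)
```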
It is not "ignoring" a 90% of the input, the problem is simply that if you perform a 1-dimensional convolution with a kernel of size K over an input of size X the result of the convolution will have size X - K + 1. If you want the output to have the same size as the input, then you need to extend or "pad" your data. There are several strategies for that, such as add zeros, replicate the value at the ends or wrap around. Keras' Convolution1D
has a padding
parameter that you can set to "valid"
(the default, no padding), "same"
(add zeros at both sides of the input to obtain the same output size as the input) and "causal"
(padding with zeros at one end only, idea taken from WaveNet).
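To make that concrete, here is a minimal sketch (assuming the tf.keras API) that runs the three padding modes over an input shaped like yours:

```python
import numpy as np
from tensorflow.keras.layers import Conv1D

x = np.zeros((1, 600, 10), dtype='float32')  # batch of 1, length 600, 10 channels

for pad in ['valid', 'same', 'causal']:
    y = Conv1D(filters=40, kernel_size=25, padding=pad)(x)
    print(pad, y.shape)
# valid  (1, 576, 40)  -> 600 - 25 + 1
# same   (1, 600, 40)  -> zeros added at both ends
# causal (1, 600, 40)  -> zeros added at one end only
```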
Update
About the questions in your comments: you say your input is `(600, 10)`. That, I assume, is the size of one example, and you have a batch of examples with size `(N, 600, 10)`. From the point of view of the convolution operation, this means you have `N` examples, each with a length of 600 (this "length" may be time or anything else; it is just the dimension across which the convolution works), and at each of these 600 points you have a vector of size 10. Each of these vectors is considered an atomic sample with 10 features (e.g. price, height, size, whatever), or, as they are sometimes called in the context of convolution, "channels" (from the RGB channels used in 2D image convolution).
The point is that the convolution has a kernel size and a number of output channels, which is the `filters` parameter in Keras. In your example, what the convolution does is take every possible slice of 25 contiguous 10-vectors and produce a single 40-vector for each of them (for every example in the batch, of course). So you pass from having 10 features or channels in your input to having 40 after the convolution. It is not using only one of the 10 elements in the last dimension; it is using all of them to produce the output.
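You can check this by inspecting the layer's weights: the kernel of a Keras `Conv1D` has shape `(kernel_size, input_channels, filters)`, so every filter mixes all input channels. A small sketch, assuming the tf.keras API:

```python
from tensorflow.keras.layers import Conv1D

layer = Conv1D(filters=40, kernel_size=25)
layer.build(input_shape=(None, 600, 10))  # your (600, 10) examples

kernel, bias = layer.get_weights()
print(kernel.shape, bias.shape)  # (25, 10, 40) (40,)
# 25 * 10 * 40 weights + 40 biases: all 10 input channels feed every filter.
```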
If the convolution is interpreting the dimensions of your input differently from what you intended, or if the operation it performs is not what you were expecting, you may need to either reshape your input or use a different kind of layer, as sketched below.
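For example, if your `(600, 10)` array actually holds one signal of 6000 consecutive scalar samples stored row by row (an assumption on my part; only you know your data layout), flattening it to `(6000, 1)` makes the convolution slide over all 6000 samples instead of over 600 steps of 10 parallel features:

```python
import numpy as np

x = np.arange(6000, dtype='float32').reshape(600, 10)  # stand-in for one example

# Valid only if consecutive samples really are laid out row by row:
x_seq = x.reshape(6000, 1)

# A batch for Keras would then have shape (N, 6000, 1) instead of (N, 600, 10).
```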