Evaluate a function in a sliding window with Keras

Question

I'm trying to extend a matching algorithm across a sequence. My matches are 20 units long and have 4 channels at each timepoint. I have built a model that encapsulates the matching, but I can't figure out how to apply it in a sliding window across a longer sequence to find the matches within that sequence.

I have two (20, 4) input tensors (query and target) that I concatenate, sum, flatten, and then pass through a simple dense layer. At this stage I have 100K (query, target) pairs to train with.

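from keras import backend as K
from keras.layers import Input, Lambda, Concatenate, Flatten, Dropout, Dense
from keras.models import Model

def sum_seqs(seqs):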
    return K.sum(seqs, axis=3)

def pad_dims(seq):
    return K.expand_dims(seq, axis=3)

def pad_outshape(in_shape):
    return (in_shape[0], in_shape[1], in_shape[2], 1)


query = Input((20, 4))
query_pad = Lambda(pad_dims, output_shape=pad_outshape, name='gpad')(query)

target = Input((20,4))
target_pad = Lambda(pad_dims, output_shape=pad_outshape)(target)

matching = Concatenate(axis = 3)([query_pad, target_pad])
matching = Lambda(sum_seqs)(matching)

matching = Flatten()(matching)
matching = Dropout(0.1)(matching)
matching = Dense(1, activation = 'sigmoid')(matching)

match_model = Model([query, target], matching)

This works perfectly. Now I want to use this pre-trained model to search a longer target sequence with varying query sequences.

It seems it should be something like:

long_target = Input((100, 4))

short_target = Input((20, 4))
choose_query = Input((20, 4))

spec_match = match_model([choose_query, short_target])

mdl = TimeDistributed(spec_match)(long_target)

But TimeDistributed takes a Layer not a Tensor. Is there a wrapper I'm missing? Am I going about this the wrong way? Do I need to reformulate this as a convolution problem somehow?

Continued experimentation: After a day of beating my head against the keyboard it is clear that both TimeDistributed and backend.rnn only allow you to apply a model/layer to a single time-slice of the data. It doesn't seem like there is a way to do this. It looks like the only thing that can "walk" across multiple slices of the time dimension is a Conv1D.

So, I reframed my problem as a convolution, but that doesn't work well either. I was able to build a Conv1D filter that matches a specific query. This worked reasonably well, and it did allow me to scan longer sequences and get matches. BUT each filter is unique to its query tensor, and there doesn't seem to be a way to go from a novel query to the appropriate filter weights without training a whole new Conv1D layer. Since my goal is to find new queries that match the most targets, this doesn't help much.

Since my "matching" requires the interaction of the target AND the query at each window there doesn't seem to be a way I can get an interaction of a 20-length query tensor at each window across a 100-length target tensor through Conv1D.

Is there any way to do this sliding window type evaluation in Keras/tensorflow? It seems like something so simple yet so far away. Is there a way I can do this that I'm not finding?

Responses and further experimentation.

The solutions from @today and @nuric work but they end up replicating the input target data in a tiling type fashion. So, for a query of length m there will be a little under m copies of the input data in the graph. I was hoping to find a solution that would actually "slide" the evaluation across the target without the duplication.
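
To put a number on that duplication, here is a small sketch that counts how many time steps the tiled windows hold compared with the original target. The helper name tiling_overhead is made up for this illustration, and it assumes stride-1 windows with no padding.

```python
def tiling_overhead(target_len, query_len):
    """Return (num_windows, copies) for stride-1 windows without padding.

    `copies` is the ratio of time steps stored in the tiled windows
    to time steps in the original target.
    """
    if query_len > target_len:
        raise ValueError("query must not be longer than the target")
    num_windows = target_len - query_len + 1
    copies = num_windows * query_len / target_len
    return num_windows, copies


num_windows, copies = tiling_overhead(100, 20)
print(num_windows, copies)  # 81 windows holding 16.2x the original data
```

For a length-100 target and a length-20 query the windows hold about 16 copies of the data, and the ratio approaches the query length as the target grows.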

Here's a version of the Conv1D almost solution I came up with.

from keras.models import Sequential
from keras.layers import Conv1D, Flatten

query_weights = []

for query, (targets, scores) in query_target_gen():
    single_query_model = Sequential()
    single_query_model.add(Conv1D(1, 20, input_shape = (20, 4)))
    single_query_model.add(Flatten())

    single_query_model.fit(targets, scores)

    query_weights.append(single_query_model.layers[0].get_weights())

multi_query_model_long_targets = Sequential()
multi_query_model_long_targets.add(Conv1D(len(query_weights), 20, input_shape = (100, 4)))

multi_query_model_long_targets.layers[0].set_weights(combine_weights(query_weights))

multi_query_model_long_targets.summary()

The combine_weights function just does some unpacking and matrix rearrangement to stack the filters in the way Conv1D wants.
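
The function itself is not shown above. A minimal sketch of what it might look like follows, assuming each entry of query_weights is the [kernel, bias] list returned by get_weights() on a Conv1D(1, 20) layer over 4 channels, so the kernel has shape (20, 4, 1) and the bias has shape (1,). The random weights are stand-ins, not trained values.

```python
import numpy as np


def combine_weights(query_weights):
    """Stack per-query Conv1D weights into one multi-filter weight list.

    Assumes each entry is [kernel, bias] with kernel shape (20, 4, 1)
    and bias shape (1,), one entry per single-query model.
    """
    kernels = [kernel for kernel, _ in query_weights]
    biases = [bias for _, bias in query_weights]
    # Filters live on the last kernel axis, so concatenate there.
    return [np.concatenate(kernels, axis=-1), np.concatenate(biases, axis=0)]


# Stand-in weights for three hypothetical single-query models.
rng = np.random.default_rng(0)
fake_weights = [[rng.normal(size=(20, 4, 1)), rng.normal(size=(1,))]
                for _ in range(3)]
kernel, bias = combine_weights(fake_weights)
print(kernel.shape, bias.shape)  # (20, 4, 3) (3,)
```

The returned list has the [kernel, bias] layout that set_weights expects for a Conv1D layer with one filter per query.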

This solution fixes the data duplication issue, but it hurts me in other ways. One is data-based: my data contains many (query, target) pairs, but it tends to be the same target paired with many queries, since it is easier to generate the real-world data in that orientation. So doing it this way makes the training difficult. Second, this assumes that each query works independently, when in reality I know that the query-target pairing is what actually matters. So it makes sense to use a model that can look at many examples of the pairs, not individual queries.

Is there a way to combine both methods? Is there a way to make Conv1D take the long target tensor and combine it with the constant query as it walks along the sequence?
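
One observation that may be relevant for the specific match_model shown earlier: it sums query and target and feeds the result to a single Dense unit, so (with dropout inactive at inference) the pre-sigmoid score is linear in query + target. Under that assumption the score splits into a target term, which is one filter slid across the target, plus a query term that is constant across windows. The NumPy sketch below checks this with random stand-ins; W plays the role of the Dense kernel reshaped to (20, 4) and b the Dense bias, neither taken from a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(20, 4))   # stand-in for the Dense kernel reshaped to (20, 4)
b = rng.normal()               # stand-in for the Dense bias
query = rng.random((20, 4))
target = rng.random((100, 4))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Reference: evaluate the match on every window explicitly.
explicit = np.array([
    sigmoid(np.sum(W * (query + target[t:t + 20])) + b)
    for t in range(100 - 20 + 1)
])

# Split form: cross-correlate each target channel with the matching
# column of W (no window copies), then add the constant query term.
target_term = sum(
    np.correlate(target[:, c], W[:, c], mode="valid") for c in range(4)
)
query_term = np.sum(W * query)
split = sigmoid(target_term + query_term + b)

print(np.allclose(explicit, split))
```

This only holds because the model is a sum followed by one linear layer; a model with a nonlinear interaction between query and target would not decompose this way.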

Yu-Yang · Accepted Answer

Just to provide an alternative solution using Keras backend functions.

You can also generate sliding windows with K.arange and K.map_fn:

def sliding_windows(inputs):
    target, query = inputs
    target_length = K.shape(target)[1]  # variable-length sequence, shape is a TF tensor
    query_length = K.int_shape(query)[1]
    num_windows = target_length - query_length + 1  # number of windows is also variable

    # slice the target into consecutive windows
    start_indices = K.arange(num_windows)
    windows = K.map_fn(lambda t: target[:, t:(t + query_length), :],
                       start_indices,
                       dtype=K.floatx())

    # `windows` is a tensor of shape (num_windows, batch_size, query_length, ...)
    # so we need to change the batch axis back to axis 0
    windows = K.permute_dimensions(windows, (1, 0, 2, 3))

    # repeat query for `num_windows` times so that it could be merged with `windows` later
    query = K.expand_dims(query, 1)
    query = K.tile(query, [1, num_windows, 1, 1])

    # just a hack to force dimension 2 to be known (required by the Flatten layer)
    windows = K.reshape(windows, shape=K.shape(query))
    return [windows, query]

To use it:

long_target = Input((None, 4))
choose_query = Input((20, 4))
windows, query = Lambda(sliding_windows)([long_target, choose_query])
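
To check the shapes without building a graph, here is a rough NumPy analogue of sliding_windows. The name sliding_windows_np is made up for this sketch, and it mirrors only the output shapes and window contents, not the backend calls.

```python
import numpy as np


def sliding_windows_np(target, query):
    """NumPy analogue of `sliding_windows`, for shape checking only."""
    query_length = query.shape[1]
    num_windows = target.shape[1] - query_length + 1
    # Stack windows on a new axis 1:
    # (batch, num_windows, query_length, channels)
    windows = np.stack(
        [target[:, t:t + query_length, :] for t in range(num_windows)],
        axis=1,
    )
    # Repeat the query once per window so it lines up with `windows`.
    tiled_query = np.tile(query[:, np.newaxis], (1, num_windows, 1, 1))
    return windows, tiled_query


target_arr = np.random.rand(2, 100, 4)
query_arr = np.random.rand(2, 20, 4)
windows, tiled_query = sliding_windows_np(target_arr, query_arr)
print(windows.shape, tiled_query.shape)  # (2, 81, 20, 4) (2, 81, 20, 4)
```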

Given your pretrained match_model, the problem with TimeDistributed is that it cannot wrap a Keras Model with multiple inputs.

However, since the logic matching target and query is implemented in the layers after Concatenate, you can collect these layers into a Model, and apply TimeDistributed to it:

submodel_input = Input((20, 4, 2))
x = submodel_input
for layer in match_model.layers[-4:]:  # from the `Lambda(sum_seqs)` layer onward
    x = layer(x)
submodel = Model(submodel_input, x)

Now you just need to process and merge the outputs of sliding_windows in the same way as in match_model:

long_target = Input((None, 4))
choose_query = Input((20, 4))
windows, query = Lambda(sliding_windows)([long_target, choose_query])

windows_pad = Lambda(lambda x: K.expand_dims(x))(windows)
query_pad = Lambda(lambda x: K.expand_dims(x))(query)
merged = Concatenate()([windows_pad, query_pad])

match_scores = TimeDistributed(submodel)(merged)
max_score = GlobalMaxPooling1D()(match_scores)
model = Model([long_target, choose_query], max_score)

model can then be used in an end-to-end fashion for matching long targets.

You can also verify that the output of model is indeed the maximum of the matching scores by applying match_model to sliding windows:

target_arr = np.random.rand(32, 100, 4)
query_arr = np.random.rand(32, 20, 4)

match_model_scores = np.array([
    match_model.predict([target_arr[:, t:t + 20, :], query_arr])
    for t in range(81)
])
scores = model.predict([target_arr, query_arr])

print(np.allclose(scores, match_model_scores.max(axis=0)))
True

today · Answer

Note: look at @Yu-Yang's solution. It is much better.

Well, as I mentioned in my comment, you can use tf.extract_image_patches() (if the documentation seems a bit vague, read this answer on SO) to extract patches. (Edit: I just added two variables, win_len and feat_len, and changed 100 to None and 81 to -1 to make it work with target sequences of arbitrary length.)

import tensorflow as tf
from keras import layers, models
import keras.backend as K

win_len = 20   # window length
feat_len = 4   # features length

def extract_patches(data):
    data = K.expand_dims(data, axis=3)
    patches = tf.extract_image_patches(data, ksizes=[1, win_len, feat_len, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1], padding='VALID')
    return patches

target = layers.Input((None, feat_len))
patches = layers.Lambda(extract_patches)(target)
patches = layers.Reshape((-1, win_len, feat_len))(patches)

model = models.Model([target], [patches])
model.summary()

Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, 4)           0         
_________________________________________________________________
lambda_2 (Lambda)            (None, None, None, 80)    0         
_________________________________________________________________
reshape_2 (Reshape)          (None, None, 20, 4)       0         
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

For example, if input target has a shape of (100, 4), the output shape is (81, 20, 4).
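
The shape bookkeeping can be emulated in NumPy. This sketch assumes stride 1 and 'VALID' padding as in the code above, and it imitates the layout of the patches rather than calling TensorFlow: each window is flattened into win_len * feat_len values and then reshaped back.

```python
import numpy as np

win_len, feat_len = 20, 4
data = np.arange(100 * feat_len).reshape(1, 100, feat_len)

num_windows = data.shape[1] - win_len + 1
# Emulate the flattened patches: one vector of win_len * feat_len
# values per window, with a singleton axis for the width position.
flat_patches = np.stack(
    [data[:, t:t + win_len, :].reshape(1, 1, -1) for t in range(num_windows)],
    axis=1,
)
# Same step as Reshape((-1, win_len, feat_len)) in the model.
patches = flat_patches.reshape(1, -1, win_len, feat_len)
print(flat_patches.shape, patches.shape)  # (1, 81, 1, 80) (1, 81, 20, 4)
```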

Here is a test:

import numpy as np

# an array of the numbers 0 to 399 with shape (1, 100, 4): a batch of one target
target_arr = np.arange(1*100*4*1).reshape(1, 100, 4)
print(model.predict(target_arr))

Here is the output:

[[[[  0.   1.   2.   3.]
   [  4.   5.   6.   7.]
   [  8.   9.  10.  11.]
   ...
   [ 68.  69.  70.  71.]
   [ 72.  73.  74.  75.]
   [ 76.  77.  78.  79.]]

  [[  4.   5.   6.   7.]
   [  8.   9.  10.  11.]
   [ 12.  13.  14.  15.]
   ...
   [ 72.  73.  74.  75.]
   [ 76.  77.  78.  79.]
   [ 80.  81.  82.  83.]]

  [[  8.   9.  10.  11.]
   [ 12.  13.  14.  15.]
   [ 16.  17.  18.  19.]
   ...
   [ 76.  77.  78.  79.]
   [ 80.  81.  82.  83.]
   [ 84.  85.  86.  87.]]

  ...

  [[312. 313. 314. 315.]
   [316. 317. 318. 319.]
   [320. 321. 322. 323.]
   ...
   [380. 381. 382. 383.]
   [384. 385. 386. 387.]
   [388. 389. 390. 391.]]

  [[316. 317. 318. 319.]
   [320. 321. 322. 323.]
   [324. 325. 326. 327.]
   ...
   [384. 385. 386. 387.]
   [388. 389. 390. 391.]
   [392. 393. 394. 395.]]

  [[320. 321. 322. 323.]
   [324. 325. 326. 327.]
   [328. 329. 330. 331.]
   ...
   [388. 389. 390. 391.]
   [392. 393. 394. 395.]
   [396. 397. 398. 399.]]]]

Tags:

python

tensorflow

keras

conv-neural-network

sliding-window

JudoWill
