I'm trying desperately to understand the taps argument in the theano.scan function. Unfortunately I'm not able to come up with a specific question.
I just don't understand the "taps" mechanism. I know in which order the sequences are passed to the function, but I don't know what that means. For example (I borrowed this code from another question, "Python - Theano scan() function"):
import numpy as np
import theano
import theano.tensor as T

def addf(a1, a2):
    print(a1)
    print(a2)
    return a1 + a2

i = T.iscalar('i')
x0 = T.ivector('x0')
step = T.iscalar('step')
results, updates = theano.scan(fn=addf,
                               outputs_info=[dict(initial=x0, taps=[-3])],
                               non_sequences=step,
                               n_steps=i)
f = theano.function([x0, step, i], results)
input = [2, 3]
print(f(input, 2, 20))
Setting taps to -1 does make sense to me. As far as I understand, it's the same as not setting the taps value at all: the whole vector x0 is passed to the addf function. x0 is then added to the step parameter (the int 2, which is broadcast to the same size). In the next iteration the result [4, 5] is the input, and so on, which yields the following output:
[[ 4 5]
[ 6 7]
[ 8 9]
[10 11]
[12 13]
[14 15]
[16 17]
[18 19]
[20 21]
[22 23]
[24 25]
[26 27]
[28 29]
[30 31]
[32 33]
[34 35]
[36 37]
[38 39]
[40 41]
[42 43]]
Setting taps to -3 however yields the following output:
[ 5 2 6 7 4 8 9 6 10 11 8 12 13 10 14 15 12 16 17]
I can't explain how the scan function creates this output. Why is it just a flat list now? The output of print(a1) is as expected:
x0[t-3]
Although I know that this is the value that a1 should have, I do not know how to interpret it. What is the (t-3)th value of x0? The Theano documentation doesn't seem to be all too detailed about the taps argument, so hopefully one of you will be.
Thx
To better understand the use of taps, you should first understand how scan uses the outputs_info argument altogether and how the values provided for it (initial, to be exact) change the nature of the result.
scan expects you to describe, through outputs_info, the type of output you expect from this operation (unless of course you don't have any initial values to provide and simply pass None, in which case it just runs the first step and the output is not passed as a parameter to fn in the successive steps).
So scan is used for iterative reduction over the provided sequences. This means that at step n (and with no taps specified for either sequences or outputs_info), the given fn will be applied to the nth elements of each of the sequences along with the output(s) generated by the previous, (n-1)th, step. Hence the default value of taps for sequences is 0 and for outputs_info is -1.
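The default taps described above can be mimicked in plain Python. This is only a sketch of the indexing, not Theano's implementation: scan_like is a hypothetical helper, and the running sum is an assumed example of fn.

```python
def scan_like(fn, sequence, initial):
    """Mimic scan with default taps: sequences tap 0, outputs_info tap -1."""
    outputs = []
    previous = initial  # stands in for the "previous output" at the first step
    for n in range(len(sequence)):
        # fn sees the nth element of the sequence and the (n-1)th output
        current = fn(sequence[n], previous)
        outputs.append(current)
        previous = current
    return outputs


running_sum = scan_like(lambda x, acc: x + acc, [1, 2, 3, 4], 0)
print(running_sum)  # [1, 3, 6, 10]
```

The argument order (sequence element first, previous output second) follows the order in which scan passes arguments to fn.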
Another way to look at it is to consider all the sequences to consist of slices across their respective first dimension. So for a particular step, the current slice(s) of the sequence(s) and the output slice of the previous step are passed to fn, and the computed output is added to the results as a new slice, which is then used for the next step. It follows that all the output slices have the same shape. And if you provide an initial slice as part of outputs_info, then it should also have the same shape as the one produced by applying fn. In your example, if outputs_info=[dict(initial=x0)], it would take [2, 3] as the first slice and use it for the first step as the argument a1 to addf.
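The slice view can be sketched with NumPy using the numbers from the question. This imitates only the default tap of -1, assumes x0 = [2, 3] and step = 2, and shortens the run to 3 steps; run_default_tap is a made-up name, not a Theano function.

```python
import numpy as np


def run_default_tap(fn, initial, step, n_steps):
    """Each step receives the whole previous output slice (tap -1)."""
    previous = np.asarray(initial)
    slices = []
    for _ in range(n_steps):
        # The new slice has the same shape as the initial slice.
        previous = fn(previous, step)
        slices.append(previous)
    # Stack the per-step slices along a new first dimension.
    return np.stack(slices)


results = run_default_tap(lambda a1, a2: a1 + a2, [2, 3], 2, 3)
print(results.shape)     # (3, 2)
print(results.tolist())  # [[4, 5], [6, 7], [8, 9]]
```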
But quite often in signal processing (and elsewhere) you need more than just the last data points in time as causal information. Here I have used time just as a way to represent steps. Anyway, this is where taps is useful: it indicates exactly which data points from the sequences and results have to be used for the current step. In your example, this means that for the current step the 3rd last output should be passed to fn.
And this is where you need to be careful in describing initial for outputs_info, because scan will first split the initial value into slices along the first dimension. The first slice among this set of slices is then considered the earliest slice (3rd last in your example) required to compute the output of the first step.
Let's assume that in your example taps=[-2] and input = [2, 3]. In this case, scan will split the input into slices and use the first slice (the value 2 here) as the argument a1 to addf. The resulting value 4 is added to the output, and for the next step the slices are [2, 3, 4], of which the value 3 is on the second last (-2) tap. And so on. However, with taps=[-3] and the same input, there is one value missing, which is like saying that you had collected the values at times (t-3) and (t-2) but didn't collect the value at (t-1).
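The walk-through for taps=[-2] can be reproduced with a small pure-Python sketch. run_single_tap is a hypothetical helper, and raising an error when too few initial values are given is a design choice of this sketch rather than documented Theano behaviour.

```python
def run_single_tap(fn, initial, tap, step, n_steps):
    """Mimic outputs_info=dict(initial=..., taps=[tap]) for scalar slices."""
    needed = -tap  # taps=[-2] needs 2 initial values, taps=[-3] needs 3
    if len(initial) != needed:
        raise ValueError(
            f"taps=[{tap}] needs {needed} initial values, got {len(initial)}"
        )
    history = list(initial)  # initial slices, oldest first
    for _ in range(n_steps):
        a1 = history[tap]  # the value -tap positions back from the end
        history.append(fn(a1, step))
    return history[needed:]  # drop the initial slices, keep computed outputs


outputs = run_single_tap(lambda a1, a2: a1 + a2, [2, 3], -2, 2, 4)
print(outputs)  # [4, 5, 6, 7]
```

Calling the same helper with tap=-3 and the two-element input raises ValueError, which corresponds to the missing (t-1) value.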
So if you reckon your output to be of a certain shape, and you require multiple taps of the output beyond -1, then the value of initial should be a list of elements of the required output shape and have exactly as many such elements as would be required to retrieve the earliest slice.
TLDR:
In your example, if you want to get 2-element vectors as the result of each step and are using taps=[-3], then input should be a list of 3 such 2-element vectors. If you want to get single-valued results, then input should be a list of 3 integers. A list of 2 integers does not make sense in this context at all. It would only make sense if taps is either -2 or -1 or [-2, -1].
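Both valid shapes for taps=[-3] can be tried out with a NumPy sketch. It fills a preallocated buffer in which output t depends on output t-3; run_tap_buffer is a hypothetical name, the inputs are made-up values, and the addition of step is hard-coded to match addf.

```python
import numpy as np


def run_tap_buffer(initial, tap, step, n_steps):
    """Fill a buffer where the output at index t depends on index t + tap."""
    initial = np.asarray(initial)
    depth = -tap
    if initial.shape[0] != depth:
        raise ValueError("first dimension of initial must equal -tap")
    # Room for the initial slices followed by n_steps computed slices.
    shape = (depth + n_steps,) + initial.shape[1:]
    buffer = np.empty(shape, dtype=initial.dtype)
    buffer[:depth] = initial
    for t in range(depth, depth + n_steps):
        buffer[t] = buffer[t + tap] + step
    return buffer[depth:]  # only the computed slices


# Three integers give single-valued results per step.
scalars = run_tap_buffer([2, 3, 4], -3, 2, 5)
# Three 2-element vectors give a 2-element vector per step.
vectors = run_tap_buffer([[2, 3], [4, 5], [6, 7]], -3, 2, 4)
print(scalars.tolist())  # [4, 5, 6, 6, 7]
print(vectors.tolist())  # [[4, 5], [6, 7], [8, 9], [6, 7]]
```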