How do theano.scan's updates work?

theano.scan returns two variables: a values variable and an updates variable. For example,

a = theano.shared(1)

values, updates = theano.scan(fn=lambda a:a+1, outputs_info=a,  n_steps=10)

However, I notice that in most of the examples I work with, the updates variable is empty. It seems that we only get updates when the function passed to theano.scan is written in a certain way. For example,

a = theano.shared(1)

values, updates = theano.scan(lambda: {a: a+1}, n_steps=10)

Can someone explain why updates is empty in the first example but not in the second? More generally, how does the updates variable in theano.scan work? Thanks.

asked Oct 06 '15 18:10 by user5016984


2 Answers

To complement Daniel's answer: if you want to compute outputs and updates in theano.scan at the same time, look at this example.

This code loops over a sequence, computing the running sum of its elements and updating a shared variable t (the length of the sequence).

import theano
import numpy as np

t = theano.shared(0)
s = theano.tensor.vector('v')

def rec(s, first, t):
    first = s + first
    second = s
    return (first, second), {t: t+1}

first = np.float32(0)

(firsts, seconds), updates = theano.scan(
    fn=rec,
    sequences=s,
    outputs_info=[first, None],
    non_sequences=t)

f = theano.function([s], [firsts, seconds], updates=updates, allow_input_downcast=True)

v = np.arange(10)

print(f(v))
print(t.get_value())

The output of this code is

[array([  0.,   1.,   3.,   6.,  10.,  15.,  21.,  28.,  36.,  45.], dtype=float32), 
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)]
10

The rec function returns a tuple and a dictionary. Scanning over the sequence both computes the outputs and adds the dictionary to the updates, which lets you create a function that updates t and computes firsts and seconds at the same time.
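For readers who want to check the numbers without Theano, here is a rough plain-Python sketch of what the scan above computes. It is only an analogy: it runs eagerly instead of building a symbolic graph, and run_loop is a hypothetical helper name, not part of Theano. The updated value of t is returned explicitly where the Theano version returns an updates dictionary.

```python
# Plain-Python sketch of the loop performed by the scan above (no Theano needed).
# run_loop is a hypothetical helper; it is not part of Theano's API.

def rec(s, first, t):
    """One step: return this step's outputs and the new value of the counter t."""
    first = s + first   # running sum; fed back in as `first` on the next step
    second = s          # plain copy of the current element; not fed back
    return (first, second), t + 1


def run_loop(sequence, first_init=0.0, t_init=0):
    """Apply rec to each element, collecting outputs and threading t through."""
    firsts, seconds = [], []
    first, t = first_init, t_init
    for s in sequence:
        (first, second), t = rec(s, first, t)
        firsts.append(first)
        seconds.append(second)
    return firsts, seconds, t


firsts, seconds, t = run_loop([float(i) for i in range(10)])
print(firsts)
print(seconds)
print(t)
```

The two lists correspond to firsts and seconds in the Theano version, and the final counter corresponds to t.get_value() after calling f.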

answered Nov 05 '22 19:11 by LeCodeDuGui


Consider the following four variations (the code can be run to observe the differences) and the analysis below.

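from __future__ import print_function  # print(a, b) then behaves the same on Python 2 and 3

import theano


def v1a():
    # Step returns a symbolic output; updates is not passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda x: x + 1, outputs_info=a, n_steps=10)
    f = theano.function([], outputs=outputs)
    print(f(), a.get_value())


def v1b():
    # Same step as v1a, but the (empty) updates is passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda x: x + 1, outputs_info=a, n_steps=10)
    f = theano.function([], outputs=outputs, updates=updates)
    print(f(), a.get_value())


def v2a():
    # Step returns an update dictionary; updates is not passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda: {a: a + 1}, n_steps=10)
    f = theano.function([], outputs=outputs)
    print(f(), a.get_value())


def v2b():
    # Same step as v2a, but updates is passed to theano.function.
    a = theano.shared(1)
    outputs, updates = theano.scan(lambda: {a: a + 1}, n_steps=10)
    f = theano.function([], outputs=outputs, updates=updates)
    print(f(), a.get_value())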


def main():
    v1a()
    v1b()
    v2a()
    v2b()


main()

The output of this code is

[ 2  3  4  5  6  7  8  9 10 11] 1
[ 2  3  4  5  6  7  8  9 10 11] 1
[] 1
[] 11

The v1x variations use lambda x: x + 1. The result of the lambda function is a symbolic variable whose value is 1 greater than the input. The lambda's parameter has been renamed to x to avoid shadowing the shared variable a. In these variations the scan does not use or modify the shared variable in any way, other than taking it as the initial value of the recurrent symbolic variable that the step function increments.

The v2x variations use lambda: {a: a + 1}. The result of the lambda function is a dictionary that describes how to update the shared variable a.

The updates from the v1x variations are empty because the step function does not return a dictionary defining any shared variable updates. The outputs from the v2x variations are empty because the step function does not return any symbolic output. updates is only useful if the step function returns a dictionary of shared variable update expressions (as in v2x), and outputs is only useful if the step function returns a symbolic output variable (as in v1x).

When a dictionary is returned, it will have no effect if not provided to theano.function. Note that the shared variable has not been updated in v2a but it has been updated in v2b.
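The split between outputs and updates can be mimicked with a small toy model in plain Python. This is only a sketch under loud assumptions: Shared, toy_scan, toy_function and run_variant are hypothetical names invented for illustration, and the toy evaluates eagerly, whereas Theano builds a symbolic graph that is only evaluated when the compiled function is called. What it does preserve is the key point: the scan only describes updates, and the shared value changes only if those updates are handed to the function.

```python
# Toy model of the outputs/updates split (plain Python, NOT Theano).
# Shared, toy_scan, toy_function and run_variant are hypothetical names.

class Shared(object):
    """Minimal stand-in for a shared variable: a mutable box holding a value."""
    def __init__(self, value):
        self.value = value


def toy_scan(step, n_steps, init=None, shared=()):
    """Run step n_steps times and return (outputs, updates) without mutating anything.

    step(prev, env) returns (output_or_None, updates_dict), where env maps each
    Shared object to the value visible at the current step.
    """
    env = {var: var.value for var in shared}
    outputs, prev, touched = [], init, set()
    for _ in range(n_steps):
        out, step_updates = step(prev, env)
        if out is not None:
            outputs.append(out)
            prev = out                 # recurrent output feeds the next step
        env.update(step_updates)       # later steps see earlier updates
        touched.update(step_updates)
    updates = {var: env[var] for var in touched}
    return outputs, updates


def toy_function(outputs, updates=None):
    """Return a callable that applies updates (if given) and returns outputs."""
    def call():
        for var, new_value in (updates or {}).items():
            var.value = new_value
        return outputs
    return call


def run_variant(kind, pass_updates):
    a = Shared(1)
    if kind == "v1":
        # Step returns an output only; a is used just as the initial value.
        outputs, updates = toy_scan(lambda prev, env: (prev + 1, {}), 10, init=a.value)
    else:
        # Step returns an update rule only.
        outputs, updates = toy_scan(lambda prev, env: (None, {a: env[a] + 1}), 10, shared=[a])
    f = toy_function(outputs, updates if pass_updates else None)
    return f(), a.value, len(updates)


for kind in ("v1", "v2"):
    for pass_updates in (False, True):
        print(kind, pass_updates, run_variant(kind, pass_updates))
```

In the toy, as in the four variations above, only the combination of an update-returning step and passing updates to the function changes the stored value of a.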

answered Nov 05 '22 20:11 by Daniel Renshaw