How to implement a multivariate linear stochastic gradient descent algorithm in TensorFlow?

Tags: python, machine-learning, tensorflow, linear-regression

I started with a simple implementation of single-variable linear gradient descent, but I don't know how to extend it to a multivariate stochastic gradient descent algorithm.

Single-variable linear regression:

import tensorflow as tf
import numpy as np

# create random data
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.5

# Find values for W that compute y_data = W * x_data 
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
y = W * x_data

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# Before starting, initialize the variables
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in range(2001):
    sess.run(train)
    if step % 200 == 0:
        print(step, sess.run(W))

270

asked Mar 16 '16 09:03

NEW USER

1 Answer

Your question has two parts:

How to move this problem to a higher-dimensional space.
How to change from batch gradient descent to stochastic gradient descent.

To get a higher-dimensional setting, define your linear problem as y = <x, w>. Then change the shape of your Variable W to match that of w (here [d, 1], with x_data of shape [N, d]) and replace the elementwise multiplication W*x_data with the matrix product tf.matmul(x_data, W), which computes one inner product per row of x_data. Your code should then run just fine.
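To make the shape change concrete, here is a minimal sketch in plain NumPy rather than TensorFlow. The sizes N = 5 and d = 3 and the names w_true, y_single, y_multi and row_wise are only illustrative assumptions; the point is that the matrix product returns one inner product <x_i, w> per row.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
N, d = 5, 3

rng = np.random.default_rng(0)
x_data = rng.random((N, d)).astype(np.float32)   # one row per sample
w_true = np.full((d, 1), 0.5, dtype=np.float32)  # column vector of weights

# Single-variable case: a scalar weight times a 1-D input
y_single = 0.5 * x_data[:, 0]

# Multivariate case: matrix product of an (N, d) input with a (d, 1) weight
y_multi = x_data @ w_true

# The same values computed row by row as explicit inner products
row_wise = np.array([[np.dot(x_data[i], w_true[:, 0])] for i in range(N)])

print(y_single.shape, y_multi.shape)   # (5,) (5, 1)
print(np.allclose(y_multi, row_wise))
```

The target y_data has to carry the same (N, 1) shape as the prediction, which is why the answer's code reshapes it with reshape((-1, 1)).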

To switch the learning method to stochastic gradient descent, you need to abstract the input of your cost function using tf.placeholder.
Once you have defined X and y_ to hold your input at each step, you can construct the same cost function. Then, at each training step, you feed the appropriate mini-batch of your data through feed_dict.

Here is an example of how you could implement this. It should show that W quickly converges to w.

import tensorflow as tf
import numpy as np

# Define dimensions
d = 10     # Size of the parameter space
N = 1000   # Number of data sample

# create random data
w = .5*np.ones(d)
x_data = np.random.random((N, d)).astype(np.float32)
y_data = x_data.dot(w).reshape((-1, 1))

# Define placeholders to feed mini_batches
X = tf.placeholder(tf.float32, shape=[None, d], name='X')
y_ = tf.placeholder(tf.float32, shape=[None, 1], name='y')

# Find values for W that compute y_data = <x, W>
W = tf.Variable(tf.random_uniform([d, 1], -1.0, 1.0))
y = tf.matmul(X, W, name='y_pred')

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y_ - y))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# Before starting, initialize the variables
# (later TensorFlow 1.x releases name this call tf.global_variables_initializer())
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
mini_batch_size = 100
n_batch = N // mini_batch_size + (N % mini_batch_size != 0)
for step in range(2001):
    i_batch = (step % n_batch)*mini_batch_size
    batch = x_data[i_batch:i_batch+mini_batch_size], y_data[i_batch:i_batch+mini_batch_size]
    sess.run(train, feed_dict={X: batch[0], y_: batch[1]})
    if step % 200 == 0:
        print(step, sess.run(W))

Two side notes:

The implementation above is actually mini-batch gradient descent: at each step, the gradient is computed on a subset of the data of size mini_batch_size. This variant of stochastic gradient descent is commonly used to stabilize the estimate of the gradient at each step. Plain stochastic gradient descent is obtained by setting mini_batch_size = 1.
The dataset can be shuffled at every epoch to get an implementation closer to the theoretical setting. Some recent work also considers making only one pass through the dataset, as it prevents over-fitting. For a more mathematical and detailed explanation, see Bottou12. All of this can easily be changed according to your problem setup and the statistical properties you are looking for.
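Both side notes can be sketched together in plain NumPy, with the gradient of the mean squared error written out by hand instead of relying on a TensorFlow optimizer. The function name sgd_linear, the learning rates, the epoch counts and the seeds are all assumptions picked for this synthetic, noise-free data, not tuned recommendations.

```python
import numpy as np

def sgd_linear(x, y, mini_batch_size=100, n_epochs=200, lr=0.1, seed=0):
    """Fit y ~ x @ W by mini-batch gradient descent, reshuffling every epoch."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W = rng.uniform(-1.0, 1.0, size=(d, 1))
    for _ in range(n_epochs):
        order = rng.permutation(n)            # new sample order at each epoch
        for start in range(0, n, mini_batch_size):
            idx = order[start:start + mini_batch_size]
            xb, yb = x[idx], y[idx]
            residual = xb @ W - yb            # shape (batch, 1)
            # Gradient of mean((xb @ W - yb)**2) with respect to W
            grad = 2.0 * xb.T @ residual / len(idx)
            W -= lr * grad
    return W

# Same synthetic setup as the example above: every true weight is 0.5
N, d = 1000, 10
data_rng = np.random.default_rng(1)
x_data = data_rng.random((N, d))
y_data = x_data @ (0.5 * np.ones((d, 1)))

# Mini-batch variant, then plain SGD with one sample per step
W_batch = sgd_linear(x_data, y_data, mini_batch_size=100)
W_sgd = sgd_linear(x_data, y_data, mini_batch_size=1, n_epochs=20, lr=0.01)

# Largest deviation from the true weight 0.5 for each variant
print(np.abs(W_batch - 0.5).max(), np.abs(W_sgd - 0.5).max())
```

Setting n_epochs=1 gives the single-pass variant mentioned above. With mini_batch_size = 1 a smaller learning rate is used here because each step relies on a single, noisier gradient estimate.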

183

answered Nov 03 '22 01:11

Thomas Moreau
