How to create a 2-layer neural network using TensorFlow and Python on MNIST data

I'm a newbie in machine learning, and I'm following TensorFlow's tutorial to build some simple neural networks that learn the MNIST data.

I built a single-layer network (following the tutorial) and the accuracy was about 0.92, which is OK for me. But when I added one more layer, the accuracy dropped to 0.113, which is very bad.

Below is how the two layers are connected:

import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 784])  # flattened 28x28 input images

#layer 1
W1 = tf.Variable(tf.zeros([784, 100]))
b1 = tf.Variable(tf.zeros([100]))
y1 = tf.nn.softmax(tf.matmul(x, W1) + b1)

#layer 2
W2 = tf.Variable(tf.zeros([100, 10]))
b2 = tf.Variable(tf.zeros([10]))
y2 = tf.nn.softmax(tf.matmul(y1, W2) + b2)

#output
y = y2
y_ = tf.placeholder(tf.float32, [None, 10])  # one-hot labels

Is my structure fine? What makes it perform so badly? How should I modify my network?

asked Jul 01 '16 by Tai Christian

People also ask

What is a two-layer neural network?

There are two layers in our neural network (note that layers are counted from the first hidden layer up to the output layer, so the input layer is not counted). Moreover, each layer is fully connected to the next. The hidden layer uses a ReLU nonlinearity, whereas the output layer uses a softmax loss function.
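
As an illustration of that description, here is a minimal TF1-style sketch (my own, not from this page) of a fully connected ReLU hidden layer followed by a softmax output; the 784/100/10 sizes are chosen to match the question:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

# hidden layer: fully connected, ReLU nonlinearity
W1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
h = tf.nn.relu(tf.matmul(x, W1) + b1)

# output layer: fully connected; the softmax is applied inside the loss
W2 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(h, W2) + b2
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))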


1 Answer

The input of the second layer is the softmax of the output of the first layer. You don't want to do that.

You're forcing the sum of these values to be 1. If some value of tf.matmul(x, W1) + b1 is about 0 (and some certainly are), the softmax operation lowers that value towards 0. The result: you're killing the gradient, and nothing can flow through those neurons.
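
To make that concrete, here is a tiny illustration (mine, not from the answer): with 100 zero-initialized hidden units, every pre-activation is 0, so the softmax squashes each of them to 1/100 = 0.01, a nearly flat signal:

import tensorflow as tf

logits = tf.zeros([1, 100])        # what tf.matmul(x, W1) + b1 yields with
                                   # zero-initialized weights and biases
probs = tf.nn.softmax(logits)

with tf.Session() as sess:
    print(sess.run(probs)[0, :5])  # -> [0.01 0.01 0.01 0.01 0.01]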

If you remove the softmax between the layers (but keep the softmax on the output layer if you want to interpret the values as probabilities), your network will work fine.

Tl;dr:

import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 784])

#layer 1
W1 = tf.Variable(tf.zeros([784, 100]))
b1 = tf.Variable(tf.zeros([100]))
y1 = tf.matmul(x, W1) + b1  # softmax removed here

#layer 2
W2 = tf.Variable(tf.zeros([100, 10]))
b2 = tf.Variable(tf.zeros([10]))
y2 = tf.nn.softmax(tf.matmul(y1, W2) + b2)

#output
y = y2
y_ = tf.placeholder(tf.float32, [None, 10])
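
For completeness, here is a self-contained TF1-style training sketch built around this fix (my own illustration, not part of the answer). It uses the tutorial's input_data helper, and the learning rate, batch size, and step count are arbitrary choices. Note that I've also swapped the zero initialization for small random values: with all-zero weights the hidden layer receives zero gradient, so training would stall near chance level.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

# small random initialization instead of zeros (my change, see above)
W1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
y1 = tf.matmul(x, W1) + b1                      # no softmax between layers

W2 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(y1, W2) + b2

# softmax only at the output, folded into the loss for numerical stability
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

correct = tf.equal(tf.argmax(logits, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                        y_: mnist.test.labels}))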
answered Sep 26 '22 by nessuno