Keras model.summary() result - Understanding the # of Parameters

I have a simple NN model, written in Python using Keras (Theano backend), for detecting hand-written digits from a 28x28px image:

from keras.models import Sequential
from keras.layers import Activation, Dense, Flatten

# X_train, Y_train, X_test, Y_test, img_rows, img_cols and nb_classes
# are assumed to be defined earlier (28x28 images, 10 digit classes).

# number of epochs to train for
nb_epoch = 12
# number of samples per gradient update
batch_size = 128

model0 = Sequential()
model0.add(Flatten(input_shape=(1, img_rows, img_cols)))
model0.add(Dense(nb_classes))
model0.add(Activation('softmax'))
model0.compile(loss='categorical_crossentropy',
               optimizer='sgd',
               metrics=['accuracy'])

model0.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
           verbose=1, validation_data=(X_test, Y_test))

score = model0.evaluate(X_test, Y_test, verbose=0)

print('Test score:', score[0])
print('Test accuracy:', score[1])

This runs well and I get ~90% accuracy. I then print a summary of the network's structure with print(model0.summary()), which outputs the following:

Layer (type)           Output Shape    Param #     Connected to
======================================================================
flatten_1 (Flatten)    (None, 784)     0           flatten_input_1[0][0]
dense_1 (Dense)        (None, 10)      7850        flatten_1[0][0]
activation_1           (None, 10)      0           dense_1[0][0]
======================================================================
Total params: 7850

I don't understand how Keras arrives at 7850 total params, or what that number actually means.

asked Apr 29 '16 20:04

user3501476

3 Answers

The number of parameters is 7850 because each unit in the Dense layer has 784 input weights plus one bias weight. Every unit therefore contributes 785 parameters, and with 10 units the total is 10 * 785 = 7850. The Flatten and Activation layers have no trainable parameters, which is why they show 0.
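The arithmetic can be checked directly. This is a minimal sketch in plain Python, assuming the 28x28 input and 10 classes from the question; the helper name dense_param_count is hypothetical and not part of Keras:

```python
def dense_param_count(n_inputs, n_units, use_bias=True):
    """Trainable parameters in a fully connected layer."""
    weights = n_inputs * n_units          # one weight per (input, unit) pair
    biases = n_units if use_bias else 0   # one bias per unit
    return weights + biases

n_inputs = 28 * 28   # Flatten turns the 1x28x28 image into 784 values
n_units = 10         # one unit per digit class

total = dense_param_count(n_inputs, n_units)
print(total)  # 7850
```

Calling the helper with use_bias=False gives 7840, so the remaining 10 parameters are the biases.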

The additional bias term matters: it lets each unit shift its output independently of the input, which increases the capacity of your model. For details, see "Role of Bias in Neural Networks".
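One way to see what the bias adds is to feed an all-zero input through a tiny dense layer followed by softmax. The sketch below is illustrative only: the weights and biases are arbitrary made-up values, and dense and softmax are hand-written helpers rather than Keras functions.

```python
import math

def softmax(zs):
    """Numerically stable softmax over a list of floats."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def dense(x, W, b):
    """W holds one weight row per unit; b holds one bias per unit."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j
            for row, b_j in zip(W, b)]

x = [0.0, 0.0, 0.0]                        # all-zero input, e.g. a blank image
W = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7]]   # arbitrary illustrative weights

no_bias = softmax(dense(x, W, [0.0, 0.0]))
with_bias = softmax(dense(x, W, [1.0, -1.0]))
print(no_bias)    # [0.5, 0.5] whatever W is
print(with_bias)  # first class is now favoured
```

Without a bias, the weights cannot change the prediction for a zero input; with a bias, each unit has an offset it can learn.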

answered Oct 21 '22 15:10

Marcin Możejko

I feed a 514-dimensional real-valued input to a Sequential model in Keras. My model is constructed in the following way:

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.regularizers import WeightRegularizer  # Keras 1.x API

    predictivemodel = Sequential()
    predictivemodel.add(Dense(514, input_dim=514, W_regularizer=WeightRegularizer(l1=0.000001,l2=0.000001), init='normal'))
    predictivemodel.add(Dense(257, W_regularizer=WeightRegularizer(l1=0.000001,l2=0.000001), init='normal'))
    predictivemodel.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

When I print model.summary() I get the following result:

Layer (type)    Output Shape  Param #     Connected to                   
================================================================
dense_1 (Dense) (None, 514)   264710      dense_input_1[0][0]              
________________________________________________________________
activation_1    (None, 514)   0           dense_1[0][0]                    
________________________________________________________________
dense_2 (Dense) (None, 257)   132355      activation_1[0][0]               
================================================================
Total params: 397065
________________________________________________________________

For the dense_1 layer, the number of params is 264710. This is obtained as: 514 (input values) * 514 (neurons in the first layer) + 514 (bias values) = 264710.

For the dense_2 layer, the number of params is 132355. This is obtained as: 514 (outputs of the first layer) * 257 (neurons in the second layer) + 257 (bias values for neurons in the second layer) = 132355.
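The same calculation as a short check in plain Python. This is only a sketch of the arithmetic: the layer sizes are taken from the summary above and no Keras call is involved.

```python
# (inputs, units) for dense_1 and dense_2, read off the summary
layers = [(514, 514), (514, 257)]

# weights (inputs * units) plus one bias per unit
counts = [n_in * n_out + n_out for n_in, n_out in layers]

print(counts)       # [264710, 132355]
print(sum(counts))  # 397065, matching "Total params"
```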

answered Oct 21 '22 14:10

tauseef_CuriousGuy

For Dense Layers:

output_size * (input_size + 1) == number_parameters

For Conv Layers (where window_size is the kernel height times the kernel width, e.g. 3*3 = 9):

output_channels * (input_channels * window_size + 1) == number_parameters
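The conv formula can be written as a small helper. This is a sketch for a standard 2D convolution (not grouped or depthwise); conv2d_param_count is a hypothetical name, not a Keras function, and the 3-channel input with 32 filters is just an illustrative case.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, use_bias=True):
    """Trainable parameters in a standard 2D convolution layer."""
    kernel_h, kernel_w = kernel_size
    # each filter sees every input channel through a kernel_h x kernel_w window
    weights_per_filter = in_channels * kernel_h * kernel_w
    bias_per_filter = 1 if use_bias else 0
    return out_channels * (weights_per_filter + bias_per_filter)

# 32 filters of size 3x3 over a 3-channel (e.g. RGB) input
print(conv2d_param_count(3, 32, (3, 3)))  # 896
```

Note that the spatial size of the input does not appear: the same filters are reused at every position, so only the channel counts and the kernel size matter.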

Consider the following example:

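from keras.models import Sequential
from keras.layers import Conv2D, Dense

# Values inferred from the summary below, not stated in the original answer
input_shape = (224, 224, 3)
num_classes = 10

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
    Conv2D(64, (3, 3), activation='relu'),
    Conv2D(128, (3, 3), activation='relu'),
    Dense(num_classes, activation='softmax')
])

model.summary()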
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 222, 222, 32)      896       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 220, 220, 64)      18496     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 218, 218, 128)     73856     
_________________________________________________________________
dense_9 (Dense)              (None, 218, 218, 10)      1290      
=================================================================

Calculating the params:

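num_classes = 10  # the Dense layer in the summary has 10 output units
assert 32 * (3 * (3*3) + 1) == 896      # conv2d_1: 3 input channels
assert 64 * (32 * (3*3) + 1) == 18496   # conv2d_2: 32 input channels
assert 128 * (64 * (3*3) + 1) == 73856  # conv2d_3: 64 input channels
assert num_classes * (128 + 1) == 1290  # dense_9: acts on the last axis (128)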
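The output shapes in the summary can be checked the same way. The sketch below assumes a 224x224 input (inferred from the 222 in the first row, not stated in the answer) and the default 'valid' padding with stride 1; valid_conv_output is a hypothetical helper.

```python
def valid_conv_output(size, kernel, stride=1):
    """Spatial output size of a convolution with no padding."""
    return (size - kernel) // stride + 1

size = 224  # assumed input height/width
sizes = []
for _ in range(3):          # three stacked 3x3 convolutions
    size = valid_conv_output(size, 3)
    sizes.append(size)
print(sizes)  # [222, 220, 218]

# Per-layer parameter counts from the summary
param_counts = [896, 18496, 73856, 1290]
print(sum(param_counts))  # 94538
```

The Dense layer keeps the 218x218 spatial dimensions because Keras applies Dense along the last axis of its input, so its 1290 parameters depend only on the 128 channels and the 10 output units.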

answered Oct 21 '22 16:10

Ashiq Imran

Tags: python, machine-learning, neural-network, keras, theano