How to calculate the number of parameters of AlexNet?

I haven't found a calculation of the number of parameters (weights + biases) of AlexNet, so I tried to calculate it myself, but I'm not sure if it's correct:

conv1: (11*11)*3*96 + 96 = 34944

conv2: (5*5)*96*256 + 256 = 614656

conv3: (3*3)*256*384 + 384 = 885120

conv4: (3*3)*384*384 + 384 = 1327488

conv5: (3*3)*384*256 + 256 = 884992

fc1: (6*6)*256*4096 + 4096 = 37752832

fc2: 4096*4096 + 4096 = 16781312

fc3: 4096*1000 + 1000 = 4097000

This results in a total of 62378344 parameters. Is that calculation right?
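For reference, the arithmetic above can be reproduced with a short Python sketch. The layer shapes are simply the numbers from the list above (no grouping), and the names `conv_layers`, `fc_layers` and `count_params` are made up for illustration.

```python
# Layer shapes taken from the list above; no grouping is assumed here.
# Conv entries: (name, kernel_size, in_channels, out_channels)
conv_layers = [
    ("conv1", 11, 3, 96),
    ("conv2", 5, 96, 256),
    ("conv3", 3, 256, 384),
    ("conv4", 3, 384, 384),
    ("conv5", 3, 384, 256),
]
# Fully connected entries: (name, inputs, outputs)
fc_layers = [
    ("fc1", 6 * 6 * 256, 4096),
    ("fc2", 4096, 4096),
    ("fc3", 4096, 1000),
]


def count_params():
    """Return a dict mapping layer name to weights + biases."""
    counts = {}
    for name, k, c_in, c_out in conv_layers:
        # k*k*c_in weights per filter, c_out filters, one bias per filter
        counts[name] = k * k * c_in * c_out + c_out
    for name, n_in, n_out in fc_layers:
        # full weight matrix plus one bias per output unit
        counts[name] = n_in * n_out + n_out
    return counts


counts = count_params()
for name, n in counts.items():
    print(f"{name}: {n}")
print("total:", sum(counts.values()))
```

The per-layer values it prints should match the hand calculation line by line.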

3 Answers

Your calculations are correct. We came up with the exact same number independently while writing this blog post. I have also added the final table from the post.

Slide 8 in this presentation (http://vision.stanford.edu/teaching/cs231b_spring1415/slides/alexnet_tugce_kyunghee.pdf) states that it has 60M parameters, so I think you're at least in the ballpark.

According to the diagram in their paper, some of the layers use grouping, so not all feature maps of one layer are connected to the next. For conv2, for example, this means you should have only (5*5)*48*256 + 256 = 307,456 parameters.

I'm not sure if all newer implementations include the grouping. It was an optimization they used to let the network train in parallel on two GPUs, but modern GPUs have more resources for training and fit the network comfortably without grouping.
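To make the effect of grouping on the total concrete, here is a minimal Python sketch. It assumes the two-group split applies to conv2, conv4 and conv5 (my reading of the paper's diagram) and that every other layer is fully connected to its input; the helper names and the `groups` argument are made up for illustration.

```python
def conv_params(kernel, in_ch, out_ch, groups=1):
    # With grouping, each filter only sees in_ch / groups input channels.
    return kernel * kernel * (in_ch // groups) * out_ch + out_ch


def fc_params(n_in, n_out):
    # Full weight matrix plus one bias per output unit.
    return n_in * n_out + n_out


def alexnet_params(grouped):
    # Assumption: grouping (2 groups) on conv2, conv4 and conv5 only.
    g = 2 if grouped else 1
    return (
        conv_params(11, 3, 96)
        + conv_params(5, 96, 256, groups=g)
        + conv_params(3, 256, 384)
        + conv_params(3, 384, 384, groups=g)
        + conv_params(3, 384, 256, groups=g)
        + fc_params(6 * 6 * 256, 4096)
        + fc_params(4096, 4096)
        + fc_params(4096, 1000)
    )


print(alexnet_params(grouped=False))  # 62378344
print(alexnet_params(grouped=True))   # 60965224
```

Under these assumptions the grouped total comes out at roughly 61M, which would be consistent with the 60M figure usually quoted, while the ungrouped total matches the 62378344 from the question.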

