Lately I've been benchmarking some CNNs with respect to inference time, number of multiply-add operations (MACs), number of parameters, and model size. I have seen some similar SO questions (here and here), and in the latter they suggest using Netscope CNN Analyzer. This tool lets me calculate most of what I need just by inputting my Caffe network definition.
However, the number of multiply-add operations reported for some architectures in papers and around the internet doesn't match what Netscope outputs, while other architectures do match. I'm always comparing either FLOPs or MACs against the MACC column in Netscope, but there's a ~10x factor that I'm missing somewhere (see the table below for details).
Architecture ---- MAC (paper/internet) ---- macc column in Netscope
VGG-16       ---- ~15.5G               ---- ~157G
GoogLeNet    ---- ~1.55G               ---- ~16G
References: the GoogLeNet macc number and the VGG-16 macc number, as shown in Netscope.
Could anybody who has used that tool point out what mistake I'm making when reading Netscope's output?
Well, you could just multiply the depth of each input volume by the number of filters in each layer and add them together. In your case: 10+200+2000=2,210.
For a refresher on CNNs, you can check this cheatsheet. To estimate the FLOPs in a model, the rules are: for convolutions, FLOPs = 2 × number of kernels × kernel shape × output shape; for fully connected layers, FLOPs = 2 × input size × output size. (The factor of 2 counts one multiply and one add per MAC.)
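The two rules above can be sketched as helper functions. This is a minimal sketch, not Netscope's actual implementation; the example numbers assume VGG-16's first conv layer (64 3×3 kernels over a 3-channel 224×224 input with 'same' padding) and its final 4096→1000 fully connected layer.

```python
def conv_flops(num_kernels, kernel_h, kernel_w, in_channels, out_h, out_w):
    """FLOPs for a conv layer: 2 x kernels x kernel volume x output spatial size."""
    return 2 * num_kernels * (kernel_h * kernel_w * in_channels) * (out_h * out_w)

def fc_flops(input_size, output_size):
    """FLOPs for a fully connected layer: 2 x input size x output size."""
    return 2 * input_size * output_size

# VGG-16's first conv layer: 64 kernels of 3x3x3, output 224x224
print(conv_flops(64, 3, 3, 3, 224, 224))  # 173408256
# VGG-16's last fully connected layer: 4096 -> 1000
print(fc_flops(4096, 1000))               # 8192000
```

Dropping the leading factor of 2 gives the MAC count instead of FLOPs, which is what Netscope's macc column reports.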
I've found what was causing the discrepancy between Netscope and the numbers I'd found in papers: most preset architectures in Netscope use a batch size of 10 (this is the case for VGG and GoogLeNet, for example), hence the 10x factor multiplying the number of multiply-add operations.