 

Python FLOPS calculation

Tags: python, flops

I've been trying to get a standardized estimate of FLOPS across all of the computers on which I run a distributed processing program written in Python. While I can currently calculate pystones just fine, pystones are not particularly well known, and I'm not sure how accurate they really are.

So I need a way to calculate FLOPS (or a module that already does it) on a variety of machines, which may have any variety of CPUs, etc. Since Python is an interpreted language, simply timing a fixed number of operations won't give results on the level of, say, Linpack. While I don't need exactly the same estimates as one of the big names in benchmarking, I'd like them to be reasonably close at least.

Is there a way, or a pre-existing module, that would let me get FLOPS? Otherwise, my only choices are compiling with Cython or trying to estimate the capabilities from CPU clock speed...
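To illustrate the concern about interpreter overhead, here is a minimal sketch of the naive "time a set number of operations" approach. The function name `naive_flops` and the loop body are assumptions made for illustration, not part of any existing benchmark. The figure it reports mostly reflects bytecode dispatch cost rather than the CPU's floating-point throughput.

```python
import time


def naive_flops(n_iterations=1_000_000):
    """Estimate FLOPS by timing a pure-Python loop of float operations.

    Hypothetical helper: the result is dominated by interpreter overhead,
    so it will sit far below what a compiled benchmark reports.
    """
    x = 1.0000001
    acc = 0.0
    start = time.perf_counter()
    for _ in range(n_iterations):
        acc = acc * x + 0.5  # one multiply and one add per iteration
    elapsed = time.perf_counter() - start
    # Two floating-point operations were performed per iteration.
    return 2 * n_iterations / elapsed


if __name__ == "__main__":
    print(f"{naive_flops() / 1e6:.1f} MFLOPS (interpreter-bound)")
```

Comparing this number across machines still gives a rough relative ranking, but it should not be read as a Linpack-comparable figure.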

asked Sep 07 '12 02:09 by Doc Sohl




1 Answer

Linpack, or High-Performance Linpack (HPL), is generally the industry standard for measuring FLOPS. I found a Python implementation here, but it might not be of much use. The standard approach, especially if you have a cluster, is to use HPL. Unless you want to implement your own parallel Linpack in Python, HPL is the way to go. It is what most of those monster supercomputers on the TOP500 list use to measure their performance.

If you're really hell-bent on doing this, even though it might not make sense or be of much use, you might want to look into porting the original MPI version to ZeroMQ (0MQ), which has a nice Python interface.
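For a rough single-machine number without setting up HPL, a Linpack-style estimate can be sketched with NumPy, assuming NumPy is installed and linked against an optimized BLAS/LAPACK. The helper names `linpack_flop_count` and `linpack_style_flops` are made up for this example. The sketch times a dense solve of Ax = b and divides the conventional Linpack operation count, 2/3 n^3 + 2 n^2, by the best elapsed time. It is not HPL and will not match HPL results exactly.

```python
import time

import numpy as np


def linpack_flop_count(n):
    """Conventional Linpack operation count for solving an n x n system."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def linpack_style_flops(n=1000, repeats=3, seed=0):
    """Estimate FLOPS from the best of several timed dense solves.

    Hypothetical helper: the problem size, repeat count, and seed are
    illustrative defaults, not tuned values.
    """
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random(n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.linalg.solve(a, b)  # LU factorization plus triangular solves
        best = min(best, time.perf_counter() - start)
    return linpack_flop_count(n) / best


if __name__ == "__main__":
    print(f"{linpack_style_flops() / 1e9:.2f} GFLOPS (Linpack-style estimate)")
```

The result depends heavily on which BLAS library NumPy uses and on how many threads it is allowed, so the same settings should be used on every machine being compared.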

answered Sep 21 '22 09:09 by pyCthon