 

What is the correct way to measure the total execution time for a pytorch function running on GPU?

The following example code shows what I am trying to measure. Here I am using time.perf_counter() to measure time. Is this the correct way to measure execution time in this scenario? If not, what is the correct way? My concern is that GPU execution is asynchronous, so the GPU work might not have completed by the time ExecTime is measured below.

import torch
import torch.nn.functional as F
import time

Device = torch.device("cuda:0")
ProblemSize = 100
NumChannels = 5
NumFilters = 96
ClassType = torch.float32

X = torch.rand(1, NumChannels, ProblemSize, ProblemSize, dtype=ClassType).to(Device)
weights = torch.rand(NumFilters, NumChannels, 10, 10, dtype=ClassType).to(Device)
    
#warm up
Y = F.conv2d(X, weights)
Y = F.conv2d(X, weights)

#time
t = time.perf_counter()
Y = F.conv2d(X, weights)
ExecTime = time.perf_counter() - t
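For context, the asynchrony concern above is usually addressed by calling torch.cuda.synchronize() before starting and before stopping the clock, so the host waits for all queued kernels to finish. Below is a minimal sketch of that pattern; it reuses the sizes from the question's example and falls back to CPU when CUDA is unavailable (the CPU fallback is an assumption added here so the snippet runs anywhere):

```python
import time
import torch
import torch.nn.functional as F

# Sizes mirror the question's example
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
X = torch.rand(1, 5, 100, 100, device=device)
weights = torch.rand(96, 5, 10, 10, device=device)

# Warm up (first calls may include one-time setup such as cuDNN autotuning)
for _ in range(2):
    F.conv2d(X, weights)

if device.type == "cuda":
    torch.cuda.synchronize()  # drain queued kernels before starting the clock
t = time.perf_counter()
Y = F.conv2d(X, weights)
if device.type == "cuda":
    torch.cuda.synchronize()  # wait for the conv kernel to finish before stopping
elapsed = time.perf_counter() - t
print(f"conv2d took {elapsed:.6f} s")
```

Without the second synchronize, perf_counter would only measure the time to enqueue the kernel, not to run it.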

asked Jan 23 '26 by ahsabali

1 Answer

I think you are looking for PyTorch's bottleneck profiler (torch.utils.bottleneck).
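The bottleneck tool is run from the command line over a whole script (`python -m torch.utils.bottleneck your_script.py`); it summarizes cProfile and the autograd profiler. For profiling a single function in-process, a minimal sketch using torch.autograd.profiler directly (not something stated in the answer, just the underlying API bottleneck builds on) could look like this:

```python
import torch
import torch.nn.functional as F

# Sizes mirror the question's example; fall back to CPU if no GPU is present
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
X = torch.rand(1, 5, 100, 100, device=device)
weights = torch.rand(96, 5, 10, 10, device=device)

# Record all operator calls inside the context manager
with torch.autograd.profiler.profile(use_cuda=torch.cuda.is_available()) as prof:
    Y = F.conv2d(X, weights)

# Per-operator timing table, sorted by CPU time spent in each op itself
report = prof.key_averages().table(sort_by="self_cpu_time_total")
print(report)
```

The table reports per-operator times and, when run with CUDA, device-side times as well, which sidesteps the host/device asynchrony problem the question raises.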

answered Jan 26 '26 by Shai


