 

What is the correct way to measure the total execution time for a pytorch function running on GPU?

The following example code shows what I am trying to measure. Here I am using time.perf_counter() to measure time. Is this the correct way to measure execution time in this scenario? If not, what is the correct way? My concern is that GPU execution is asynchronous, so the GPU work might not have completed by the time ExecTime is measured below.

import torch
import torch.nn.functional as F
import time

Device = torch.device("cuda:0")
ProblemSize = 100
NumChannels = 5
NumFilters = 96
ClassType = torch.float32

X = torch.rand(1, NumChannels, ProblemSize, ProblemSize, dtype=ClassType).to(Device)
weights = torch.rand(NumFilters, NumChannels, 10, 10, dtype=ClassType).to(Device)
    
#warm up
Y = F.conv2d(X, weights)
Y = F.conv2d(X, weights)

#time
t = time.perf_counter()
Y = F.conv2d(X, weights)
ExecTime = time.perf_counter() - t
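For context, the asynchrony concern above is usually addressed by calling torch.cuda.synchronize() before starting and before stopping the clock, so the host waits for all queued kernels to finish. Below is a minimal sketch of that pattern; it reuses the sizes from the question's example and falls back to CPU when CUDA is unavailable (the CPU fallback is an assumption added here so the snippet runs anywhere):

```python
import time
import torch
import torch.nn.functional as F

# Sizes mirror the question's example
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
X = torch.rand(1, 5, 100, 100, device=device)
weights = torch.rand(96, 5, 10, 10, device=device)

# Warm up (first calls may include one-time setup such as cuDNN autotuning)
for _ in range(2):
    F.conv2d(X, weights)

if device.type == "cuda":
    torch.cuda.synchronize()  # drain queued kernels before starting the clock
t = time.perf_counter()
Y = F.conv2d(X, weights)
if device.type == "cuda":
    torch.cuda.synchronize()  # wait for the conv kernel to finish before stopping
elapsed = time.perf_counter() - t
print(f"conv2d took {elapsed:.6f} s")
```

Without the second synchronize, perf_counter would only measure the time to enqueue the kernel, not to run it.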

asked Jan 23 '26 by ahsabali

1 Answer

I think you are looking for PyTorch's bottleneck profiler (torch.utils.bottleneck).
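The bottleneck tool is run from the command line over a whole script (`python -m torch.utils.bottleneck your_script.py`); it summarizes cProfile and the autograd profiler. For profiling a single function in-process, a minimal sketch using torch.autograd.profiler directly (not something stated in the answer, just the underlying API bottleneck builds on) could look like this:

```python
import torch
import torch.nn.functional as F

# Sizes mirror the question's example; fall back to CPU if no GPU is present
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
X = torch.rand(1, 5, 100, 100, device=device)
weights = torch.rand(96, 5, 10, 10, device=device)

# Record all operator calls inside the context manager
with torch.autograd.profiler.profile(use_cuda=torch.cuda.is_available()) as prof:
    Y = F.conv2d(X, weights)

# Per-operator timing table, sorted by CPU time spent in each op itself
report = prof.key_averages().table(sort_by="self_cpu_time_total")
print(report)
```

The table reports per-operator times and, when run with CUDA, device-side times as well, which sidesteps the host/device asynchrony problem the question raises.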

answered Jan 26 '26 by Shai


