Empty stacks from torch profiler

Question

Details of the problem

Hello, I am trying to reproduce the profiler example from the official PyTorch tutorial. I want to export the stacks of a forward pass of a model.

The stack files are created, but they are empty.

import torch
from torch import profiler
from torchvision.models import resnet18

model = resnet18().cuda()
inputs = torch.rand(5, 3, 224, 224).cuda()


with profiler.profile(
    activities=[profiler.ProfilerActivity.CPU,
                profiler.ProfilerActivity.CUDA],
    with_stack=True,
) as p:
    model(inputs)

# Export one stack file per metric (plain strings; no f-string needed).
p.export_stacks("/tmp/profiler/stacks_cpu.txt", "self_cpu_time_total")
p.export_stacks("/tmp/profiler/stacks_cuda.txt", "self_cuda_time_total")
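To check programmatically whether an exported file really contains no stacks, something like the sketch below could be used. It assumes that each non-empty line written by export_stacks has the form "frame;frame;... value" (semicolon-joined frames followed by an integer metric); that layout is my assumption, not something stated in this post. The helper name summarize_stacks_file is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_stacks_file(path):
    """Sum the metric per stack in a stacks file.

    Assumption: each non-empty line looks like "frame;frame;... <int>".
    An empty Counter means the file holds no stacks at all.
    """
    totals = Counter()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        # The metric is assumed to be the last space-separated field.
        stack, _, value = line.rpartition(" ")
        if not stack or not value.isdigit():
            continue  # skip lines that do not match the assumed layout
        totals[stack] += int(value)
    return totals
```

With the files from the snippet above, an empty result from summarize_stacks_file("/tmp/profiler/stacks_cpu.txt") would confirm the symptom described here.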

Environment information

Running on Docker image pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime
torch: 2.0.0
torchvision: 0.15.0
Python: 3.10.9

Note: I reproduced it on the bare docker image.

What I tried

The strange thing is that when I print the table from my script, I can see the trace. Here is the exact snippet I use for that; it goes right after the code above.

print(p.key_averages(group_by_stack_n=5).table(
    sort_by="self_cuda_time_total", row_limit=2))

Update: Printing does not work either. It prints the table, but not the stacks. With a debugger I stepped into the function _build_table in module torch.autograd.profiler_util: on line 794, the stacks variable is an empty list (_build_table is called by the table method in the snippet above). Also, in the key_averages method of the EventList class, which is called by the profiler's key_averages (used in the snippet above), each event has an empty stacks attribute on line 298. So the question is: why are the stacks not filled in for those events? I will investigate further.
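The same check can be done without a debugger by counting how many recorded events carry any stack frames. This is only a sketch: it assumes each event object exposes its frames as a list in an attribute named stack, which is how I understand the profiler's event objects but is not confirmed in this post. The helper name count_events_with_stack is hypothetical.

```python
def count_events_with_stack(events):
    """Return (events_with_frames, total_events).

    Assumption: each event stores its Python frames in a list
    attribute called `stack`; a missing or empty list counts as no stack.
    """
    total = 0
    with_frames = 0
    for evt in events:
        total += 1
        if getattr(evt, "stack", None):
            with_frames += 1
    return with_frames, total
```

Calling count_events_with_stack(p.events()) after the profiled block and getting zero for the first number would match what the debugger shows.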

MufasaChan · Accepted Answer

There is an issue on the PyTorch repository, #100253, that answers the final question of my post. I refer you to the issue for more details.

In brief, there was a bug in the profiler in torch 2.0.0. Their example is simpler: they profile the addition of two tensors. Their investigation matches mine and reaches the same conclusion: the stacks are not filled in by the profiler because the events have an empty stacks attribute. Their investigation is more precise because they compared two versions of torch (1.13.0 vs 2.0.0) and found that the number of events is not the same. The profiler's tracing is done in C++, so I could not investigate further.

The current workaround is to go back to torch 1.13.0 while waiting for the fix.

Edit: See Ben's comment and the GitHub issue: to get all the information, we should pass experimental_config. In my own use it revealed some other problems, notably when using the Kineto traces with HTA, but those problems are outside the scope of this SO post.

Thanks to the person who raised this issue on the torch repo, and thanks to the maintainers of torch!
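For reference, here is roughly how the experimental_config workaround could look. Treat it as a sketch under assumptions: the value torch._C._profiler._ExperimentalConfig(verbose=True) is what I recall from the GitHub issue, it is a private API that may change between releases, and the wrapper name profile_cpu_with_stacks is hypothetical. It profiles CPU only to keep the example small.

```python
def profile_cpu_with_stacks(fn, stacks_path):
    """Profile fn() on CPU with stack collection and export the stacks.

    Assumption: passing a verbose _ExperimentalConfig makes torch 2.0.0
    record the stacks again (private API, recalled from the GitHub issue).
    """
    # Imported lazily so that defining this helper does not require torch.
    import torch
    from torch import profiler

    with profiler.profile(
        activities=[profiler.ProfilerActivity.CPU],
        with_stack=True,
        experimental_config=torch._C._profiler._ExperimentalConfig(verbose=True),
    ) as prof:
        fn()

    prof.export_stacks(stacks_path, "self_cpu_time_total")
    return prof
```

For example, profile_cpu_with_stacks(lambda: torch.ones(8) + torch.ones(8), "/tmp/profiler/stacks_cpu.txt") would mirror the tensor-addition case from the issue; the output directory has to exist beforehand.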
