First, I want to say that I don't have much experience with PyTorch, ML, NLP, or related topics, so I may confuse some concepts. Sorry.
I downloaded a few models from Hugging Face, organized them in one Python script, and ran a benchmark to get an overview of performance. During the benchmark I monitored CPU usage and saw that only 50% of the CPU was used. I have 8 vCPUs, but only 4 of them are loaded at 100% at any given time. The load jumps around: cores 1, 3, 5, and 7 may be at 100%, then cores 2, 4, 6, and 8. But the total CPU load never rises above 50%, and it never drops below 50% either; it stays constant at 50%.
After some quick googling I found the parallelism documentation. I called get_num_threads() and get_num_interop_threads(), and the output was 4 for both calls, i.e. only half of the available CPU cores, which roughly explains why CPU load was stuck at 50%.
Then I called set_num_threads(8) and set_num_interop_threads(8) and ran the benchmark again. CPU usage was at a constant 100%. Overall, performance was a bit faster, but some models actually ran a bit more slowly than at 50% CPU.
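For reference, here is a minimal sketch of the kind of setup I used; the benchmark function, model, and inputs are just placeholders, not my actual benchmark code:

import time
import torch

# Thread counts should be set before any parallel work runs;
# set_num_interop_threads() raises a RuntimeError if inter-op
# parallelism has already started.
torch.set_num_threads(8)
torch.set_num_interop_threads(8)

def benchmark(model, inputs, iters=10):
    # "model" and "inputs" stand in for an actual Hugging Face model
    # and its tokenized batch; returns average latency per forward pass.
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(iters):
            model(**inputs)
    return (time.perf_counter() - start) / iters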
So I wonder: why does PyTorch use only half of the CPU by default? Is that the optimal and recommended setting? Should I manually call set_num_threads() and set_num_interop_threads() with the number of all available CPU cores if I want the best performance?
Edit.
I ran some additional benchmarks.
Thanks to Phoenix's answer, I think it is completely reasonable to use PyTorch's default settings, which set the number of threads to the number of physical (not virtual) cores.
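A quick way to see the distinction is to compare the logical and physical core counts yourself. This snippet assumes the third-party psutil package is installed; os.cpu_count() alone only reports logical cores:

import os
import psutil  # third-party package, assumed installed (pip install psutil)
import torch

print(f"Logical cores (vCPUs): {os.cpu_count()}")             # 8 on my machine
print(f"Physical cores: {psutil.cpu_count(logical=False)}")   # 4 on my machine
print(f"Intra-op threads: {torch.get_num_threads()}")         # 4 by default
print(f"Inter-op threads: {torch.get_num_interop_threads()}") # 4 by default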
Edit.
PyTorch documentation about this: https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html
PyTorch typically uses the number of physical CPU cores as the default number of threads. This means:
torch.get_num_threads() and torch.get_num_interop_threads() typically return the number of physical CPU cores.
You can adjust these with torch.set_num_threads() and torch.set_num_interop_threads(). For example:
import torch
# Get current number of threads
num_threads = torch.get_num_threads()
print(f"Current number of threads: {num_threads}")
# Explicitly set the thread counts (here, keeping the current default,
# which typically equals the number of physical cores)
torch.set_num_threads(num_threads)
torch.set_num_interop_threads(num_threads)
# Check new settings
print(f"New number of threads: {torch.get_num_threads()}")
print(f"New number of inter-op threads: {torch.get_num_interop_threads()}")