Is there any way to execute the validation_step method on a single GPU while running training_step on multiple GPUs with DDP?
The reason I want to do this is that there are several metrics I want to implement which require access to the complete data, and running validation on a single GPU would ensure that. I have tried the validation_step_end method, but somehow I am only getting part of the data. That post is here: Stack Overflow Post
I am afraid that this is not possible. However, the TorchMetrics package was developed with multi-GPU support in mind, so if your custom metric is derived from the TorchMetrics `Metric` class, you should be able to get it running in your multi-GPU setting.