Since the K20, different streams execute fully concurrently (they used to be concurrent only on the edge).
However, my program needs the old behavior. Otherwise I have to add a lot of synchronization to resolve the dependency problems.
Is it possible to switch stream management back to the old behavior?
From the CUDA C Programming Guide, section on Asynchronous Concurrent Execution:
A stream is a sequence of commands (possibly issued by different host threads) that execute in order. Different streams, on the other hand, may execute their commands out of order with respect to one another or concurrently; this behavior is not guaranteed and should therefore not be relied upon for correctness (e.g., inter-kernel communication is undefined).
If the application relied on the Compute Capability 2.* and 3.0 implementation of streams, then the program violates the definition of streams, and any change to the CUDA driver (e.g. queuing of per-stream requests) or new hardware will break the program.
If you need a temporary workaround, I would suggest moving all work to a single user-defined stream. This may hurt performance, but it is likely the only temporary workaround.
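As a rough illustration of the single-stream workaround, here is a minimal sketch. The kernels `produce` and `consume` and the helper `run_in_single_stream` are hypothetical names made up for this example, not taken from the original program. The only point is that two commands issued to the same stream execute in issue order, so the dependency needs no extra synchronization.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernels: consume() reads what produce() wrote.
__global__ void produce(int *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = i;
}

__global__ void consume(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1;
}

// Runs both kernels in one user-defined stream.
// Returns true if out[i] == i + 1 for every element.
bool run_in_single_stream(int n) {
    assert(n > 0);
    const size_t bytes = n * sizeof(int);
    int *d_buf = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_buf); return false; }

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    // Same stream, so consume() cannot start before produce() has finished.
    produce<<<blocks, threads, 0, stream>>>(d_buf, n);
    consume<<<blocks, threads, 0, stream>>>(d_buf, d_out, n);

    // The copy is queued behind both kernels in the same stream.
    std::vector<int> h_out(n);
    cudaMemcpyAsync(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost, stream);
    bool ok = (cudaStreamSynchronize(stream) == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == i + 1);

    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    cudaFree(d_out);
    return ok;
}
```

The cost is that nothing in this stream overlaps with anything else in it, which is why this is only a stopgap.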
Can you express the kernel dependencies with cudaEvent_t objects?
The Streams and Concurrency Webinar shows some quick code snippets on how to use events. Some of the details of that presentation are only applicable to pre-Kepler hardware, but I'm assuming from the original question that you're familiar with how things have changed since Fermi now that there are multiple command queues.
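For what it's worth, a dependency between two streams might be expressed roughly like this. This is only a sketch under assumptions: the kernels `stage_a` and `stage_b` and the helper `run_with_event_dependency` are hypothetical stand-ins for your own code. The event is recorded in the first stream after the producing kernel, and `cudaStreamWaitEvent` makes the second stream wait for it on the device without blocking the host.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernels: stage_b() depends on the output of stage_a().
__global__ void stage_a(int *buf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = 2 * i;
}

__global__ void stage_b(const int *in, int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1;
}

// Runs stage_a in stream s1 and stage_b in stream s2, ordered by an event.
// Returns true if out[i] == 2 * i + 1 for every element.
bool run_with_event_dependency(int n) {
    assert(n > 0);
    const size_t bytes = n * sizeof(int);
    int *d_mid = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_mid, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_mid); return false; }

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Timing is not needed here, so disable it to keep the event cheap.
    cudaEvent_t a_done;
    cudaEventCreateWithFlags(&a_done, cudaEventDisableTiming);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    stage_a<<<blocks, threads, 0, s1>>>(d_mid, n);
    cudaEventRecord(a_done, s1);         // marks the point after stage_a in s1
    cudaStreamWaitEvent(s2, a_done, 0);  // later work in s2 waits for that point
    stage_b<<<blocks, threads, 0, s2>>>(d_mid, d_out, n);

    std::vector<int> h_out(n);
    cudaMemcpyAsync(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost, s2);
    bool ok = (cudaStreamSynchronize(s2) == cudaSuccess);

    for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == 2 * i + 1);

    cudaEventDestroy(a_done);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(d_mid);
    cudaFree(d_out);
    return ok;
}
```

Work in either stream that does not depend on `stage_a` can still be issued before the `cudaStreamWaitEvent` call and remains free to overlap.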