
Episode 171: Thinking Parallel & C++ Forward Progress
Algorithms + Data Structures = Programs
00:00
In-depth Discussion on Reduction Operations in CUDA and CUB
A detailed look at how reductions work in the CUDA backend and in CUB: the nuances of inclusive scan and reduction operations, left-to-right ordering when the operator is non-commutative, how thread blocks are assigned chunks of the input, and how the warp-wide primitive keeps the ordering correct.
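To make the ordering point concrete, here is a minimal sequential C++ sketch, not CUB's actual implementation. It assumes string concatenation as the operator (associative but not commutative) and uses a hypothetical helper name, `chunked_concat`. Each contiguous chunk stands in for the tile of input one thread block would own, and the per-chunk partial results are combined strictly in chunk order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a CUB API): reduce `input` with string
// concatenation, processing it in contiguous chunks of `chunk_size` elements.
std::string chunked_concat(const std::vector<std::string>& input,
                           std::size_t chunk_size) {
    assert(chunk_size > 0);

    // Phase 1: each chunk is reduced left to right into one partial result.
    // On a GPU these chunks could be processed concurrently, one per block.
    std::vector<std::string> partials;
    for (std::size_t begin = 0; begin < input.size(); begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, input.size());
        std::string partial;
        for (std::size_t i = begin; i < end; ++i) {
            partial += input[i];
        }
        partials.push_back(partial);
    }

    // Phase 2: partials are combined in chunk order. Because the operator is
    // associative, regrouping is fine; because it is not commutative,
    // reordering the partials would change the answer.
    std::string result;
    for (const std::string& p : partials) {
        result += p;
    }
    return result;
}
```

Calling `chunked_concat` on `{"a", "b", "c", "d", "e"}` gives the same string for any positive chunk size, which is the property an ordered parallel reduction relies on: the grouping may change, the left-to-right order may not.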