by Miriam Leeser, Northeastern University
Heterogeneous and homogeneous multicore processor architectures (NVIDIA Fermi, AMD Radeon, Intel Sandy Bridge, AMD Fusion) have emerged, demonstrating significant throughput increases in scientific applications over traditional single-core processors. These processing elements vary widely in their processing capabilities, performance, memory systems, development environments, programming languages, and debugging tools. In this rapidly expanding design space, programming for these platforms has become much more complex, error-prone, and architecture-dependent. Designing portable high-performance applications that function properly across widely varied systems has become paramount. This work extends the Tasks and Conduits framework (originally developed at MIT Lincoln Laboratory) to support GPUs and heterogeneous platforms using NVIDIA CUDA and OpenCL. Running an application that performs Monte Carlo simulations of photon propagation, we achieved a 22x speedup by porting the application from a single CPU core to a GPU, changing only 5 source lines of code (SLOC) in addition to the GPU kernel.
Miriam Leeser is a Professor in the Department of Electrical and Computer Engineering at Northeastern University. She is head of the Reconfigurable Logic and GPU Computing Laboratory at Northeastern and director of the Center for Communications and Digital Signal Processing. Her research interests include programming paradigms for manycore computers, computer arithmetic, and applications including signal and image processing.