Programming on Parallel Machines by Norm Matloff
This is a PDF document on the author's web page which can be downloaded here. It is a very good read: it covers the general issues of why we should use parallel programming and the common performance pitfalls, then goes on to demonstrate a number of different frameworks for writing parallel programs. The last section covers common problems such as matrix multiplication and looks at algorithms for solving them in parallel.
By way of a discussion followed by the implementation of a number of algorithms, the author demonstrates OpenMP (a shared-memory framework implemented as an extension to C), GPU programming via CUDA (a cut-down C with no recursion, where groups of threads called warps execute in SIMD fashion and are scheduled by the hardware), Thrust (a framework that allows a higher-level expression of programs which can then map down to OpenMP or CUDA), message-passing systems as implemented in Open MPI, and cloud computing via MapReduce as implemented in Hadoop. I have read whole books on OpenMP before, but here the author picks out just the key concepts, demonstrating each and discussing its benefits and problems.
This was a great read for getting a good feel for the available technologies.