This week one of my work colleagues asked if I had noticed the resurgence of interest in supercompilation. I’d come across partial evaluation in the past… the idea that you can evaluate large chunks of some programs ahead of time, using an evaluator that works on the parts of the program that depend only on fixed data. There were many applications of this technique to Scheme in the past. I’m also subscribed to a number of Haskell mailing lists, where I’ve seen the work of Mitchell mentioned repeatedly.
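As a tiny hand-worked illustration of the idea (my own example, not taken from any of the papers), consider specialising a power function once its exponent is known:

```haskell
-- A general power function: x raised to a non-negative exponent n.
power :: Int -> Int -> Int
power 0 _ = 1
power n x = x * power (n - 1) x

-- If the exponent is fixed data (say 3), a partial evaluator can
-- unfold the recursion ahead of time, leaving a residual program
-- with no recursion and no exponent argument at all:
cube :: Int -> Int
cube x = x * (x * (x * 1))
```

The residual cube is what you would hope a partial evaluator produces automatically from power applied to the static argument 3.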
At the weekend, I therefore downloaded the paper Supercompilation by evaluation from Simon Peyton Jones’ recent papers collection. This paper is a really interesting read. The two authors have developed a very effective supercompiler. Most interesting to me was that they derived it from an initial operational semantics for the language, rather than just taking a load of ad hoc techniques and putting them together. They use memoization to prevent the evaluator from doing lots of repeated work, and go into some detail about the well-quasi-order that they use to ensure that the supercompiler terminates.
Their supercompiler also deals with let-bound recursive functions, something that previous supercompilers have found difficult. The key here is that when the split phase is carried out to focus the compiler’s attention on a sub-term, certain state needs to be passed down the recursive branch. You must be careful: passing too little data into the recursive call loses potential optimisation, while passing too much can lead to invalid optimisations. The paper does a good job of explaining the difficulties.
The authors postulate that their techniques could be applied to call-by-value languages and it will be really interesting to see if this claim is true. In a language like C#, side effects happen more frequently and I wonder if this would remove some of the potential gains of partial evaluation. I’m also interested in a comparison with modern JIT architectures, where the system can find the hot paths and spend effort optimising those paths, in contrast to whole program optimisation techniques.
I also downloaded the paper Seq no more, which discusses some improvements to GHC’s support for the par and pseq functions that were added to enable parallelism.
par x y suggests to the runtime that a spark (a lightweight GHC runtime task) should be created to begin evaluating x, while the computation continues with the evaluation of y, which is the result of the par function.
pseq x y ensures that x is evaluated before the system continues to evaluate y, introducing evaluation order into a pure language.
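The two combinators are easiest to see working together. Here is a minimal sketch, the classic parallel Fibonacci, assuming the Control.Parallel module from the parallel package:

```haskell
import Control.Parallel (par, pseq)

-- x `par` e sparks the evaluation of x and returns e; y `pseq` e
-- forces y to weak head normal form before returning e. Combined,
-- one recursive call is sparked for possible parallel evaluation
-- while the main thread works on the other, and both are forced
-- before the addition.
pfib :: Int -> Int
pfib n
  | n < 2     = n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = pfib (n - 1)
    y = pfib (n - 2)
```

Whether a spark is actually picked up by another processor is entirely up to the runtime; semantically the result is the same as the sequential version.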
The paper discusses some of the implications of the spark pool, in particular whether the elements in the pool should be treated as strong or weak roots by the runtime system, and how the system should deal with fizzled sparks – sparks representing computations that have already been evaluated by someone else. In a call-by-need language these fizzled sparks are an administrative overhead. With some reworking, the resulting system can treat sparks as weak roots, which helps to prevent the processors from becoming overwhelmed by tasks.
The main part of the paper is concerned with an improved definition of strategies. A strategy is a specification of how something should be evaluated with respect to the use of additional sparks, and the paper ends up with something that looks like an embedded DSL for specifying these strategies. A system library can provide a number of specifications that work with many built-in datatypes. For example, there is a strategy for evaluating lists that evaluates the first n elements in parallel while keeping the evaluation of the rest of the elements lazy.
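To give a rough flavour of the DSL, here is a sketch of such a list strategy written against the Eval-monad strategies API from the parallel package (parFirstN is my own illustrative name, not one from the paper or the library):

```haskell
import Control.Parallel.Strategies

-- Spark the evaluation of the first n elements of a list, applying
-- the given element strategy to each, and leave the tail untouched
-- (and therefore lazy).
parFirstN :: Int -> Strategy a -> Strategy [a]
parFirstN _ _     []     = return []
parFirstN 0 _     xs     = return xs
parFirstN n strat (x:xs) = do
  x'  <- rparWith strat x
  xs' <- parFirstN (n - 1) strat xs
  return (x' : xs')
```

It would be used as map expensive inputs `using` parFirstN 4 rseq, sparking the first four results while the rest of the list stays lazy.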
Both papers are full of clever ideas and are well worth a read. Like all Peyton Jones papers, they offer good practical results grounded in solid theory.