How to implement continuations?

https://stackoverflow.com/questions/6512

08-06-2019
|

Question

I'm working on a Scheme interpreter written in C. Currently it uses the C runtime stack as its own stack, which is presenting a minor problem with implementing continuations. My current solution is manual copying of the C stack to the heap then copying it back when needed. Aside from not being standard C, this solution is hardly ideal.

What is the simplest way to implement continuations for Scheme in C?

Solution

I remember reading an article that may be of help to you: Cheney on the M.T.A. :-)

Some implementations of Scheme I know of, such as SISC, allocate their call frames on the heap.

@ollie: You don't need to do the hoisting if all your call frames are on the heap. There's a tradeoff in performance, of course: the time to hoist, versus the overhead required to allocate all frames on the heap. Maybe it should be a tunable runtime parameter in the interpreter. :-P

OTHER TIPS

A good summary is available in Implementation Strategies for First-Class Continuations, an article by Clinger, Hartheimer, and Ost. I recommend looking at Chez Scheme's implementation in particular.

Stack copying isn't that complex and there are a number of well-understood techniques available to improve performance. Using heap-allocated frames is also fairly simple, but you make a tradeoff of creating overhead for "normal" situation where you aren't using explicit continuations.

If you convert input code to continuation passing style (CPS) then you can get away with eliminating the stack altogether. However, while CPS is elegant it adds another processing step in the front end and requires additional optimization to overcome certain performance implications.

If you are starting from scratch, you really should look in to Continuation Passing Style (CPS) transformation.

Good sources include "LISP in small pieces" and Marc Feeley's Scheme in 90 minutes presentation.

It seems Dybvig's thesis is unmentioned so far. It is a delight to read. The heap based model is the easiest to implement, but the stack based is more efficient. Ignore the string based model.

R. Kent Dybvig. "Three Implementation Models for Scheme". http://www.cs.indiana.edu/~dyb/papers/3imp.pdf

Also check out the implementation papers on ReadScheme.org. http://library.readscheme.org/page8.html

The abstract is as follows:

This dissertation presents three implementation models for the Scheme Program- ming Language. The rst is a heap-based model used in some form in most Scheme implementations to date; the second is a new stack-based model that is considerably more ecient than the heap-based model at executing most programs; and the third is a new string-based model intended for use in a multiple-processor implementation of Scheme.

The heap-based model allocates several important data structures in a heap, including actual parameter lists, binding environments, and call frames.

The stack-based model allocates these same structures on a stack whenever possible. This results in less heap allocation, fewer memory references, shorter instruction sequences, less garbage collection, and more ecient use of memory.

The string-based model allocates versions of these structures right in the program text, which is represented as a string of symbols. In the string-based model, Scheme programs are translated into an FFP language designed specically to support Scheme. Programs in this language are directly executed by the FFP machine, a multiple-processor string-reduction computer.

The stack-based model is of immediate practical benet; it is the model used by the author's Chez Scheme system, a high-performance implementation of Scheme. The string-based model will be useful for providing Scheme as a high-level alternative to FFP on the FFP machine once the machine is realized.

Besides the nice answers you've got so far, I recommend Andrew Appel's Compiling with Continuations. It's very well written and while not dealing directly with C, it is a source of really nice ideas for compiler writers.

The Chicken Wiki also has pages that you'll find very interesting, such as internal structure and compilation process (where CPS is explained with an actual example of compilation).

Examples that you can look at are: Chicken (a Scheme implementation, written in C that support continuations); Paul Graham's On Lisp - where he creates a CPS transformer to implement a subset of continuations in Common Lisp; and Weblocks - a continuation based web framework, which also implements a limited form of continuations in Common Lisp.

Continuations aren't the problem: you can implement those with regular higher-order functions using CPS. The issue with naive stack allocation is that tail calls are never optimised, which means you can't be scheme.

The best current approach to mapping scheme's spaghetti stack onto the stack is using trampolines: essentially extra infrastructure to handle non-C-like calls and exits from procedures. See Trampolined Style (ps).

There's some code illustrating both of these ideas.

The traditional way is to use setjmp and longjmp, though there are caveats.

Here's a reasonably good explanation

Continuations basically consist of the saved state of the stack and CPU registers at the point of context switches. At the very least you don't have to copy the entire stack to the heap when switching, you could only redirect the stack pointer.

Continuations are trivially implemented using fibers. http://en.wikipedia.org/wiki/Fiber_%28computer_science%29 . The only things that need careful encapsulation are parameter passing and return values.

In Windows fibers are done using the CreateFiber/SwitchToFiber family of calls. in Posix-compliant systems it can be done with makecontext/swapcontext.

boost::coroutine has a working implementation of coroutines for C++ that can serve as a reference point for implementation.

Use an explicit stack instead.

Patrick is correct, the only way you can really do this is to use an explicit stack in your interpreter, and hoist the appropriate segment of stack into the heap when you need to convert to a continuation.

This is basically the same as what is needed to support closures in languages that support them (closures and continuations being somewhat related).

As soegaard pointed out, the main reference remains this one

The idea is, a continuation is a closure that keeps its evaluation control stack. The control stack is required in order to continue the evalution from the moment the continuation was created using call/cc.

Oftenly invoking the continuation makes long time of execution and fills the memory with duplicated stacks. I wrote this stupid code to prove that, in mit-scheme it makes the scheme crash,

The code sums the first 1000 numbers 1+2+3+...+1000.

(call-with-current-continuation 
 (lambda (break)
   ((lambda (s) (s s 1000 break))
    (lambda (s n cc)
      (if (= 0 n)
          (cc 0)
          (+ n
             ;; non-tail-recursive,
             ;; the stack grows at each recursive call
             (call-with-current-continuation
              (lambda (__)
                (s s (- n 1) __)))))))))

If you switch from 1000 to 100 000 the code will spend 2 seconds, and if you grow the input number it will crash.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow