Question

;; compute the max of a list of integers

(define Y
  (lambda (w)
    ((lambda (f)
       (f f))
     (lambda (f)
       (w (lambda (x)
            ((f f) x)))))))

((Y
  (lambda (max)
    (lambda (l)
      (cond ((null? l) -1)
            ((> (car l) (max (cdr l))) (car l))
            (else (max (cdr l)))))))
 '(1 2 3 4 5))

I wish to understand this construction. Can somebody give a clear and simple explanation for this code?

For example, supposing that I forget the formula of Y. How can I remember it , and reproduce it long after I work with it ?

Was it helpful?

Solution

Here's some related answers (by me):

Basically, with Y defined as λr.(λh.h h) (λg.r (λx.(g g) x)), an application Y r reduces as

Y r
(λw.(λh.h h) (λg.w (λx.(g g) x))) r
(λh.h h) (λg.r (λx.(g g) x))
h h
    ;where
        h = (λg.r (λx.(g g) x))       <----\
                                           |
(λg.r (λx.(g g) x)) h                      |
r (λx.(g g) x)             <-------------- | ----------\
    ;where                                 |           |
        g = h                         -----/           |
        ;so that                                       |
        (g g) = (h h) = r (λx.(g g) x)           ------/

So r must expect two arguments - first representing the recursive function to be called, and second - an actual argument:

        r = λf (λx. ....x.....(f y)...... )

so that (Y r) x reduces as

(r (λx.(g g) x)) x
(r f) x
    ;where
        f   = (λx.(g g) x) 
        f y = (λx.(g g) x) y = (g g) y = (r f) y  ; f is "fixed point" of r

The definiton f = (λx.(g g) x) means, when f y is called, (g g) y will be called, at which point g will be self-applied, r "pulled" from inside g and the result of (r f) called with y argument. I.e. any call (f y) in the body of lambda expression resulting from (r f) application, is translated back to (r f) y i.e. invocation of same body with a new argument y.

The important implementational detail is whether it is the same function body, or its copy, but the semantics are the same - we are able to enter the same function body with a new argument value.

The essence of Y combinator is replication through reference and self-application: we refer to the same thing through same name, twice; and thus we arrange for it to receive itself as an argument.

When there's no referencing, as in pure lambda calculus, and parameters receive textual copies of arguments - i.e. reduction is done by textual rewriting - this still works, because same copies get replicated and passed around, being fed as argument to self so it is available on the next iteration, if need be.

But it is much more efficient when shared referencing is available (all uses of same name refer to same thing). Under environment model of evaluation creation of self-referential function is simple as

(let ((fact #f)) 
  (set! fact 
        (lambda (n) (if (< 2 n) 1 
                               (* n (fact (- n 1)))))) 
  fact)

In fact the definition in your answer is that of applicative-order Y combinator. With normal-order, eta-reduction can be applied without causing infinite looping, to get Ynorm = (λw.(λh.h h) (λg.w (g g))) which is canonically written as

Ynorm = (λf.(λx.f (x x)) (λx.f (x x)))

indeed

Ynorm g
= (λx.g (x x)) (λx.g (x x))
= g ((λx.g (x x)) (λx.g (x x)))

OTHER TIPS

The best explanation I've found so far it's in the book "The Little Schemer", chapter 9. The whole chapter explains step-by-step how the Y-Combinator works, and how to derive the combinator starting from an arbitrary recursive procedure.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top