Question

The function type A -> B is, in some sense, not very good. Although functions are first-class values, one often cannot manipulate them freely because of efficiency problems: you cannot apply too many transformations of type (A -> B) -> (C -> D) before, at some point, you have to compute a value.

Obviously this is due to the non-strict nature of ->.

There are well-known tricks to deal with functions of type Double -> Double. One can represent them as vectors relative to a given basis, which can consist of trigonometric functions, polynomials, etc. A sketch of this idea follows.
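For concreteness, here is a minimal sketch of that idea, assuming the simplest case of polynomials in the monomial basis (the names Poly, evalPoly, and addPoly are illustrative, not any library's API): a function Double -> Double is represented by its coefficient list, so operations on functions become operations on data.

    -- Illustrative sketch: represent a polynomial Double -> Double by its
    -- coefficients [a0, a1, a2, ...] in the monomial basis.
    newtype Poly = Poly [Double]

    -- Interpret the representation back into an actual function (Horner's rule).
    evalPoly :: Poly -> Double -> Double
    evalPoly (Poly cs) x = foldr (\c acc -> c + x * acc) 0 cs

    -- Pointwise addition of two functions is just addition of coefficient
    -- vectors; no closure is ever built.
    addPoly :: Poly -> Poly -> Poly
    addPoly (Poly as) (Poly bs) = Poly (go as bs)
      where
        go (a:as') (b:bs') = a + b : go as' bs'
        go as'     []      = as'
        go []      bs'     = bs'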

Are there any general tricks to get round the inefficiency of the A -> B type?

Or alternatives to -> ?


Solution

Your concern seems to be that, given h = f . g, f . g is typically less efficient than h. Given a composition of functions known at compile time, there are two tricks the compiler performs that can make f . g more efficient than you might suspect -- inlining, and fusion. Inlining avoids the extra indirection of a second function call and opens up many new opportunities for optimization. Stream fusion (or build/foldr fusion) can, to give a basic example, turn compositions such as map f . map g into map (f . g), thereby reducing the number of traversals of a structure by a constant factor. Fusion operates not only on lists but on other structures as well, and is one reason for the efficient performance of Haskell libraries such as Vector.
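To make the map f . map g example concrete, here is a minimal, self-contained sketch of a GHC rewrite rule in the map/map style. GHC's base library already ships an equivalent rule, so this stand-alone version is purely illustrative, and rules only fire with optimisation enabled (-O):

    {-# RULES
    "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
      #-}

    -- With the rule in scope (and -O), this pipeline traverses its input
    -- once instead of building an intermediate list:
    pipeline :: [Int] -> [Int]
    pipeline = map (+ 1) . map (* 2)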

Short cut fusion: http://www.haskell.org/haskellwiki/Correctness_of_short_cut_fusion

Stream fusion: see the Stack Overflow question "What is Haskell's Stream Fusion"

OTHER TIPS

I cannot confirm this. As a heavy user and implementor of AFRP, I perform transformations on fully polymorphic functions a lot, deeply nested and in long-running applications. Note that Haskell compilers do not use the traditional stack-based function-calling paradigm; they use graph reduction. We don't have the same problems as, say, C.

One of the most general tricks is memoization: storing the result of a function after you have computed it. Links: HaskellWiki, SO example, MemoCombinators. As you mentioned, the other trick applies when you have a function of a nice form (polynomial, vector, Taylor series, etc.): it can then be represented as a list, an expression, and so on. A sketch of memoization follows.
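Here is a minimal sketch of the memoization trick, in the spirit of the HaskellWiki example (memoFib is an illustrative name): back the function with a lazily built list, so each result is computed at most once.

    -- Fibonacci memoized through a lazy list: because 'memoFib' takes no
    -- explicit argument, 'fibs' is shared between calls, so every computed
    -- entry is cached.
    memoFib :: Int -> Integer
    memoFib = (fibs !!)
      where
        fibs = map fib [0 ..]
        fib 0 = 0
        fib 1 = 1
        fib n = memoFib (n - 1) + memoFib (n - 2)

List indexing is linear, so for serious use the MemoCombinators package mentioned above offers table structures with better lookup behaviour.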

FWIW: in Felix, whose compiler is a whole-program analyser relying heavily on inlining for performance, function arguments have three kinds: eager, lazy, or "let the compiler decide".

Eager arguments are evaluated and assigned to a variable before the body of the function is evaluated.

Lazy arguments are evaluated by replacing the parameter with the argument expression wherever it occurs.

The default is "let the compiler decide". For a large amount of "ordinary" code (whatever that means) it doesn't make any difference whether you use eager or lazy evaluation.
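Felix itself aside, the eager/lazy distinction can be sketched in Haskell with bang patterns; this is only a rough analogue, since Felix realises the lazy kind by substitution rather than by thunks:

    {-# LANGUAGE BangPatterns #-}

    -- "Lazy" kind: x is evaluated only if the body actually demands it.
    lazyArg :: Bool -> Int -> Int
    lazyArg cond x = if cond then x + 1 else 0

    -- "Eager" kind: the bang forces x to a value before the body runs.
    eagerArg :: Bool -> Int -> Int
    eagerArg cond !x = if cond then x + 1 else 0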

Generally, in Felix lazy evaluation is faster: note carefully that this does NOT mean closures, it means inlining. However, sometimes the compiler will choose eager evaluation; it reduces code bloat, and too much inlining is counterproductive. I make no claim that the algorithm is any good ... however, Felix can sometimes outperform C and OCaml (GHC didn't make it into the finals).

As a simple example: type classes. Felix has type classes, sort of like Haskell, with little or no performance overhead ... certainly no dictionaries!
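For comparison, GHC can also eliminate dictionary passing, though per function rather than whole-program; here is a minimal sketch using the SPECIALISE pragma (genericSum is an illustrative name):

    -- Without specialisation, 'genericSum' receives a Num dictionary at
    -- run time. The pragma asks GHC (at -O) to emit a dictionary-free copy
    -- for Int and to rewrite calls at that type to use it.
    genericSum :: Num a => [a] -> a
    genericSum = foldr (+) 0
    {-# SPECIALISE genericSum :: [Int] -> Int #-}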

In my view, Haskell would be a lot better if you just chucked out the archaic concept of separate compilation: whole-program analysers can do so much more, and text is much faster to work with than object code (given complete freedom to cache compilation results). It's crazy to have a lazy language use a compilation model designed for eager evaluation!

The other thing a Haskell variant might try is to drop the idea that all functions are lazy, and instead adopt the idea that the evaluation strategy is unspecified unless otherwise stated. That may open up a lot more optimisation opportunities.

Licensed under: CC-BY-SA with attribution