Question

Say I have two pure but unsafe functions that do the same thing, but one of them works on batches and is asymptotically faster:

f :: Int -> Result -- takes O(1) time
f = unsafePerformIO ...
g :: [Int] -> [Result] -- takes O(log n) time
g = unsafePerformIO ...

A naive implementation:

getUntil :: Int -> [Result]
getUntil 0 = [f 0]
getUntil n = f n : getUntil (n - 1)

switch is the n value where g gets cheaper than f.

In practice, getUntil will be called with ever-increasing n, but it might not start at 0. Since the Haskell runtime can memoize getUntil, performance will be fine as long as each call is at most switch steps ahead of the previous one. But once the gap between calls gets larger than that, this implementation is slow.

In an imperative program, I guess I would make a TreeMap (which can be checked for gaps quickly) to cache all calls. On a cache miss it would be filled with the results of g if the missing gap were longer than switch, and with the results of f otherwise.

How can this be optimized in Haskell?

I think I am just looking for:

  • an ordered map, filled on demand by a fill function that computes all missing values up to the requested index, using one function if the missing range is small and another if it is large
  • a get operation on the map that returns a list of all values up to the requested index; this would give a function similar to getUntil above (see the sketch below)
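
A rough sketch of that idea in Haskell (names are mine; f, g and switch below are dummy stand-ins, and the cache is assumed to always hold a contiguous prefix of indices, which matches the ever-increasing-n usage described above):

import qualified Data.Map.Strict as Map
import Data.IORef

switch :: Int
switch = 1000

f :: Int -> Int        -- per-element computation (placeholder)
f x = x * 2

g :: [Int] -> [Int]    -- batched computation, same results as mapping f (placeholder)
g = map (* 2)

-- The cache: a Map behind an IORef, holding the contiguous prefix computed so far.
newCache :: IO (IORef (Map.Map Int Int))
newCache = newIORef Map.empty

-- Fill any missing prefix (batched via g for large gaps, element by element via f
-- for small ones) and return all values up to and including index n.
getUntil :: IORef (Map.Map Int Int) -> Int -> IO [Int]
getUntil ref n = do
    cache <- readIORef ref
    let cached  = Map.size cache             -- indices [0 .. cached - 1] are present
        missing = n + 1 - cached
        newVals
          | missing <= 0     = []
          | missing > switch = g [cached .. n]       -- large gap: one batch call
          | otherwise        = map f [cached .. n]   -- small gap: element by element
        cache' = cache `Map.union` Map.fromList (zip [cached ..] newVals)
    writeIORef ref cache'
    return [cache' Map.! i | i <- [0 .. n]]

Whether this lives in IO or is hidden behind unsafePerformIO (as f and g already are) is a separate decision.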

Solution

I'll elaborate on my proposal of using map, after some tests I just ran.

import System.IO
import System.IO.Unsafe
import Control.Concurrent
import Control.Monad

switch :: Int
switch = 1000

-- Simulated expensive computations: the threadDelay stands in for real work,
-- and the putStrLn makes it visible whenever a value is computed from scratch.
f :: Int -> Int
f x = unsafePerformIO $ do
    threadDelay $ 500 * x
    putStrLn $ "Calculated from scratch: f(" ++ show x ++ ")"
    return $ 500 * x

g :: Int -> Int
g x = unsafePerformIO $ do
    threadDelay $ x * x `div` 2
    putStrLn $ "Calculated from scratch: g(" ++ show x ++ ")"
    return $ x * x `div` 2

-- A top-level lazy list acting as the cache: element i is computed at most once,
-- using g up to the switch point and f beyond it.
cachedFG :: [Int]
cachedFG = map g [0 .. switch] ++ map f [switch + 1 ..]

main :: IO ()
main = forever $ getLine >>= print . (cachedFG !!) . read

… where f, g and switch have the same meaning as in the question.

The above program can be compiled as is using GHC. When executed, you can enter non-negative integers followed by a newline; the application prints a value computed from the number you entered, along with messages indicating which values are being calculated from scratch.

A short session with this program is:

User:     10000 
Program:  Calculated from scratch: f(10000)
Program:  5000000
User:     10001
Program:  Calculated from scratch: f(10001)
Program:  5000500
User:     10000
Program:  5000000
^C

The program has to be killed/terminated manually.

Notice that the last value entered doesn't show a "Calculated from scratch" message. This indicates that the program has that value cached/memoized somewhere. You can try executing this program yourself, but take into account that the threadDelay grows with the value entered.
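
The caching comes from cachedFG being a top-level lazy list: each element is a thunk that is evaluated at most once and then shared between later lookups. The same idiom memoizes any function over an initial segment of the naturals; a minimal illustration (the names here are mine):

slow :: Int -> Integer
slow n = product [1 .. toInteger n]   -- stands in for an expensive computation

memoSlow :: Int -> Integer
memoSlow = (map slow [0 ..] !!)       -- backed by a lazily filled, shared list

The point-free definition is what keeps the list shared; depending on optimizations, an eta-expanded memoSlow n = map slow [0 ..] !! n may rebuild the list on every call.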

The getUntil function then could be implemented using:

getUntil :: Int -> [Int]
getUntil n = take (n + 1) cachedFG   -- n + 1 so that the value at index n itself is included

or:

getUntil :: Int -> [Int]
getUntil = flip take cachedFG . succ

If you don't know the value of switch, you can try evaluating f and g in parallel and using the fastest result, but that's another story.
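
For what it's worth, here is a sketch of that race using race from the async package; fIO and gIO are hypothetical IO versions of f and g, with threadDelay standing in for the real work:

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Hypothetical IO versions of f and g (the delays just simulate work).
fIO :: Int -> IO Int
fIO x = threadDelay (500 * x) >> return (500 * x)

gIO :: Int -> IO Int
gIO x = threadDelay (x * x `div` 2) >> return (x * x `div` 2)

-- Run both and keep whichever finishes first; race cancels the loser.
fastest :: Int -> IO Int
fastest x = either id id <$> race (fIO x) (gIO x)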

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow