How to generate random, typed functions

Question 1

DarkOtter's comment mentions QuickCheck's Arbitrary and CoArbitrary classes, which are certainly the first thing you should try. QuickCheck has this instance:

instance (CoArbitrary a, Arbitrary b) => Arbitrary (a -> b) where ...

As it happens, I was just yesterday reading the QuickCheck code to understand how this works, so I can just share what I learned while it's fresh in my mind. QuickCheck is built around a type that looks like this (and this won't be exactly the same):

type Size = Int

-- | A generator for random values of type @a@.
newtype Gen a = 
    MkGen { -- | Generate a random @a@ using the given randomness source and
            -- size. 
            unGen :: StdGen -> Size -> a 
          }

class Arbitrary a where
    arbitrary :: a -> Gen a

The first trick is that QuickCheck has a function that works like this (and I didn't work out exactly how it's implemented):

-- | Use the given 'Int' to \"perturb\" the generator, i.e., to make a new
-- generator that produces different pseudorandom results than the original.
variant :: Int -> Gen a -> Gen a

Then they use this to implement various instances of this CoArbitrary class:

class CoArbitrary a where
    -- | Use the given `a` to perturb some generator.
    coarbitrary :: a -> Gen b -> Gen b

-- Example instance: we just treat each 'Bool' value as an 'Int' to perturb with.
instance CoArbitrary Bool where
    coarbitrary False = variant 0
    coarbitrary True = variant 1

Now with these pieces in place, we want this:

instance (Coarbitrary a, Arbitrary b) => Arbitrary (a -> b) where
    arbitrary = ...

I won't write out the implementation, but the idea is this:

Using the CoArbitrary instance of a and the Arbitrary instance of b we can make the function \a -> coarbitrary a arbitrary, which has type a -> Gen b.
Remember that Gen b is a newtype for StdGen -> Size -> b, so the type a -> Gen b is isomorphic to a -> StdGen -> Size -> b.
We can trivially write a function that takes any function of that latter type and switches the argument order around to return a function of type StdGen -> Size -> a -> b.
This rearranged type is isomorphic to Gen (a -> b), so voilà, we pack the rearranged function into a Gen, and we got our random function generator!

I would recommend that you read the source of QuickCheck to see this for yourself. When you tackle that, you're only going to run into two extra details that might slow you down. First, the Haskell RandomGen class has this method:

-- | The split operation allows one to obtain two distinct random generators.
split :: RandomGen g => g -> (g, g)

This operation is used in the Monad instance for Gen, and is rather important. One of the tricks here is that the StdGen is a pure pseudo random number generator; the way Gen (a -> b) works is that for each possible value of a we perturb a b generator, use that perturbed generator to generate the b result, but then we never advance the perturbed generator's state; basically the generated a -> b function is a closure over a pseudo-random seed, and each time we call it with some a we use that specific a to deterministically create a new seed, and then use that to deterministically generate a b that depends on a and the hidden seed.

The abbreviated type Seed -> a -> b more or less sums up what's going on—a pseudo-random function is a rule for generating a b from a pseudo-random seed and an a. This won't work with imperative-style stateful random number generators.

Second: instead of directly having a (a -> StdGen -> Size -> b) -> StdGen -> Size -> a -> b function as I describe above, the QuickCheck code has promote :: Monad m => m (Gen a) -> Gen (m a), which is the generalization of that to any Monad. When m is the function instance of Monad, promote coincides with (a -> Gen b) -> Gen (a -> b), so it's really the same as I sketch above.

Question 2

Thanks for the very thorough answers above! None of the responses, quite did what I was looking for though. I followed up on DarkOtter's suggestion in the comment the question, and used unsafeCoerce avoid the type checker. The basic idea is that we create a GADT that packages up Haskell functions with their types; the type system I use follows pretty closely Mark P. Jones' "Typing Haskell in Haskell." Whenever I want a collection of Haskell functions, I first coerce them into Any types, then I do what I need to do, stitching them together randomly. When I go to evaluate the new functions, first I coerce them back to the type I wanted. Of course, this isn't safe; if my type checker is wrong or I annotate the haskell functions with incorrect types, then I end up with nonsense.

I've pasted the code I tested this with below. Note that there are two local modules being imported Strappy.Type and Strappy.Utils. The first is the type system mentioned above. The second brings in helpers for the stochastic programs.

Note: in the code below I'm using the combinatory logic as the basic language. That's why my expression language only has application and no variables or lambda abstraction.

{-# Language GADTs,  ScopedTypeVariables   #-}

import Prelude hiding (flip)
import qualified  Data.List as List
import Unsafe.Coerce (unsafeCoerce) 
import GHC.Prim
import Control.Monad
import Control.Monad.State
import Control.Monad.Trans
import Control.Monad.Identity
import Control.Monad.Random

import Strappy.Type
import Strappy.Utils (flip)


-- | Helper for turning a Haskell type to Any. 
mkAny :: a -> Any
mkAny x = unsafeCoerce x 


-- | Main data type. Holds primitive functions (Term), their
-- application (App) and annotations.
data Expr a where
    Term :: {eName  :: String, 
             eType  :: Type, 
             eThing :: a} -> Expr a
    App  :: {eLeft  :: (Expr (b -> a)),
             eRight :: (Expr b),
             eType  :: Type}         ->  Expr a 

-- | smart constructor for applications
a <> b = App a b (fst . runIdentity . runTI $ typeOfApp a b)

instance Show (Expr a)   where
    show Term{eName=s} = s
    show App{eLeft=el, eRight=er} = "(" ++ show el ++ " " ++  show er ++ ")"



-- | Return the resulting type of an application. Run's type
-- unification.
typeOfApp :: Monad m => Expr a -> Expr b -> TypeInference  m Type
typeOfApp e_left e_right 
    = do t <- newTVar Star 
         case mgu (eType e_left) (eType e_right ->- t) of 
           (Just sub) -> return $ toType (apply sub (eType e_left))
           Nothing -> error $ "typeOfApp: cannot unify " ++
                      show e_left ++ ":: " ++ show (eType e_left) 
                               ++ " with " ++ 
                      show e_right ++ ":: " ++ show (eType e_right ->- t) 

eval :: Expr a -> a
eval Term{eThing=f} = f
eval App{eLeft=el, eRight=er} = (eval el) (eval er)

filterExprsByType :: [Any] -> Type -> TypeInference [] Any
filterExprsByType (e:es) t  
    = do et <- freshInst (eType (unsafeCoerce e :: Expr a))
         let e' = unsafeCoerce e :: Expr a
         case mgu et t of
           Just sub -> do let eOut = unsafeCoerce e'{eType = apply sub et} :: Any
                          return eOut `mplus` rest
           Nothing -> rest
      where rest = filterExprsByType es t
filterExprsByType [] t = lift []


----------------------------------------------------------------------
-- Library of functions

data Library = Library { probOfApp :: Double, -- ^ probability of an expansion
                         libFunctions :: [Any] }

cInt2Expr :: Int -> Expr Int
-- | Convert numbers to expressions. 
cInt2Expr i = Term (show i) tInt i 


--  Some basic library entires. 
t = mkTVar 0                  
t1 = mkTVar 1                  
t2 = mkTVar 2                  
t3 = mkTVar 3                  

cI = Term "I" (t ->- t) id
cS = Term "S" (((t2 ->- t1 ->- t) ->- (t2 ->- t1) ->- t2 ->- t)) $ \f g x -> (f x) (g x)
cB = Term "B" ((t1 ->- t) ->- (t2 ->- t1) ->- t2 ->- t) $ \f g x -> f (g x)
cC = Term "C" ((t2 ->- t1 ->- t2 ->- t) ->- t1 ->- t2 ->- t) $ \f g x -> (f x) g x
cTimes :: Expr (Int -> Int -> Int)
cTimes = Term "*" (tInt ->- tInt ->- tInt) (*)
cPlus :: Expr (Int -> Int -> Int)
cPlus = Term "+" (tInt ->- tInt ->- tInt) (+)
cCons = Term ":"  (t ->- TAp tList t ->- TAp tList t)  (:)
cAppend = Term "++" (TAp tList t ->- TAp tList t ->- TAp tList t) (++)
cHead = Term "head" (TAp tList t ->- t) head
cMap = Term "map" ((t ->- t1) ->- TAp tList t ->- TAp tList t1) map
cEmpty = Term "[]" (TAp tList t) []
cSingle = Term "single" (t ->- TAp tList t) $ \x -> [x]
cRep = Term "rep" (tInt ->- t ->- TAp tList t) $ \n x -> take n (repeat x)
cFoldl = Term "foldl" ((t ->- t1 ->- t) ->- t ->- (TAp tList t1) ->- t) $ List.foldl'
cNums =  [cInt2Expr i | i <- [1..10]]

--  A basic library

exprs :: [Any]
exprs = [mkAny cI, 
         mkAny cS, 
         mkAny cB, 
         mkAny cC, 
         mkAny cTimes, 
         mkAny cCons, 
         mkAny cEmpty,
         mkAny cAppend,
--         mkAny cHead,
         mkAny cMap,
         mkAny cFoldl,
         mkAny cSingle,
         mkAny cRep
        ] 
        ++ map mkAny cNums

library = Library 0.3 exprs


-- | Initializing a TypeInference monad with a Library. We need to
-- grab all type variables in the library and make sure that the type
-- variable counter in the state of the TypeInference monad is greater
-- that that counter.
initializeTI :: Monad m => Library -> TypeInference m ()
initializeTI Library{libFunctions=es} = do put (i + 1)
                                           return ()
    where go n (expr:rest) = let tvs = getTVars (unsafeCoerce expr :: Expr a)
                                 getTVars expr = tv . eType $ expr
                                 m = maximum $ map (readId . tyVarId) tvs 
                             in if null tvs then 0 else go (max n m) rest
          go n [] = n
          i = go 0 es


----------------------------------------------------------------------
----------------------------------------------------------------------
-- Main functions. 
sampleFromExprs :: (MonadPlus m, MonadRandom m) =>
                   Library -> Type -> TypeInference  m (Expr a)
-- | Samples a combinator of type t from a stochastic grammar G. 
sampleFromExprs lib@Library{probOfApp=prApp, libFunctions=exprs} tp 
    = do initializeTI lib
         tp' <- freshInst tp
         sample tp'
    where sample tp = do
            shouldExpand <- flip prApp
            case shouldExpand of
              True -> do t <- newTVar Star
                         (e_left :: Expr (b -> a))  <- unsafeCoerce $ sample (t ->- tp)
                         (e_right :: Expr b) <- unsafeCoerce $ sample (fromType (eType e_left))
                         return $ e_left <> e_right -- return application
              False -> do let cs = map fst . runTI $ filterExprsByType exprs tp
                          guard (not . null $ cs) 
                          i <- getRandomR (0, length cs - 1)
                          return $ unsafeCoerce (cs !! i) 

----------------------------------------------------------------------
----------------------------------------------------------------------

main = replicateM 100 $ 
       do let out =  runTI $ do sampleFromExprs library (TAp tList tInt) 
          x <- catch (liftM (Just . fst)  out)
                     (\_ -> putStrLn "error" >> return Nothing)                       
          case x of 
            Just y  -> putStrLn $ show x ++ " " ++ show (unsafeCoerce (eval y) :: [Int])
            Nothing  -> putStrLn ""

Question 3

Would something along these lines meet your needs?

import Control.Monad.Random

randomFunction :: (RandomGen r, Random a, Num a, Floating a) => Rand r (a -> a)
randomFunction = do
  (a:b:c:d:_) <- getRandoms
  fromList [(\x -> a + b*x, 1), (\x -> a - c*x, 1), (\x -> sin (a*x), 1)]
    -- Add more functions as needed

main = do
  let f = evalRand randomFunction (mkStdGen 1) :: Double -> Double
  putStrLn . show $ f 7.3

EDIT: Building on that idea, we could incorporate functions that have different numbers and types of parameters... as long as we partially apply them so that they all have the same result type.

import Control.Monad.Random

type Value = (Int, Double, String) -- add more as needed

type Function = Value -> String -- or whatever the result type is

f1 :: Int -> Int -> (Int, a, b) -> Int
f1 a b (x, _, _) = a*x + b

f2 :: String -> (a, b, String) -> String
f2 s (_, _, t) = s ++ t

f3 :: Double -> (a, Double, b) -> Double
f3 a (_, x, _) = sin (a*x)

randomFunction :: RandomGen r => Rand r Function
randomFunction = do
  (a:b:c:d:_) <- getRandoms -- some integers
  (w:x:y:z:_) <- getRandoms -- some floats
  n <- getRandomR (0,100)
  cs <- getRandoms -- some characters
  let s = take n cs 
  fromList [(show . f1 a b, 1), (show . f2 s, 1), (show . f3 w, 1)]
    -- Add more functions as needed

main = do
  f <- evalRandIO randomFunction :: IO Function
  g <- evalRandIO randomFunction :: IO Function
  h <- evalRandIO randomFunction :: IO Function
  putStrLn . show $ f (3, 7.3, "hello")
  putStrLn . show $ g (3, 7.3, "hello")
  putStrLn . show $ h (3, 7.3, "hello")