How to properly use GHC's SPECIALIZE pragma? (Example: specializing pure function from monadic ones using Identity.)

StackOverflow https://stackoverflow.com/questions/12992481

Question

As an example, suppose I want to write a monadic and non-monadic map over lists. I'll start with the monadic one:

import Control.Monad
import Control.Monad.Identity

mapM' :: (Monad m) => (a -> m b) -> ([a] -> m [b])
mapM' _ [] = return []
mapM' f (x:xs) = liftM2 (:) (f x) (mapM f xs)

Now I want to reuse the code to write the pure map (instead of repeating the code):

map' :: (a -> b) -> ([a] -> [b])
map' f = runIdentity . mapM' (Identity . f)

What is necessary to make map' as optimized as if it were written explicitly like map is? In particular:

  1. Is it necessary to write

    {-# SPECIALIZE mapM' :: (a -> Identity b) -> ([a] -> Identity [b]) #-}
    

    or does GHC optimize map' itself (by factoring out Identity completely)?

  2. Anything else (more pragmas) need to be added?

  3. How can I verify how well the compiled map' is optimized wrt the explicitly written code for map?
Was it helpful?

Solution

Well, let us ask the compiler itself.

Compiling the module

module PMap where

import Control.Monad
import Control.Monad.Identity

mapM' :: (Monad m) => (a -> m b) -> ([a] -> m [b])
mapM' _ [] = return []
mapM' f (x:xs) = liftM2 (:) (f x) (mapM f xs)

map' :: (a -> b) -> ([a] -> [b])
map' f = runIdentity . mapM' (Identity . f)

with ghc -O2 -ddump-simpl -ddump-to-file PMap.hs (ghc-7.6.1, 7.4.2 produces the same except for unique names) produces the following core for map'

PMap.map'
  :: forall a_afB b_afC. (a_afB -> b_afC) -> [a_afB] -> [b_afC]
[GblId,
 Arity=2,
 Caf=NoCafRefs,
 Str=DmdType LS,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=IF_ARGS [60 30] 160 40}]
PMap.map' =
  \ (@ a_c) (@ b_d) (f_afK :: a_c -> b_d) (eta_B1 :: [a_c]) ->
    case eta_B1 of _ {
      [] -> GHC.Types.[] @ b_d;
      : x_afH xs_afI ->
        GHC.Types.:
          @ b_d
          (f_afK x_afH)
          (letrec {
             go_ahZ [Occ=LoopBreaker]
               :: [a_c] -> Data.Functor.Identity.Identity [b_d]
             [LclId, Arity=1, Str=DmdType S]
             go_ahZ =
               \ (ds_ai0 :: [a_c]) ->
                 case ds_ai0 of _ {
                   [] ->
                     (GHC.Types.[] @ b_d)
                     `cast` (Sym <(Data.Functor.Identity.NTCo:Identity <[b_d]>)>
                             :: [b_d] ~# Data.Functor.Identity.Identity [b_d]);
                   : y_ai5 ys_ai6 ->
                     (GHC.Types.:
                        @ b_d
                        (f_afK y_ai5)
                        ((go_ahZ ys_ai6)
                         `cast` (<Data.Functor.Identity.NTCo:Identity <[b_d]>>
                                 :: Data.Functor.Identity.Identity [b_d] ~# [b_d])))
                     `cast` (Sym <(Data.Functor.Identity.NTCo:Identity <[b_d]>)>
                             :: [b_d] ~# Data.Functor.Identity.Identity [b_d])
                 }; } in
           (go_ahZ xs_afI)
           `cast` (<Data.Functor.Identity.NTCo:Identity <[b_d]>>
                   :: Data.Functor.Identity.Identity [b_d] ~# [b_d]))
    }

Yup, only casts, no real overhead. You get a local worker go that acts exactly as map does.

Summing up: You only need -O2, and you can verify how well optimised the code is by looking at the core (-ddump-simpl) or, if you can read it, at the produced assembly (-ddump-asm) resp LLVM bit code -ddump-llvm).

It is probably good to elaborate a bit. Concerning

Is it necessary to write

{-# SPECIALIZE mapM' :: (a -> Identity b) -> ([a] -> Identity [b]) #-}

or does GHC optimize map' itself (by factoring out Identity completely)?

the answer is that if you use the specialisation in the same module as the general function is defined, then in general you don't need a {-# SPECIALISE #-} pragma, GHC creates the specialisation on its own if it sees any benefit in that. In the above module, GHC created the specialisation rule

"SPEC PMap.mapM' [Data.Functor.Identity.Identity]" [ALWAYS]
    forall (@ a_abG)
           (@ b_abH)
           ($dMonad_sdL :: GHC.Base.Monad Data.Functor.Identity.Identity).
      PMap.mapM' @ Data.Functor.Identity.Identity
                 @ a_abG
                 @ b_abH
                 $dMonad_sdL
      = PMap.mapM'_$smapM' @ a_abG @ b_abH

that also benefits any uses of mapM' at the Identity monad outside the defining module (if compiled with optimisations, and the monad is recognised as Identity in time for the rule to fire).

However, if GHC doesn't understand the type to specialise to well enough, it may not see any benefit and not specialise (I don't know it well enough to tell whether it will try anyway - so far I have found a specialisation each time I looked).

If you want to be sure, look at the core.

If you need the specialisation in a different module, GHC has no reason to specialise the function when it compiles the defining module, so in that case a pragma is necessary. Instead of a {-# SPECIALISE #-} pragma demanding a specialisation for a few hand-picked types, it is probably better - as of ghc-7 - to use an {-# INLINABLE #-} pragma, so that the (slightly modified) source code is made accessible in importing modules, which allows specialisations for any required types there.

Anything else (more pragmas) need to be added?

Different uses may of course require different pragmas, but as a rule of thumb, {#- INLINABLE #-} is the one you want most. And of course {-# RULES #-} can do magic the compiler cannot do on its own.

How can I verify how well the compiled map' is optimized wrt the explicitly written code for map?

  • Look at the produced core, asm, or llvm bitcode, whichever you understand best (core is relatively easy).
  • Benchmark the produced code against a hand-written specialisation, if you are not sure from the core, and need to know. Ultimately, unless you get identical intermediate results at some stage (core/cmm/asm/llvm), benchmarking is the only way to know for sure.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top