More fun with applicative functors

Question 1

The <* and *> functions are very simple: they work the same way as >>. The <* would work the same way as << except << does not exist. Basically, given a *> b, you first "do" a, then you "do" b and return the result of b. For a <* b, you still first "do" a then "do" b, but you return the result of a. (For appropriate meanings of "do", of course.)

The <$ function is just fmap const. So a <$ b is equal to fmap (const a) b. You just throw away the result of an "action" and return a constant value instead. The Control.Monad function void, which has a type Functor f => f a -> f () could be written as () <$.

These three functions are not fundamental to the definition of an applicative functor. (<$, in fact, works for any functor.) This, again, is just like >> for monads. I believe they're in the class to make it easier to optimize them for specific instances.

When you use applicative functors, you do not "extract" the value from the functor. In a monad, this is what >>= does, and what foo <- ... desugars to. Instead, you pass the wrapped values into a function directly using <$> and <*>. So you could rewrite your example as:

foo <$> parser1 <*> parser2 <*> parser3 ...

If you want intermediate variables, you could just use a let statement:

let var1 = parser1
    var2 = parser2
    var3 = parser3 in
foo <$> var1 <*> var2 <*> var3

As you correctly surmised, pure is just another name for return. So, to make the shared structure more obvious, we can rewrite this as:

pure foo <*> parser1 <*> parser2 <*> parser3

I hope this clarifies things.

Now just a little note. People do recommend using applicative functor functions for parsing. However, you should only use them if they make more sense! For sufficiently complex things, the monad version (especially with do-notation) can actually be clearer. The reason people recommend this is that

foo <$> parser1 <*> parser2 <*> parser3

is both shorter and more readable than

do var1 <- parser1
   var2 <- parser2
   var3 <- parser3
   return $ foo var1 var2 var3

Essentially, the f <$> a <*> b <*> c is essentially like lifted function application. You can imagine the <*> being a replacement for a space (e.g. function application) in the same way that fmap is a replacement for function application. This should also give you an intuitive notion of why we use <$>--it's like a lifted version of $.

Question 2

I can make a few remarks here, hopefully helpful. This reflects my understanding which itself might be wrong.

pure is unusually named. Usually functions are named referring to what they produce, but in pure x it is x that is pure. pure x produces an applicative functor which "carries" the pure x. "Carries" of course is approximate. An example: pure 1 :: ZipList Int is a ZipList, carrying a pure Int value, 1.

<*>, *>, and <* are not functions, but methods (this answers your first concern). f in their types is not general (like it would be, for functions) but specific, as specified by a specific instance. That's why they are indeed not just $, flip const and const. The specialized type f specifies the semantics of combination. In the usual applicative style programming, combination means application. But with functors, an additional dimension is present, represented by the "carrier" type f. In f x, there is a "contents", x, but there is also a "context", f.

The "applicative functors" style sought to enable the "applicative style" programming, with effects. Effects being represented by functors, carriers, providers of context; "applicative" referring to the normal applicative style of functional application. Writing just f x to denote application was once a revolutionary idea. There was no need for additional syntax anymore, no (funcall f x), no CALL statements, none of this extra stuff - combination was application... Not so, with effects, seemingly - there was again that need for the special syntax, when programming with effects. The slain beast reappeared again.

So came the Applicative Programming with Effects to again make the combination mean just application - in the special (perhaps effectful) context, if they were indeed in such context. So for a :: f (t -> r) and b :: f t, the (almost plain) combination a <*> b is an application of carried contents (or types t -> r and t), in a given context (of type f).

The main distinction from monads is, monads are non-linear. In

do {  x        <-  a
   ;     y     <-  b x
   ;        z  <-  c x y
   ;               return 
     (x, y, z) }

the computation b x depends on x, and c x y depends on both x and y. The functions are nested:

a >>= (\x ->  b x  >>= (\y ->  c x y  >>= (\z ->  .... )))

If b and c do not depend on the previous results (x, y), this can be made flat by making the computation stages return repackaged, compound data (this addresses your second concern):

a  >>= (\x       ->  b  >>= (\y-> return (x,y)))       -- `b  ` sic
   >>= (\(x,y)   ->  c  >>= (\z-> return (x,y,z)))     -- `c  `
   >>= (\(x,y,z) ->  ..... )

and this is essentially an applicative style (b, c are fully known in advance, independent of the value x produced by a, etc.). So when your combinations create data that encompass all the information they need for further combinations, and there's no need for "outer variables" (i.e. all computations are already fully known, independent of any values produced by any of the previous stages), you can use this style of combination.

But if your monadic chain has branches dependent on values of such "outer" variables (i.e. results of previous stages of monadic computation), then you can't make a linear chain out of it. It is essentially monadic then.

As an illustration, the first example from that paper shows how the "monadic" function

sequence :: [IO a] → IO [a]
sequence [ ] = return [ ]
sequence (c : cs) = do
  {  x       <-  c
  ;      xs  <-  sequence cs  -- `sequence cs` fully known, independent of `x`
  ;              return 
    (x : xs) }

can actually be coded in this "flat, linear" style as

sequence :: (Applicative f) => [f a] -> f [a]
sequence []       = pure []
sequence (c : cs) = pure (:) <*> c <*> sequence cs
                  --     (:)     x     xs

There's no use here for the monad's ability to branch on previous results.

a note on the excellent Petr Pudlák's answer: in my "terminology" here, his pair is combination without application. It shows that the essence of what the Applictive Functors add to plain Functors, is the ability to combine. Application is then achieved by the good old fmap. This suggests combinatory functors as perhaps a better name (update: in fact, "Monoidal Functors" is the name).

Question 3

You can view functors, applicatives and monads like this: They all carry a kind of "effect" and a "value". (Note that the terms "effect" and "value" are only approximations - there doesn't actually need to be any side effects or values - like in Identity or Const.)

With Functor you can modify possible values inside using fmap, but you cannot do anything with effects inside.
With Applicative, you can create a value without any effect with pure, and you can sequence effects and combine their values inside. But the effects and values are separate: When sequencing effects, an effect cannot depend on the value of a previous one. This is reflected in <*, <*> and *>: They sequence effects and combine their values, but you cannot examine the values inside in any way.

You could define Applicative using this alternative set of functions:
```
fmap     :: (a -> b) -> (f a -> f b)
pureUnit :: f ()
pair     :: f a -> f b -> f (a, b)
-- or even with a more suggestive type  (f a, f b) -> f (a, b)
```
(where pureUnit doesn't carry any effect) and define pure and <*> from them (and vice versa). Here pair sequences two effects and remembers the values of both of them. This definition expresses the fact that Applicative is a monoidal functor.

Now consider an arbitrary (finite) expression consisting of pair, fmap, pureUnit and some primitive applicative values. We have several rules we can use:
```
fmap f . fmap g           ==>     fmap (f . g)
pair (fmap f x) y         ==>     fmap (\(a,b) -> (f a, b)) (pair x y)
pair x (fmap f y)         ==>     -- similar
pair pureUnit y           ==>     fmap (\b -> ((), b)) y
pair x pureUnit           ==>     -- similar
pair (pair x y) z         ==>     pair x (pair y z)
```
Using these rules, we can reorder pairs, push fmaps outwards and eliminate pureUnits, so eventually such expression can be converted into
```
fmap pureFunction (x1 `pair` x2 `pair` ... `pair` xn)
```
or
```
fmap pureFunction pureUnit
```
So indeed, we can first collect all effects together using pair and then modify the resulting value inside using a pure function.
With Monad, an effect can depend on the value of a previous monadic value. This makes them so powerful.

Question 4

The answers already given are excellent, but there's one small(ish) point I'd like to spell out explicitly, and it has to do with <*, <$ and *>.

One of the examples was

do var1 <- parser1
   var2 <- parser2
   var3 <- parser3
   return $ foo var1 var2 var3

which can also be written as foo <$> parser1 <*> parser2 <*> parser3.

Suppose that the value of var2 is irrelevant for foo - e.g. it's just some separating whitespace. Then it also doesn't make sense to have foo accept this whitespace only to ignore it. In this case foo should have two parameters, not three. Using do-notation, you can write this as:

do var1 <- parser1
   parser2
   var3 <- parser3
   return $ foo var1 var3

If you wanted to write this using only <$> and <*> it should be something like one of these equivalent expressions:

(\x _ z -> foo x z) <$> parser1 <*> parser2 <*> parser3
(\x _ -> foo x) <$> parser1 <*> parser2 <*> parser3
(\x -> const (foo x)) <$> parser1 <*> parser2 <*> parser3
(const  . foo) <$> parser1 <*> parser2 <*> parser3

But that's kind of tricky to get right with more arguments!

However, you can also write foo <$> parser1 <* parser2 <*> parser3. You could call foo the semantic function which is fed the result of parser1 and parser3 while ignoring the result of parser2 in between. The absence of > is meant to be indicative of the ignoring.

If you wanted to ignore the result of parser1 but use the other two results, you can similarly write foo <$ parser1 <*> parser2 <*> parser3, using <$ instead of <$>.

I've never found much use for *>, I would normally write id <$ p1 <*> p2 for the parser that ignores the result of p1 and just parses with p2; you could write this as p1 *> p2 but that increases the cognitive load for readers of the code.

I've learnt this way of thinking just for parsers, but it has later been generalised to Applicatives; but I think this notation comes from the uuparsing library; at least I used it at Utrecht 10+ years ago.

Question 5

I'd like to add/reword a couple things to the very helpful existing answers:

Applicatives are "static". In pure f <*> a <*> b, b does not depend on a, and so can be analyzed statically. This is what I was trying to show in my answer to your previous question (but I guess I failed -- sorry) -- that since there was actually no sequential dependence of parsers, there was no need for monads.

The key difference that monads bring to the table is (>>=) :: Monad m => m a -> (a -> m b) -> m a, or, alternatively, join :: Monad m => m (m a). Note that whenever you have x <- y inside do notation, you're using >>=. These say that monads allow you to use a value "inside" a monad to produce a new monad, "dynamically". This cannot be done with an Applicative. Examples:

-- parse two in a row of the same character
char             >>= \c1 ->
char             >>= \c2 ->
guard (c1 == c2) >>
return c1

-- parse a digit followed by a number of chars equal to that digit
--   assuming: 1) `digit`s value is an Int,
--             2) there's a `manyN` combinator
-- examples:  "3abcdef"  -> Just {rest: "def", chars: "abc"}
--            "14abcdef" -> Nothing
digit        >>= \d -> 
manyN d char 
-- note how the value from the first parser is pumped into 
--   creating the second parser

-- creating 'half' of a cartesian product
[1 .. 10] >>= \x ->
[1 .. x]  >>= \y ->
return (x, y)

Lastly, Applicatives enable lifted function application as mentioned by @WillNess. To try to get an idea of what the "intermediate" results look like, you can look at the parallels between normal and lifted function application. Assuming add2 = (+) :: Int -> Int -> Int:

-- normal function application
add2 :: Int -> Int -> Int
add2 3 :: Int -> Int
(add2 3) 4 :: Int

-- lifted function application
pure add2 :: [] (Int -> Int -> Int)
pure add2 <*> pure 3 :: [] (Int -> Int)
pure add2 <*> pure 3 <*> pure 4 :: [] Int

-- more useful example
[(+1), (*2)]
[(+1), (*2)] <*> [1 .. 5]
[(+1), (*2)] <*> [1 .. 5] <*> [3 .. 8]

Unfortunately, you can't meaningfully print the result of pure add2 <*> pure 3 for the same reason that you can't for add2 ... frustrating. You may also want to look at the Identity and its typeclass instances to get a handle on Applicatives.