Translate imperative control flow with break-s/continue-s to haskell

https://stackoverflow.com/questions/4512541

12-10-2019
|

Question

Consider the following imperative code which finds the largest palindrome among products of 3-digit numbers (yes, it's the one of the first tasks from "Project of [outstanding mathematician of 18th century]" site):

curmax = 0
for i in range(999,100):
for j in range(999,100):
    if ((i*j) < curmax): break
    if (pal(i*j)):
        curmax = i*j
        break
print curmax

As I'm learning Haskell currently, my question is, how do you translate this (and basically any imperative construct that contains something more complex than just plain iteration, e.g. breaks, continues, temporary variables and all this) to Haskell?

My version is

maxpal i curmax
    | i < 100 = curmax
    | otherwise = maxpal (i-1) (innerloop 999)
    where 
        innerloop j
            | (j < 100) || (p < curmax) = curmax
            | pal p = p
            | otherwise = innerloop (j-1)
            where p = i*j
main = print $ maxpal 999 0

but this looks like we're still in imperative uglytown.

So what could you advise, what are the approaches of dealing with such cases FP-style?

Solution

If we do away with all the optimization and just multiply all combinations of numbers between 100 and 999, filter out the non-palindromes and take the maximum of that, we can write the function very concisely as:

maximum $ filter pal [x*y | x <- [100..999], y <- [100..999]]

Of course this is basically the least efficient way to do it, but since the numbers are relatively small, this still finishes in under half a second on my machine.

However if we want something that is more along the lines of your python solution algorithmically, we can do it like this:

import Data.Maybe
import Data.List

maxpal i curmax
    | i < 100 = curmax
    | otherwise = maxpal (i-1) newmax
    where newmax = fromMaybe curmax (find pal bigger)
          bigger = takeWhile (> curmax) (map (*i) [999, 998 ..])

Here the outer loop is basically the same as in your solution, but we replaced the inner loop using list functions instead.

We're using map (*i) [999, 998, ...] to create the product i*j for every j counting down from 999. Using takeWhile we're saying that the list should stop once a value is not greater than curmax.

Then we're using find to see whether any item in that list is a palindrome. If it is, the first palindrome in the list is our new max. If it isn't we keep our old max. (find returns a Maybe and fromMaybe takes a default value and a Maybe and returns the value from the Maybe or the default value if there is no value in the Maybe)

OTHER TIPS

Similar answer to Daniel's and sepp2k's:

Lazy functional programming lets you write programs in a much more modular way than you see in imperative control flow like the one in your question. For instance, form the list of factors 999...100, then all products, then filter to retain only the palindromes, and then compute the maximum. Thanks to laziness, these intermediate lists will come into being only as needed and will be incrementally recycled.

For more explanation and examples, see John Hughes's classic paper Why Functional Programming Matters.

maxpal :: Int
maxpal = maximum [i*j | i <- factors, j <- factors, pal (i*j) ]

factors :: [Int]
factors = [999,998..100]

pal :: Show a => a -> Bool
pal = palL . show

palL :: (Eq a) => [a] -> Bool
palL xs = xs == reverse xs

In my mind a range corresponds to a list. For example:

f = [999,998..100]

Now f is defined as the sequence of numbers from 999 to 100.

for loops correspond to different functional concepts, depending on what you're doing in each iteration. Sometimes a map is the appropriate analog, sometimes a fold, sometimes something else. Often times it's a combination of things. In this case, you're effectively combining two lists. One way to do that in Haskell is a list comprehension:

g = [(x * y) | x <- f , y <- f]

Here g represents a list of the product of each element of the previously defined sequence combined with itself. In other words, pretty much what you've got going on within your for loop.

From here, you'd probably want to filter the resulting sequence to contain only values that are palindromes, and then calculate the maximum value from that set.

There's no one-size-fits-all answer here. But let's walk through this specific example:

First, consider the outer loop: We always do the full range, and we only care about the final maximum, so that's easy enough:

outerLoop = foldl innerLoop 0 [999,998..100]

In the inner loop, we have some value of i and a current maximum. Now we only care about the range where i*j is larger than the current maximum:

innerLoop curmax i = foldr checkMax curmax [999*i, 998*i .. curmax]

In the core logic we get a value for i*j which we know will always be greater then or equal to the current maximum, so all that's necessary is to check the next value to see if it's a palindrome: If so, we're done, because the sequence decreases. If not, defer the decision:

checkMax ij defer = if pal ij then ij else defer

So, thinking functionally you should be looking at ways to break your problem not into loops and steps but into functions.

So, if we had a function maxWhere f xs which returned the largest x for which f x is true, we could write:

maxpal = maxWhere pal [x * y | x <- [999,998..100], y <- [999,998..100]]

A naive implementation of maxWhere is

maxWhere f xs = maximum $ filter f xs

but this is bad if f is more expensive than comparison as we'll be making more calls to f than in the original. We can use fold to combine the filter and the maximum into a single pass and get the same behaviour as the imperative code.

maxWhere f xs = foldl' r 0 xs
    where r a x
       | x > a     = if f x then x else a
       | otherwise = a

The use of zero as a magic small number here is horrible, but works in that case.

(I really want to spell that list of candidate numbers (*) <$> [999,998..100] <*> [999,998..100], but that may be introducing an unnecessary complication here.)

Gah. Beaten by sepp2k, but I'll answer your general question:

Temporary variables can also be expressed using the state monad, or ST monad if you have a lot of them. FP often wins in succintness and clarity, but in some cases it doesn't, e.g. when there are several local variables to juggle.

Laziness can emulate many breaks, but when dealing with IO, you generally have to use explicit recursion. However, the 'List' package (from Hackage) is rather clever at allowing you to write IO loops in a functional style.

this kind of loop lends itself easily to a list comprehension, like this:

maximum [x*y | x <- [999..100], y <- [999..100],isPalindrome (x*y)]

Where we might write isPalindrome like this:

isPalindrome x = xs == reverse xs
  where xs = show x

This is really fast enough, although sort of unsmartypantsy, so first off we will notice that we are checking the numbers twice. Let's say a*b is the biggest palindrome, then we will check both the case where x == a, y==b, and x==b, y==a. So first we stop this by restricting the numbers we search through to only the cases where x >= y, like this:

maximum [x*y | x <- [999..100], y <- [x..100],isPalindrome (x*y)]

This cuts the numbers to test in half.

In your python solution you also bound y below by the biggest number we have found so far divided by the current x (x*y => curmax), also you never search beyond the first y found (breaking the inner loop if curmax is updated). We may cut the search further by not continuing if the first element we check (x squared) is less then our current answer, since all subsequent checks are smaller, but this is beyond what looks good in a list comprehension so we move our search into it's own function:

import Data.List(find)
import Data.Maybe(isNothing,fromJust)

search x curr 
   | x * x < curr                   = curr
   | isNothing maypal || pal < curr = search (x - 1) curr 
   | otherwise                      = search (x - 1) pal 
   where maypal = find isPalindrome [x * x, (x - 1) * x .. curr]
         pal    = fromJust maypal

It's worth noticing how our limitation, (x*x) < curr, really just means that from now on, [x*x,(x-1)*x..curr] is going to be empty. As you can see, all of the bounds that were enforced by breaks in your python code fits inside one iteration on x (using recursion) and a find on a list of x*y values. It might not look nicer, but it seems to me to state more explicitly the restrictions we make on x and y.

Running it we get:

*Main> search 999 0
906609

Turns out that stopping when x * x < curr is a really good idea since the square root of 906609 is 952...

As noted by stephen tetley in his comment, in FP you can use continuation passing style to handle complex control flow (Cont monad plus its callCC which is somehow similar to break. ...or even goto - abuse of CPS can lead to rather incomprehensible code - see my example below):

import Control.Monad.Cont

pal n = sn == reverse sn
    where sn = show n

range = [99999,99998..10000]

mfoldM a r f = foldM f a r  

curmaxm = (`runCont` id) $ mfoldM 0 range $ \m i ->
            callCC $ \break ->
                mfoldM m range $ \m j -> do
                  let ij = i*j
                  if ij < m
                     then break m
                     else return $
                          if pal ij then ij else m

Two mfoldM's (just a standard foldM with its arguments rearranged) correspond to two loops in the original sample and break function-argument is used in "inner loop" to exit it once (i*j > current max) condition is violated (returning the current max as a result of that "inner loop"). Here we need to escape from just one "loop level", so callCC here is definitely overkill.

The same logic can also be implementing with find (+ laziness of Haskell):

import Data.List
import Data.Maybe
import Control.Monad

curmax = fromJust $ foldM it 0 range
    where 
      it m i = (find pal . takeWhile (>m) . map (*i) $ range) `mplus` return m

find pal here returns either the first palindrome number (which will also satisfy (>m) condition in takeWhile) or Nothing (zero of MonadPlus) and after mplus (or Alternatice.<|>) it effectively returns either a new maximum palindrome or the previous max (return m). Since find stops searching once the first satisfying element is found, this code behave exactly as its imperative curmax analog. Both version run for [99999..10000] range in 0.5 second.

Update: Just for fun: same approach but using StateT Integer (Cont Integer) () - Cont to escape from the "loop" and State to pass max palindrome around (plus an ability to use forM_ and when). Same efficiency:

import Control.Monad.Cont
import Control.Monad.State.Strict

solcs = runCont (execStateT comp 0) id
    where   
      comp = forM_ range $ \i -> callCC $ \break ->
                forM_ range $ \j -> do
                  let ij = i*j
                  m <- get
                  when (ij < m) (break ())
                  when (pal ij) (put ij)

I think you can do what you want using two mutually recursive functions.

Here's a much simpler example (taken from a tutorial on ATS):

implement main (argc, argv) = let
  fun loop1 (i: int): void =
    if i <= 9 then loop2 (i, i) else ()

  and loop2  (i: int, j: int): void =
    if j <= 9 then begin
      if i < j then begin
        print ", ";
        print "("; print i; print ", "; print j; print ")";
        loop2 (i, j+1)
      end
    end else begin
      print_newline ();
      loop1 (i+1)
    end
  in
    loop1 0
  end

The code written above is very much like what you'd write in C (taken from that same page):

int main (int argc, char *argv[]) { int i, j ;

for (i = 0; i <= 9; i += 1) {
  for (j = i; j <= 9; j += 1) {
    if (i < j) printf (", ") ; printf ("(%i, %i)", i, j) ;
  } /* for */
  printf ("\n") ;
} /* for */

return 0 ;

}

As you see, nested loops become mutually recursive functions; and mutable variables i and j become induction variables. loop1 corresponds to the outer loop, whereas loop2 to the inner loop.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow