This is obv waaay late but I thought I'd post anyways for future readers' benefit (I imagine OP is long done with this problem).
TL;DR:
I think we probably want to use the Data.Vector
package for this problem (and similar types of problems).
Longer version:
According to the Haskell docs, a Map
(from Data.Map
) has O(log N) access time whereas a Vector
(from Data.Vector
) has O(1) access; we can see the difference in the results below: the vector implementation runs ~3x faster. (Both are way better than lists which have O(N) access time.)
A couple of benchmarks are included below. The tests were intentionally not run one after another so as to prevent any cache-based optimization.
A couple of observations:
The largest absolute improvement (from the code in the original post) was due to the addition of type signatures; without being explicitly told that the data was of type Int
, Haskell's type system was inferring that the data was of type Integer
(which is obv bigger and slower)
A bit counterintuitive but, results are virtually indistinguishable between foldl1'
and foldl1
. (I double checked the code and ran these a few times just to make sure.)
Vector
and Array
(and, to a certain extent, Map
) allow for decent improvement primarily as a result of memoization. (Note that OP's solution is likely a lot faster than a list-based solution that tried to use memoization given lists' O(N) access time.)
Here are a couple of benchmarks (all compiled using O2
):
Probably want to look
at these numbers
|
V
Data.Vector 0.35s user 0.10s system 97% cpu 0.468 total
Data.Array (Haskell.org) 0.31s user 0.21s system 98% cpu 0.530 total
Data.Map (above answer) 1.31s user 0.46s system 99% cpu 1.772 total
Control.Parallel (Haskell.org) 1.75s user 0.05s system 99% cpu 1.799 total
OP (`Int` type sigs + foldl') 3.60s user 0.06s system 99% cpu 3.674 total
OP (`Int` type sigs) 3.53s user 0.16s system 99% cpu 3.709 total
OP (verbatim) 3.36s user 4.77s system 99% cpu 8.146 total
Source of figures from Haskell.org: https://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14
The Data.Vector
implementation used to generate the above results:
import Data.Vector ( Vector, fromList, maxIndex, (!) )
main :: IO ()
main = putStrLn $ show $ largestCollatz 1000000
largestCollatz :: Int -> Int
largestCollatz n = maxIndex vector
where
vector :: Vector Int
vector = fromList $ 0 : 1 : [collatz x x 0 | x <- [2..n]]
collatz m i c =
case i < m of
True -> c + vector ! i
False -> let j = if even i then i `div` 2 else 3*i + 1
in collatz m j (c+1)