Question

Trying to implement the straightforward dynamic programming algorithm for the Knapsack problem. Obviously this approach uses a lot of memory and so I am trying to optimize the memory utilized. I am simply trying to store only the previous row of my table in memory just long enough to compute the next row, and so on. At first I thought my implementation was solid, but it still ran out of memory as an implementation designed to store the whole table. So next I thought maybe I need foldl' instead of foldr, but it did not make any difference. My program continues to eat memory until my system runs out.

So I have 2 specific questions:

  1. What is it about my code that is using up all the memory? I thought I was being clever by using a fold, because I assumed only the current value of the accumulator would be stored in memory.
  2. What is the proper approach for achieving my goal; that is, storing only the most recent row in memory? I don't necessarily need code, maybe just some helpful functions and data types. More generally, what are some tips and techniques for understanding memory usage in Haskell?

Here is my implementation

data KSItem a = KSItem { ksItem :: a, ksValue :: Int, ksWeight :: Int} deriving (Eq, Show, Ord)
dynapack5 size items = finalR ! size
  where
    noItems = length items
    itemsArr = listArray(1,noItems) items
    row = listArray(1,size) (replicate size (0,[]))
    computeRow row item =
      let w = ksWeight item
          v = ksValue item
          idx = ksItem item
          pivot = let (lastVal, selections) = row ! w
                  in if v > lastVal 
                     then (v, [idx])
                     else (lastVal, selections)
          figure r c =
              if (prevVal + v) > lastVal
              then (prevVal + v, prevItems ++ [idx])
              else (lastVal, lastItems)
              where (lastVal, lastItems) = (r ! c)
                    (prevVal, prevItems) = (r ! (c - w))
          theRest = [ (figure row cw) | cw <- [(w+1)..size] ]
          newRow = (map (row!) [1..(w-1)]) ++ 
                   [pivot] ++
                   theRest
      in listArray (1,size) newRow
    finalR = foldl' computeRow row items

In my head, what I think this is doing is initializing the first row to (0,[])... repeated as necessary, then kicking off the fold where the next row is calculated based on the supplied row, and this value then becomes the accumulator. I'm not seeing where more and more memory is being consumed...

Random thought: what if i used the \\ operator on the accumulator instead?

Was it helpful?

Solution

As Tom Ellis said, using force on the array solves the space issues. However, it is extremely slow, because force traverses all the lists in the array from start to end each time it is invoked. So we should only force as needed:

let res = listArray (1,size) newRow in force (map fst $ elems res) `seq` res

This fixes the space leak and it's also pretty fast.

If you want to take space efficiency to the logical next step, you could use bitsets of the indices of the items instead of lists of items. Integers are good for the job here since they automatically resize themselves to accommodate the highest set bit. Also, with Integer-s forcing is straightforward:

import qualified Data.Vector as V -- using this instead of Array cause I like it more
import Data.List
import Control.Arrow
import Data.Bits
import Control.DeepSeq

data KSItem a = KSItem { ksItem :: a, ksValue :: Int, ksWeight :: Int} deriving (Eq, Show, Ord)

dynapack5' :: Int -> [KSItem a] -> (Int, Integer)
dynapack5' size items = V.last solutions where
    items' = [KSItem i v w | (i, KSItem _ v w) <- zip [0..] items]

    solutions = foldl' add (V.replicate (size + 1) (0, 0::Integer)) items'

    add arr (KSItem item currVal w) = force $ V.imap go arr where
        go i (v, is) | w < i && v' > v = (v', is')
                     | otherwise       = (v, is)
            where (v', is') = (+currVal) *** (`setBit` item) $ arr V.! (i - w)

OTHER TIPS

Data.Array is non-strict in its elements so even though foldl' forces it to WHNF each time around the loop the contents don't get evaluated. The simplest fix would be to import Control.DeepSeq and change

in listArray (1,size) newRow

to

in force (listArray (1,size) newRow)

This is doing more work than strictly necessary each time around the loop, but will do the job.

Unfortunately you can't just substitute unboxed arrays here, since your arrays contain a tuple containing a list.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top