Question

I have written a new version of the PBKDF2 algorithm in Haskell. It passes almost all of the HMAC-SHA-1 test vectors listed in RFC 6070, but it is not very efficient. How can I improve the code?

When I run it on the test vectors, the third case (see below) never finishes (I left it running for more than 1/2 hour on a 2010 Macbook Pro).

I believe that the foldl' is my problem. Will foldr perform better, or do I need to use mutable arrays?

{-# LANGUAGE BangPatterns #-}
{- Copyright 2013, G. Ralph Kuntz, MD. All rights reserved. LGPL License. -}

module Crypto where

import Codec.Utils (Octet)
import qualified Data.Binary as B (encode)
import Data.Bits (xor)
import qualified Data.ByteString.Lazy.Char8 as C (pack)
import qualified Data.ByteString.Lazy as L (unpack)
import Data.List (foldl')
import Data.HMAC (hmac_sha1)
import Text.Bytedump (dumpRaw)

-- Calculate the PBKDF2 as a hexadecimal string
pbkdf2
  :: ([Octet] -> [Octet] -> [Octet])  -- pseudo random function (HMAC)
  -> Int  -- hash length in bytes
  -> String  -- password
  -> String  -- salt
  -> Int  -- iterations
  -> Int  -- derived key length in bytes
  -> String
pbkdf2 prf hashLength password salt iterations keyLength =
  let
    passwordOctets = stringToOctets password
    saltOctets = stringToOctets salt
    totalBlocks =
      ceiling $ (fromIntegral keyLength :: Double) / fromIntegral hashLength
    blockIterator message acc =
      foldl' (\(a, m) _ ->
        let !m' = prf passwordOctets m
        in (zipWith xor a m', m')) (acc, message) [1..iterations]
  in
    dumpRaw $ take keyLength $ foldl' (\acc block ->
      acc ++ fst (blockIterator (saltOctets ++ intToOctets block)
                      (replicate hashLength 0))) [] [1..totalBlocks]
  where
    intToOctets :: Int -> [Octet]
    intToOctets i =
      let a = L.unpack . B.encode $ i
      in drop (length a - 4) a

    stringToOctets :: String -> [Octet]
    stringToOctets = L.unpack . C.pack

-- Calculate the PBKDF2 as a hexadecimal string using HMAC and SHA-1
pbkdf2HmacSha1
  :: String  -- password
  -> String  -- salt
  -> Int  -- iterations
  -> Int  -- derived key length in bytes
  -> String
pbkdf2HmacSha1 =
  pbkdf2 hmac_sha1 20

Third test vector

 Input:
   P = "password" (8 octets)
   S = "salt" (4 octets)
   c = 16777216
   dkLen = 20

 Output:
   DK = ee fe 3d 61 cd 4d a4 e4
        e9 94 5b 3d 6b a2 15 8c
        26 34 e9 84             (20 octets)
Was it helpful?

Solution

I was able to get it to complete in ~16 min on my MacBookPro:

% time Crypto-Main
eefe3d61cd4da4e4e9945b3d6ba2158c2634e984                          
./Crypto-Main  1027.30s user 15.34s system 100% cpu 17:22.61 total

by changing the strictness of your fold:

let
  -- ...
  blockIterator message acc = foldl' (zipWith' xor) acc ms
    where ms = take iterations . tail $ iterate (prf passwordOctets) message
          zipWith' f as bs = let cs = zipWith f as bs in sum cs `seq` cs
in
  dumpRaw $ take keyLength $ foldl' (\acc block ->
    acc ++ blockIterator (saltOctets ++ intToOctets block)
                    (replicate hashLength 0)) [] [1..totalBlocks]

Note how I force the full evaluation of each zipWith xor. In order to calculate sum cs into WHNF, we must know the exact value of each element in cs.

This prevents building up a chain of thunks, which I think your existing code was attempting to do, but failing, as foldl' only forces the accumulator into WHNF. Since your accumulator was a pair, the WHNF is just (_thunk, _another_thunk), so your intermediate thunks were not getting forced.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top