Is it possible to tail call optimize a grouping function in f# using sequences?
-
19-04-2021 - |
Question
Here is my attempt which is NOT tail call optimized because I need to dispose the enumerator:
let Group func seed (items : seq<'t>) =
let rec some (i : IEnumerator<'t>) state = seq {
try
if i.MoveNext()
then
let newstate, iscomplete = func (i.Current) state
if iscomplete
then
yield newstate
yield! some i newstate
else
yield state
finally
i.Dispose() }
some (items.GetEnumerator ()) seed
And here is example usage:
let buffer maxBufferSize items =
Group (fun item state ->
let newstate = [ item ] |> List.append state
if newstate.Length >= maxBufferSize
then (newstate, true)
else (newstate, false)) List.empty items
If I can avoid using the enumerator (i.e. Seq.head AND Seq.tail
) I could make it work but without Seq.tail
I am at a loss. I am really hoping to make this work with generic sequences..
And FYI I know this code won't work in its current state because I would end up disposing the enumerator multiple times.
Solution
You can move the try .. finally
block from the inner some
function (where it is entered during every iteration) to the main function. Then the inner recursive function some
becomes tail-recursive:
let Group func seed (items : seq<'t>) =
// The handling of exceptions is done by the caller,
// so 'some' does not need to handle exceptions...
let rec some (i : IEnumerator<'t>) state = seq {
if i.MoveNext()
then
let newstate, iscomplete = func (i.Current) state
if iscomplete then
yield newstate
// Recursive call in tail-call position
yield! some i newstate
else
yield state }
// Return a sequence that wraps the obtains the IEnumerator
// and guarantees that it gets disposed if 'some' fails
seq {
let i = items.GetEnumerator ()
try
// This is still not-tail recursive
yield! some i seed
finally
i.Dispose() }
Or even better, you can implement the sequence that is returned from Group
using the use
construct:
seq {
use i = items.GetEnumerator ()
// This is still not-tail recursive
yield! some i seed }
Actually, I think this is more correct than your original code, because it calls the Dispose
method only once. In your version, it would be called once for every time the execution enters some
(which would depend on the number of elements processed before an exception happens).