How to handle nested structure when traversing with state monad

https://stackoverflow.com/questions/13921215

10-12-2021
|

Question

I have a nested structures which I'm converting to XML using a scalaz state monad. This works well until I have to deal with multi-level nested structures. Here is a simplified example similar to what I'm doing. Given the following ADT:

sealed trait Nested
case object Leaf extends Nested
case class Foo(inner: Nested) extends Nested
case class Bar(inner: Nested) extends Nested

I write a converter object using the state monad (assume Scalaz7 and the following imports import scalaz.{Node => _, _}; import Scalaz._; import scala.xml._):

case class Parents(foos: Int, bars: Int)
type ParentsS[X] = State[Parents, X]

def convertFoo(foo: Foo): ParentsS[Seq[Node]] = for {
  parents <- init[Parents]
  _ <- put[Parents](Parents(parents.foos + 1, parents.bars))
  inner <- convert(foo.inner)
  _ <- put[Parents](parents)
} yield <foo count={ parents.foos.toString }/>.copy(child=inner)

def convertBar(bar: Bar): ParentsS[Seq[Node]] = for {
  parents <- init[Parents]
  _ <- put[Parents](Parents(parents.foos, parents.bars + 1))
  inner <- convert(bar.inner)
  _ <- put[Parents](parents)
} yield <bar count={ parents.bars.toString }/>.copy(child=inner)

def convert(nested: Nested): ParentsS[Seq[Node]] = nested match {
  case Leaf => Seq[Node]().point[ParentsS]
  case foo@Foo(_) => convertFoo(foo)
  case bar@Bar(_) => convertBar(bar)
}

def nested(n: Int): Nested =
  if (n == 0) Leaf
  else {
    if (n % 2 == 0) Bar(nested(n - 1))
    else Foo(nested(n - 1))
  }

Depending on my stack settings, convert(nested(1000)).apply(Parents(0, 0)) will cause a stack overflow in the conversion process. (higher values will cause nested to overflow, but this can be ignored since I just created nested for this question.):

    at scalaz.IdInstances$$anon$1.bind(Id.scala:20)
    at scalaz.StateT$$anonfun$flatMap$1.apply(StateT.scala:48)
    at scalaz.StateT$$anon$7.apply(StateT.scala:72)
    at scalaz.StateT$$anonfun$flatMap$1.apply(StateT.scala:48)
    at scalaz.StateT$$anon$7.apply(StateT.scala:72)
    at scalaz.StateT$$anonfun$flatMap$1$$anonfun$apply$2.apply(StateT.scala:49)
    at scalaz.StateT$$anonfun$flatMap$1$$anonfun$apply$2.apply(StateT.scala:48)

My question is - what's the best way to avoid the stack overflow in scalaz.stateT? I would like to keep using the state monad as in my real example if makes the XML serialization logic easier to follow and troubleshoot (the actual input structures are JDI mirrored objects and arrays retrieved from live debugging sessions and the inner values are nested fields values).

Edit: to take out the nested stack issue:

import util.control.TailCalls
def nested2(n: Int, acc: Nested = Leaf): TailCalls.TailRec[Nested] =
  if (n == 0) TailCalls.done(acc)
  else TailCalls.tailcall(nested2(n - 1, if (n % 2 == 0) Bar(acc) else Foo(acc)))

Solution

Trampolining can help you avoid the stack overflow here. First for the same setup:

sealed trait Nested
case object Leaf extends Nested
case class Foo(inner: Nested) extends Nested
case class Bar(inner: Nested) extends Nested

import scalaz.{Node => _, _}; import Scalaz._;
import scala.util.control.TailCalls, scala.xml._

case class Parents(foos: Int, bars: Int)

def nested(n: Int, acc: Nested = Leaf): TailCalls.TailRec[Nested] =
  if (n == 0) TailCalls.done(acc) else TailCalls.tailcall(
    nested(n - 1, if (n % 2 == 0) Bar(acc) else Foo(acc))
  )

Some slightly different type aliases:

type TrampolinedState[S, A] = StateT[Free.Trampoline, S, A]
type ParentsS[A] = TrampolinedState[Parents, A]

We'll import our MonadState instance's methods for the sake of convenience:

val monadState = MonadState[TrampolinedState, Parents]
import monadState._

And the rest is actually a little more concise, since we don't need the type parameters on put, etc.:

def convertFoo(foo: Foo): ParentsS[Seq[Node]] = for {
  parents <- init
  _ <- put(Parents(parents.foos + 1, parents.bars))
  inner <- convert(foo.inner)
  _ <- put(parents)
} yield <foo count={ parents.foos.toString }/>.copy(child=inner)

def convertBar(bar: Bar): ParentsS[Seq[Node]] = for {
  parents <- init
  _ <- put(Parents(parents.foos, parents.bars + 1))
  inner <- convert(bar.inner)
  _ <- put(parents)
} yield <bar count={ parents.bars.toString }/>.copy(child=inner)

def convert(nested: Nested): ParentsS[Seq[Node]] = nested match {
  case Leaf => Seq[Node]().point[ParentsS]
  case foo@Foo(_) => convertFoo(foo)
  case bar@Bar(_) => convertBar(bar)
}

Now we just run the following (for example):

convert(nested(2000).result).apply(Parents(0, 0)).run

This works far past the point that the vanilla State solution started choking on my machine.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow