efficiency with functional programming

Question 1

Let me rub the sleep from my eyes. You want to walk the list from either end, recording the first time you see each element, right?

scala> val data = List(2,3,4,5,2,3)
data: List[Int] = List(2, 3, 4, 5, 2, 3)

scala> val is = ((data take (data.size / 2)), (data drop (data.size / 2)).reverse).zipped
is: scala.runtime.Tuple2Zipped[Int,List[Int],Int,List[Int]] = scala.runtime.Tuple2Zipped@72cf531c

scala> .toList
res0: List[(Int, Int)] = List((2,3), (3,2), (4,5))

scala> ((Map.empty[Int,Int], data.to[Set], 0) /: is) { case ((m, n, i), (x, y)) =>
     | if (n.isEmpty) (m, n, i+1)
     | else (
     | m ++ List(if (m contains x) None else Some(x -> i), if (m contains y) None else Some(y -> i)).flatten,
     | n -- Set(x,y),
     | i + 1
     | )}
res1: (scala.collection.immutable.Map[Int,Int], Set[Int], Int) = (Map(2 -> 0, 3 -> 0, 4 -> 2, 5 -> 2),Set(),3)

scala> ._1
res2: scala.collection.immutable.Map[Int,Int] = Map(2 -> 0, 3 -> 0, 4 -> 2, 5 -> 2)

Using a Vector and a view to index the two halves would be better. And constructing the set of elements is extraneous, but it would be handy if you already knew the domain.

Another swing:

scala> val data = List(2,3,4,5,2,3).to[Seq]
data: Seq[Int] = Vector(2, 3, 4, 5, 2, 3)

scala> val half = data.size / 2
half: Int = 3

scala> val vs = (data.view take half, (data.view drop half).reverse).zipped
vs: scala.runtime.Tuple2Zipped[Int,scala.collection.SeqView[Int,Seq[Int]],Int,scala.collection.SeqView[Int,Seq[Int]]] = scala.runtime.Tuple2Zipped@72cf531c

scala> import collection.mutable
import collection.mutable

scala> val x = 4 // some key to exclude
x: Int = 4

scala> ((mutable.Map.empty[Int,Int].withDefaultValue(Int.MaxValue), 0) /: vs) {
     | case ((m, i), (x, y)) => m(x) = m(x) min i; m(y) = m(y) min i; (m, i+1) }
res4: (scala.collection.mutable.Map[Int,Int], Int) = (Map(2 -> 0, 5 -> 2, 4 -> 2, 3 -> 0),3)

scala> ._1.filter { case (k, v) => k != x }.toMap
res5: scala.collection.immutable.Map[Int,Int] = Map(2 -> 0, 5 -> 2, 3 -> 0)

I'm not sure yet whether the views are forced by the fold, so a loop with indexing might be better. And didn't SO used to scroll long lines horizontally instead of wrapping? The code is unreadable this way.

Question 2

First off, let me say that the way you are using List -- in the mutable version -- is terrible. List has lousy performance for indexed access, which you use a lot. For indexed access, use a Vector instead. Or an Array, since it's mutable anyway.

On the immutable version, you also use length, which is O(n) for List, on every iteration. Simply calling length once outside the loop and saving that would help increase performance. You also do this:

List(w.head._2+1,data.length - w.last._2).min

which is kind of slow compared to simply

(w.head._2+1) min (data.length - w.last._2)

And, of course, you should either change the data structure to a Vector or replace data.length with something assigned only once.

Now, I can see two ways of going about it. One is to walk the map two ways and get the minimum, as you did, and the other is to walk it only once like som snytt did. For the first, you really need to change the type to a Vector. The second will work fine with a List.

Let's start with the first one, which is closer to what you did. I'm striving for no mutability at all here, just as an exercise. In practice, I'd probably use a var of immutable Map instead of recursion.

def dogTimes(data: IndexedSeq[Int], typeSum: Int): Map[Int, Int] = {
  import scala.annotation.tailrec

  val unwantedKey = typeSum / 2
  val end = data.length
  val halfway = end / 2

  @tailrec
  def forward(result: Map[Int, Int], i: Int): Map[Int, Int] = {
    if (i > halfway) result
    else if (data(i) == unwantedKey) forward(result, i + 1)
    else if (result contains data(i)) forward(result, i + 1)
    else forward(result updated (data(i), i + 1), i + 1)
  }

  @tailrec
  def backward(result: Map[Int, Int], i: Int): Map[Int, Int] = {
    println(s"$i ${data(i)} $result")
    if (i < halfway) result
    else if (data(i) == unwantedKey) backward(result, i - 1)
    else if (result contains data(i)) backward(result updated (data(i), result(data(i)) min (end - i)), i - 1)
    else backward(result updated (data(i), end - i), i - 1)
  }

  // forward has to be computed first
  val fwd = forward(Map.empty[Int, Int], 0)
  val bwd = backward(fwd, end - 1)

  bwd
}

This is pretty much a functional version of your mutable code -- it's verbose and doesn't really use any of the collection methods to help the work. It could also be simplified a bit -- for example, the data.length % 2 is unnecessary, since the code inside it will always work, whether data.length is even or odd. And the contains tests could also be removed by using getOrElse on the updates.

It also returns a standard map, not one with a default. You can add the default afterwards.

The other way would be more or less som snytt's solution, but I'd rather make it a bit simpler, due to the fact that min is not necessary in that solution. Here, I accept Seq which will work for List.

def dogTimes(data: Seq[Int], typeSum: Int): Map[Int, Int] = {
  import scala.annotation.tailrec

  val unwantedKey = typeSum / 2
  val half = data.length / 2 + 1
  val vs = (data.view take half zip data.view.reverse).zipWithIndex

  val result = vs.foldLeft(Map.empty[Int, Int]) {
    case (map, ((x, y), i)) =>
      val m1 = if (map.contains(x) || x == unwantedKey) map else map.updated(x, i + 1)
      if (m1.contains(y) || y == unwantedKey) m1 else m1.updated(y, i + 1)
  }

  result
}

I kept som snytt's view, but I suspect its performance on reverse is going to be pretty bad for List. It should be ok for Vector, but I think removing the second view call ought to make this faster for List.

Note that I do not use min in this code, and the reason is simple: since I'm travelling from lowest index to highest index for forward and backward at the same time, whenever a key is in the map, I know it has to have an index lower than or equal to the present one.

Also note that I'm picking half + 1 -- that ensures I'll treat the middle element in odd-sized lists. I'm not dropping elements before reversing them because zip always pick the smallest size.

If we decide to require indexed seqs, the following is probably faster:

def dogTimes(data: IndexedSeq[Int], typeSum: Int): Map[Int, Int] = {
  import scala.annotation.tailrec

  val unwantedKey = typeSum / 2
  val end = data.length
  val halfway = end / 2

  val result = (0 to halfway).foldLeft(Map.empty[Int, Int]) {
    case (map, i) =>
      val x = data(i)
      val y = data(end - i - 1)
      val m1 = if (map.contains(x) || x == unwantedKey) map else map.updated(x, i + 1)
      if (m1.contains(y) || y == unwantedKey) m1 else m1.updated(y, i + 1)
  }

  result
}

Note also that I prefer in both examples to prevent the unwanted key from getting in the map instead of removing it afterwards. This might be a bad decision, but it's trivial to change the code to remove it at the end, so I decided to present you with the alternative.

Question 3

Scala has an elegant way to pair list items with their indexes: zipWithIndex. As you split up the list in two halves, you can make two match cases, with a condition on the first:

val typeSum = 8
val data = List(2, 3, 4, 5, 2, 3)
val dogTimes: scala.collection.mutable.Map[Int, Int] = scala.collection.mutable.Map() withDefaultValue (-1)
data.zipWithIndex foreach {
  case (value, index) if (index < data.length / 2) => {
    if (dogTimes(value) > index + 1 || dogTimes(value) < 0) {
      dogTimes(value) = index + 1
    }
  }
  case (value, index) => {
    if (dogTimes(value) > (data.length - index) || dogTimes(value) < 0) {
      dogTimes(value) = data.length - index
    }
  }
}
if (typeSum % 2 == 0) dogTimes(typeSum / 2) = -1