Question

Is it always more performant to use withFilter instead of filter when applying functions like map, flatMap etc. afterwards?

Why are only map, flatMap and foreach supported? (I expected functions like forall/exists as well.)

Solution

From the Scala docs:

Note: the difference between c filter p and c withFilter p is that the former creates a new collection, whereas the latter only restricts the domain of subsequent map, flatMap, foreach, and withFilter operations.

So filter takes the original collection and eagerly builds a new, filtered collection. withFilter is non-strict (i.e. lazy): it builds nothing itself and instead applies the predicate while the subsequent map/flatMap/foreach/withFilter call traverses the original collection, so only elements that pass the predicate reach that call. This avoids both the intermediate collection and a second pass over it, which makes withFilter more efficient when one of these methods follows.
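To make the strict/lazy difference visible, here is a minimal sketch that counts predicate calls. The names LazyFilterDemo, calls and isEven are made up for this illustration, and it assumes a List on a recent Scala 2.12/2.13/3 standard library.

```scala
object LazyFilterDemo {
  var calls = 0
  val xs = List(1, 2, 3, 4)
  // Predicate with a side effect so we can observe when it runs.
  val isEven: Int => Boolean = x => { calls += 1; x % 2 == 0 }

  // filter is strict: the predicate runs immediately and a new List is built.
  val strict = xs.filter(isEven)
  val callsAfterFilter = calls

  calls = 0
  // withFilter is lazy: nothing is evaluated when it is created.
  val restricted = xs.withFilter(isEven)
  val callsAfterWithFilter = calls

  // The predicate runs only while map traverses the original list.
  val doubled = restricted.map(_ * 2)
  val callsAfterMap = calls
}
```

Both variants call the predicate once per element; what withFilter saves is the intermediate list and the extra traversal over it.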

In fact, withFilter is specifically designed for working with chains of these methods, which is what a for-comprehension is desugared into. No other methods (such as forall/exists) are required for this, so they have not been added to the FilterMonadic return type of withFilter (scala.collection.WithFilter as of Scala 2.13).
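As a rough illustration of that desugaring, the sketch below writes the same computation twice: once as a for-comprehension with a guard and once as the hand-written withFilter/flatMap/map chain. The object and value names are invented for the example, and the chain is an approximation of what the compiler generates, not its literal output.

```scala
object DesugarDemo {
  val xs = List(1, 2, 3)
  val ys = List(10, 20)

  // For-comprehension with a guard and two generators.
  val sugared = for {
    x <- xs
    if x > 1
    y <- ys
  } yield x * y

  // Roughly equivalent chain: the guard becomes withFilter,
  // the outer generator flatMap, the last generator map.
  val desugared =
    xs.withFilter(x => x > 1).flatMap(x => ys.map(y => x * y))
}
```

Only withFilter, flatMap and map (plus foreach for comprehensions without yield) are needed to express this, which is why the restricted result type offers nothing else.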

OTHER TIPS

In addition to Shadowlands' excellent answer, I would like to give an intuitive example of the difference between filter and withFilter.

Let's consider the following code:

val list = List(1, 2, 3)
var go = true
val result = for(i <- list; if(go)) yield {
   go = false
   i
}

Most people expect result to be equal to List(1). This has been the case since Scala 2.8, because the for-comprehension is translated into:

val result = list withFilter {
  case i => go
} map {
  case i => {
    go = false
    i
  }
}

As you can see, the translation converts the condition into a call to withFilter. Prior to Scala 2.8, for-comprehensions were translated into something like the following:

val r2 = list filter {
  case i => go
} map {
  case i => {
    go = false
    i
  }
}

Using filter, the value of r2 is quite different: List(1, 2, 3). Setting the go flag to false has no effect on the filter, because the filtering has already finished by the time map runs. Since Scala 2.8, this issue is solved by using withFilter: the condition is evaluated each time an element is accessed inside the map method, so the updated flag is seen before the second element is tested.
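The order of evaluation can be made explicit by logging every predicate test and every map call. This is only a sketch: the trace buffers and the object name are invented for illustration, and the interleaving shown is what I would expect for a List on Scala 2.12/2.13/3.

```scala
import scala.collection.mutable.ListBuffer

object EvaluationOrderDemo {
  val xs = List(1, 2, 3)

  // filter: all tests happen first, then all map calls.
  val filterTrace = ListBuffer.empty[String]
  val viaFilter = xs
    .filter { i => filterTrace += s"test $i"; true }
    .map { i => filterTrace += s"map $i"; i }

  // withFilter: each element is tested right before it is mapped.
  val withFilterTrace = ListBuffer.empty[String]
  val viaWithFilter = xs
    .withFilter { i => withFilterTrace += s"test $i"; true }
    .map { i => withFilterTrace += s"map $i"; i }
}
```

With filter the log reads test, test, test, map, map, map; with withFilter it alternates test, map, test, map, which is why a side effect inside map can influence later tests.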

References:

  • Scala in Action (covers Scala 2.10), Nilanjan Raychaudhuri, Manning Publications, p. 120
  • Odersky's thoughts about for-comprehension translation

The main reason why forall/exists aren't implemented is the intended use case:

  • you can lazily apply withFilter to an infinite stream/iterable
  • you can lazily apply another withFilter (and again and again)

To implement forall/exists we would need to start traversing the elements (potentially all of them, which never terminates on an infinite source), losing the laziness.

So for example:

import scala.collection.AbstractIterator

class RandomIntIterator extends AbstractIterator[Int] {
  val rand = new java.util.Random
  def next(): Int = rand.nextInt()
  def hasNext: Boolean = true
}

// rand_integers is an infinite iterator of random integers
val rand_integers = new RandomIntIterator

val rand_naturals = 
    rand_integers.withFilter(_ > 0)

val rand_even_naturals = 
    rand_naturals.withFilter(_ % 2 == 0)

println(rand_even_naturals.map(identity).take(10).toList)

// calling it a second time, we get
// another ten random even naturals
println(rand_even_naturals.map(identity).take(10).toList)

Note that rand_even_naturals.map(identity).take(10) is still an iterator. Only when we call toList are the random numbers generated and filtered in a chain.

Note that map(identity) is equivalent to map(i => i). It is used here to convert a withFilter object back to the original type (e.g. a collection, a stream, an iterator).
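A small sketch of that conversion on a plain List, with invented names. The exact type of the withFilter result differs between Scala versions, so the example only pins down the type after map(identity).

```scala
object RestoreTypeDemo {
  // The result of withFilter is a restricted wrapper, not a List:
  // it offers map, flatMap, foreach and withFilter only.
  val restricted = List(1, 2, 3).withFilter(_ > 1)

  // map(identity) runs the filter and rebuilds an ordinary List.
  val restored: List[Int] = restricted.map(identity)
}
```

Once the value is a List again, the full collection API (forall, exists, take, and so on) is available.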

For the forall/exists part:

someList.filter(conditionA).forall(conditionB)

would be the same as the following (though a little less intuitive):

!someList.exists(x => conditionA(x) && !conditionB(x))

Similarly, .filter(conditionA).exists(conditionB) can be combined into a single check: someList.exists(x => conditionA(x) && conditionB(x)).
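These equivalences can be checked on a small example. The list and the two conditions below are arbitrary choices made for this sketch.

```scala
object FusedPredicateDemo {
  val someList = List(1, 2, 3, 4, 5, 6)
  val conditionA: Int => Boolean = _ % 2 == 0 // keep even numbers
  val conditionB: Int => Boolean = _ > 2

  // filter + forall, and two single-pass formulations of the same check.
  val forallViaFilter = someList.filter(conditionA).forall(conditionB)
  val forallFused = someList.forall(x => !conditionA(x) || conditionB(x))
  val forallViaExists = !someList.exists(x => conditionA(x) && !conditionB(x))

  // filter + exists, and the single-pass formulation.
  val existsViaFilter = someList.filter(conditionA).exists(conditionB)
  val existsFused = someList.exists(x => conditionA(x) && conditionB(x))
}
```

The fused versions need no intermediate collection and still short-circuit, so for forall/exists there is little left for withFilter to optimise.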

Using for/yield can be a workaround, for example:

for {
  e <- col
  if e.nonEmpty
} yield e.head

As a workaround, you can implement other functions with only map and flatMap.
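One possible way to do this, sketched with hypothetical helper names (filteredExists and filteredForall are not library methods): map each surviving element to a Boolean and inspect the result. Unlike the built-in exists/forall, this version does not short-circuit and builds a list of Booleans, so treat it as an illustration rather than a recommendation.

```scala
object ViaMapDemo {
  // Hypothetical helper: does any element that passes `keep` satisfy `p`?
  def filteredExists[A](xs: List[A])(keep: A => Boolean)(p: A => Boolean): Boolean =
    xs.withFilter(keep).map(p).contains(true)

  // Hypothetical helper: do all elements that pass `keep` satisfy `p`?
  def filteredForall[A](xs: List[A])(keep: A => Boolean)(p: A => Boolean): Boolean =
    !xs.withFilter(keep).map(p).contains(false)
}
```

For instance, ViaMapDemo.filteredExists(List(1, 2, 3, 4))(_ % 2 == 0)(_ > 3) asks whether any even element is greater than 3.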

Moreover, this optimisation is useless on small collections…

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow