Domanda

I just discovered that unapply in my extractor is being called twice for some reason. Anyone know why, and how to avoid it?

val data = List("a","b","c","d","e")

object Uap {
  def unapply( s:String ) = {
    println("S: "+s)
    Some(s+"!")
  }             
}

println( data.collect{ case Uap(x) => x } )

This produces output:

S: a
S: a
S: b
S: b
S: c
S: c
S: d
S: d
S: e
S: e
List(a!, b!, c!, d!, e!)

The final result is fine but in my real program the unapply is non-trivial, so I certainly don't want to call it twice!

È stato utile?

Soluzione

collect takes a PartialFunction as input. PartialFunction defines two key members: isDefinedAt and apply. When collect runs your function, it runs your extractor once to determine if your function isDefinedAt some particular input, and if it is, then it runs the extractor again as part of apply to extract the value.

If there is a trivial way of correctly implementing isDefinedAt, you could implement this yourself by implementing your own PartialFunction explicitly, instead of using the case syntax. or you could do a filter and then map with a total function on the collection (which is essentially what collect is doing by calling isDefinedAt, then apply)

Another option would be to lift the Partial function to a total function. PartialFunction defines lift which turns a PartialFunction[A,B] into a A=>Option[B]. You could use this lifted function (call it fun) to do: data.map(fun).collect { case Some(x) => x }

Altri suggerimenti

Actually, this was addressed in 2.11 as a performance bug:

$ skala
Welcome to Scala version 2.11.0-20130423-194141-5ec9dbd6a9 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_06).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val data = List("a","b","c","d","e")
data: List[String] = List(a, b, c, d, e)

scala> 

scala> object Uap {
     |   def unapply( s:String ) = {
     |     println("S: "+s)
     |     Some(s+"!")
     |   }             
     | }
defined object Uap

scala> 

scala> println( data.collect{ case Uap(x) => x } )
S: a
S: b
S: c
S: d
S: e
List(a!, b!, c!, d!, e!)

See the efficiency notes on applyOrElse.

Here's a version for 2.10, where the issue is easily remedied by extension:

object Test extends App {
  import scala.collection.TraversableLike
  import scala.collection.generic.CanBuildFrom
  import scala.collection.immutable.StringLike

  implicit class Collector[A, Repr, C <: TraversableLike[A, Repr]](val c: C) extends AnyVal {
    def collecting[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
      val b = bf(c.repr)
      c.foreach(pf.runWith(b += _))
      b.result
    }
  }

  val data = List("a","b","c","d","e")

  object Uap {
    def unapply( s:String ) = {
      println("S: "+s)
      s match {
        case "foo" => None
        case _     => Some(s+"!")
      }
    }
  }
  val c = Collector[String, List[String], List[String]](data)
  Console println c.collecting { case Uap(x) => x }
}

Result:

$ scalac -version
Scala compiler version 2.10.1 -- Copyright 2002-2013, LAMP/EPFL

apm@halyard ~/tmp
$ scalac applyorelse.scala ; scala applyorelse.Test
S: a
S: b
S: c
S: d
S: e
List(a!, b!, c!, d!, e!)

Note that this version of Uap is partial:

scala> val data = List("a","b","c","d","e", "foo")
data: List[String] = List(a, b, c, d, e, foo)

scala> data.map{ case Uap(x) => x }
S: a
S: b
S: c
S: d
S: e
S: foo
scala.MatchError: foo (of class java.lang.String)

I think that if the use case is PF, the code should be partial.

Adding to @stew answer, collect is implemented as:

def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
for (x <- this) if (pf.isDefinedAt(x)) b += pf(x)
b.result
}

It uses pf.isDefinedAt(x). Doing scalac -Xprint:typer check.scala (check.scala contains your code). It prints:

....
final def isDefinedAt(x1: String): Boolean = ((x1.asInstanceOf[String]:String): String @unchecked) match {
      case check.this.Uap.unapply(<unapply-selector>) <unapply> ((x @ _)) => true
      case (defaultCase$ @ _) => false
    }

So as you see, it calls unapply here again. This explains why it prints twice i.e. once to check if it is defined and then next when it is already called in `pf(x).

@som-snytt is right. As of Scala 2.11, collect function in TraversableLike is changed to:

def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
foreach(pf.runWith(b += _))
b.result
}

The reason it prints only once is that, internally it calls applyOrElse which checks if it is defined. If yes applies the function there it self (in above case (b += _)). Hence it prints only once.

You can use map instead:

scala>    println( data.map{ case Uap(x) => x } )
S: a
S: b
S: c
S: d
S: e
List(a!, b!, c!, d!, e!)

No idea why it works that why.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top