Question

All maps in this code are mutable maps due to the import statement earlier on in the full code. The nGramGetter.getNGrams(...) method call returns a Map[String, Int].

  def train(files: Array[java.io.File]): Map[Char, Map[Int, Double]] = {
    val scores = Map[Char, Map[Int, Double]]().withDefault( x => Map[Int, Double]().withDefaultValue(0.0)) 

    for{
      i <- 1 to 4
      nGram <- nGramGetter.getNGrams(files, i).filter( x => (x._1.size == 1 || x._2 > 4) && !hasUnwantedChar(x._1) )
      char <- nGram._1
    } scores(char)(i) += nGram._2
    println(scores.size)
    val nonUnigramTotals = scores.mapValues( x => x.values.reduce(_+_)-x(1) )    

    val unigramTotals = scores.mapValues( x => x(1) )

    scores.map( x => x._1 -> x._2.map( y => y._1 -> (if(y._1 > 1) y._2/unigramTotals(x._1) else (y._2-nonUnigramTotals(x._1))/unigramTotals(x._1)) ) )
  }

I have replaced the scores(char)(i) += nGram._2 line with a few print statements (printing the keys, values and individual chars in each key) to check the output, and the method call is NOT returning an empty list. The line that prints the size of scores, however, is printing a zero. I am almost sure I have used exactly this method to populate a frequency map before, but this time, the map always comes out empty. I have changed withDefault to withDefaultValue and passed in the result of the current function literal as the argument. I have tried both withDefault and withDefaultValue with Map[Int, Double](1->0.0,2->0.0,3->0.0,4->0.0). I am a bit of a Scala noob, so maybe I just don't understand something about the language that is causing the problem. Any idea what's wrong?

Solution

The methods withDefault and withDefaultValue do not add anything to the map. They return a map that hands back a default value when a missing key is looked up, but that default is never inserted. Let's remove the syntactic sugar from your statement to see where it goes wrong:

scores(char)(i) += nGram._2
scores(char)(i) = scores(char)(i) + nGram._2
scores.apply(char)(i) = scores.apply(char)(i) + nGram._2
scores.apply(char).update(i, scores.apply(char).apply(i) + nGram._2)

Now, since scores has no entry for char, scores.apply(char) returns a default, a fresh Map[Int, Double]().withDefaultValue(0.0), and that map gets modified. Unfortunately, it never gets stored in scores, because update is only called on the inner map, never on scores itself. Try the code below. It's untested, but it shouldn't be hard to get it to work:

scores(char) = scores(char) // initializes the map for that key, if it doesn't exist
scores(char)(i) += nGram._2
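
For comparison, here is a minimal, self-contained sketch of the same problem and one alternative fix using getOrElseUpdate, which inserts the inner map the first time a key is seen. The names DefaultMapDemo, newInner, broken and fixed are made up for this illustration and are not part of the original code:

```scala
import scala.collection.mutable

object DefaultMapDemo {
  // Hypothetical helper: a fresh inner map whose missing keys read as 0.0.
  def newInner(): mutable.Map[Int, Double] =
    mutable.Map[Int, Double]().withDefaultValue(0.0)

  // Reproduces the problem: the default inner map is updated, then thrown away,
  // because nothing ever calls update on the outer map.
  def broken(): mutable.Map[Char, mutable.Map[Int, Double]] = {
    val scores = mutable.Map[Char, mutable.Map[Int, Double]]().withDefault(_ => newInner())
    scores('a')(1) += 3.0
    scores
  }

  // Stores the inner map on first access, so later updates are kept.
  def fixed(): mutable.Map[Char, mutable.Map[Int, Double]] = {
    val scores = mutable.Map[Char, mutable.Map[Int, Double]]()
    for (count <- Seq(3.0, 2.0)) {
      val inner = scores.getOrElseUpdate('a', newInner())
      inner(1) += count
    }
    scores
  }
}
```

With this sketch, broken() should come back empty, while fixed() should hold a single entry for 'a' whose count for key 1 is the sum of both updates. Note that withDefaultValue on the outer map would not help either: it hands out the same inner map instance for every missing key, so all chars would end up writing into one shared map that is still never stored.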
Licensed under: CC-BY-SA with attribution