Question

I just learned how to convert a list of integers to a map with frequency per bin in Scala:

How to convert list of integers to a map with frequency per bin in scala

However, I am working with a 22 GB file, so I am streaming through the file:

Source.fromFile("test.txt").getLines.filter(x => x.charAt(0) != '#').map(x => x.split("\t")(1)).map(x => x.toInt)

The groupBy function only works on a list, not on an iterator, I guess because it needs all the values in memory. I can't convert the iterator to a list because of the file size.

So an example would be

List(1,2,3,101,330,302).iterator

And how can I go from there to

res1: scala.collection.immutable.Map[Int,Int] = Map(100 -> 1, 300 -> 2, 0 -> 3)

Solution

You may use fold:

val iter = List(1,2,3,101,330,302).iterator

iter.foldLeft(Map[Int, Int]()) { (accum, a) =>
  // Round down to the start of the 100-wide bin, e.g. 330 -> 300
  val key = a / 100 * 100
  accum + (key -> (accum.getOrElse(key, 0) + 1))
}

// scala.collection.immutable.Map[Int,Int] = Map(0 -> 3, 100 -> 1, 300 -> 2)
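
The same fold can be applied directly to the lines of the file, so only the running map of counts is held in memory rather than the values themselves. The following is a minimal sketch, not a definitive implementation: the names BinCounts, parseValues, binCounts and fromFile are made up for illustration, and it assumes the layout from the question (tab-separated lines, an integer in the second column, comment lines starting with '#').

```scala
import scala.io.Source

object BinCounts {
  // Lazily pull the integer out of the second tab-separated column,
  // skipping empty lines and '#' comment lines (assumed file layout).
  def parseValues(lines: Iterator[String]): Iterator[Int] =
    lines
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .map(line => line.split("\t")(1).toInt)

  // Fold the iterator into a map of bin start -> count.
  // Each value is consumed once and then discarded.
  def binCounts(values: Iterator[Int], binWidth: Int = 100): Map[Int, Int] =
    values.foldLeft(Map.empty[Int, Int]) { (counts, value) =>
      val bin = value / binWidth * binWidth
      counts + (bin -> (counts.getOrElse(bin, 0) + 1))
    }

  // Hypothetical entry point: stream a file and close it afterwards.
  def fromFile(path: String): Map[Int, Int] = {
    val source = Source.fromFile(path)
    try binCounts(parseValues(source.getLines()))
    finally source.close()
  }
}
```

Calling BinCounts.fromFile("test.txt") would then return the frequency map for the whole file. Note that integer division truncates toward zero, so this sketch assumes non-negative values; a negative value such as -50 would land in bin 0.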
Licensed under: CC-BY-SA with attribution