Question

Right now we're processing a large amount of JSON data coming from a Mixpanel API. With a small dataset it's a breeze, and the code below runs just fine. However, a large dataset takes a rather long time to process, and we're starting to see timeouts because of it.

My Scala optimization skills are rather poor, so I am hoping someone can show me a faster way to process the following with large datasets. Please do explain why, since it will help my own understanding of Scala.

val people = parse[mp.data.Segmentation](o)     // o holds the raw JSON
val list = people.data.values.map(b =>          // b: (outer key, Map[id, Map[date, count]])
  b._2.map(p =>                                 // p: (id, Map[date, count])
    Map(
      "id" -> p._1,
      "activity" -> p._2.foldLeft(0)(_ + _._2)  // sum this id's per-date counts
    )
  )
)
.flatten
.filter { behavior => behavior("activity") != 0 }
.groupBy(o => o("id"))
.map { case (k, v) => Map("id" -> k, "activity" -> v.map(o => o("activity").asInstanceOf[Int]).sum) }

And the Segmentation class:

case class Segmentation(
  legend_size: Int,
  data: Data
)

case class Data(
  series: List[String],
  values: Map[String, Map[String, Map[String, Int]]]
)

Thanks for your help!

Edit: sample data as requested

{"legend_size": 4, "data": {"series": ["2013-12-17", "2013-12-18", "2013-12-19", "2013-12-20", "2013-12-21", "2013-12-22", "2013-12-23", "2013-12-24", "2013-12-25", "2013-12-26", "2013-12-27", "2013-12-28", "2013-12-29", "2013-12-30", "2013-12-31", "2014-01-01", "2014-01-02", "2014-01-03", "2014-01-04", "2014-01-05", "2014-01-06"], "values": {"afef4ac12a21d5c4ef679c6507fe65cd": {"id:twitter.com:194436690": {"2013-12-20": 0, "2013-12-29": 0, "2013-12-28": 0, "2013-12-23": 0, "2013-12-22": 0, "2013-12-21": 1, "2013-12-25": 0, "2013-12-27": 0, "2013-12-26": 0, "2013-12-24": 0, "2013-12-31": 0, "2014-01-06": 0, "2014-01-04": 0, "2014-01-05": 0, "2014-01-02": 0, "2014-01-03": 0, "2014-01-01": 0, "2013-12-30": 0, "2013-12-17": 0, "2013-12-18": 0, "2013-12-19": 0}, "id:twitter.com:330103796": {"2013-12-20": 0, "2013-12-29": 0, "2013-12-28": 0, "2013-12-23": 0, "2013-12-22": 0, "2013-12-21": 0, "2013-12-25": 0, "2013-12-27": 0, "2013-12-26": 1, "2013-12-24": 0, "2013-12-31": 0, "2014-01-06": 0, "2014-01-04": 0, "2014-01-05": 0, "2014-01-02": 0, "2014-01-03": 0, "2014-01-01": 0, "2013-12-30": 0, "2013-12-17": 0, "2013-12-18": 0, "2013-12-19": 0}, "id:twitter.com:216664121": {"2013-12-20": 0, "2013-12-29": 0, "2013-12-28": 0, "2013-12-23": 1, "2013-12-22": 0, "2013-12-21": 0, "2013-12-25": 0, "2013-12-27": 0, "2013-12-26": 0, "2013-12-24": 0, "2013-12-31": 0, "2014-01-06": 0, "2014-01-04": 0, "2014-01-05": 0, "2014-01-02": 0, "2014-01-03": 0, "2014-01-01": 0, "2013-12-30": 0, "2013-12-17": 0, "2013-12-18": 0, "2013-12-19": 0}, "id:twitter.com:414117608": {"2013-12-20": 0, "2013-12-29": 0, "2013-12-28": 1, "2013-12-23": 0, "2013-12-22": 0, "2013-12-21": 0, "2013-12-25": 0, "2013-12-27": 0, "2013-12-26": 0, "2013-12-24": 0, "2013-12-31": 0, "2014-01-06": 0, "2014-01-04": 0, "2014-01-05": 0, "2014-01-02": 0, "2014-01-03": 0, "2014-01-01": 0, "2013-12-30": 0, "2013-12-17": 0, "2013-12-18": 0, "2013-12-19": 0}}}}}

To answer Millhouse's question: the intention is to sum the counts across all dates to produce a single number describing the total volume of "activity" for each ID. The "ID" is formatted as id:twitter.com:923842 (see the expected output below).
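For the sample above, each of the four ids has exactly one day with a count of 1, so the desired result would be something like:

// Expected output for the sample data: every id's activity sums to 1
List(
  Map("id" -> "id:twitter.com:194436690", "activity" -> 1),
  Map("id" -> "id:twitter.com:330103796", "activity" -> 1),
  Map("id" -> "id:twitter.com:216664121", "activity" -> 1),
  Map("id" -> "id:twitter.com:414117608", "activity" -> 1)
)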


Solution

I don't know the full extent of your processing, what pipelines you have going on, what stress your server is under, or what sort of threading profile you've set up to receive the information. However, assuming that you've correctly separated I/O from CPU-bound tasks, and that what you've shown us is strictly CPU-bound, try simply adding .par to the very first Map:

people.data.values.par.map(b =>

as a first pass to see if you can get some performance gains. I don't see any specific ordering required of the processing, which tells me it's ripe for parallelization.
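For example, here is the same pipeline with only that one change (a minimal sketch, assuming the Segmentation shape from the question; the trailing .seq just converts the result back to a sequential collection):

// The original pipeline, unchanged except for .par: every downstream step
// (map, flatten, filter, groupBy) then runs on parallel collections.
val list = people.data.values.par.map(b =>
  b._2.map(p =>
    Map(
      "id" -> p._1,
      "activity" -> p._2.foldLeft(0)(_ + _._2)
    )
  )
)
.flatten
.filter { behavior => behavior("activity") != 0 }
.groupBy(o => o("id"))
.map { case (k, v) => Map("id" -> k, "activity" -> v.map(o => o("activity").asInstanceOf[Int]).sum) }
.seq  // drop back to a sequential collection at the end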

Edit

After playing around with parallelization, I would add that modifying the TaskSupport is helpful in this case. You can modify a parallelized collection's tasksupport like this:

import scala.collection.parallel._
val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(2))  // cap this collection at 2 worker threads

See http://www.scala-lang.org/api/2.10.3/index.html#scala.collection.parallel.TaskSupport
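Applied to this case, that could look like the following sketch (the pool size of 2 is only an example to tune against your server's core count, and parValues is a name I've made up):

import scala.collection.parallel._

// Sketch: give the pipeline its own explicitly sized pool instead of the
// default global one; everything chained off parValues inherits it.
val parValues = people.data.values.par
parValues.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(2))
// ...then run the same map/flatten/filter/groupBy chain on parValues.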

OTHER TIPS

I have some suggestions that might help.

  1. I would try to move the filter call as early in the program as possible. Since your data contains many dates with zero activity, you would see improvements by doing this. The best solution would be to test for this while parsing the JSON data; if that is not possible, make it the first statement (the sketch after this list combines this tip with tip 2).

  2. The way I understand it, you would like to end up with a way to look up an aggregate of the sums for a given id. I would suggest you represent this with a map from the id to the aggregate. Also, the Scala List class has a sum method. I came up with this code:

    // field `values`, then Map#values, then flatten to (id, dates) pairs
    val originalList_IdToAggregate = people.data.values.values.flatten.map(p => (p._1, p._2.values.sum))

    It might not match your project directly, but I think it is almost what you need. If you need a Map out of this, just append toMap at the end.

  3. If this doesn't give you enough speed, you could create your own parser that aggregates and filters while parsing only this kind of JSON. Writing parsers is quite easy in Scala if you are using the parser combinators. Just keep in mind to throw away what you don't need as early as possible, and not to build too many deep branches; this should be a fast solution with a low memory footprint.

  4. As for going parallel, this can be a good idea. I don't know enough about your application to tell you what the best way is, but it might be possible to hide the computational cost of processing the data under the cost of transporting it. Try to balance parsing and I/O over multiple threads and see if you can achieve this.
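Putting tips 1 and 2 together, here is a minimal sketch (assuming the Segmentation and Data classes from the question, and that the same id can appear under several outer keys, as the question's groupBy implies):

// Sketch combining tips 1 and 2: aggregate each id's dates with sum and
// drop inactive ids as early as possible, before groupBy ever sees them.
val idToAggregate: Map[String, Int] =
  people.data.values                    // Map[segment, Map[id, Map[date, count]]]
    .valuesIterator
    .flatMap(_.iterator)                // (id, Map[date, count]) pairs, lazily
    .map { case (id, days) => (id, days.valuesIterator.sum) }
    .filter(_._2 != 0)                  // tip 1: filter before grouping
    .toList
    .groupBy(_._1)                      // merge ids that occur in several segments
    .map { case (id, perSegment) => id -> perSegment.map(_._2).sum }

This produces the id -> total-activity map directly, without building an intermediate Map[String, Any] per entry.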

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow