SparkPi running slow with more than 1 slice

Question 1

The random function used in sparkpi example is a synchronized method and can't scale to multiple cores. It's an easy enough example to deploy on your cluster but don't use it to check Spark's performance and scalability.

Question 2

As Ahsan mentioned in his answer, the problem was with 'scala.math.random'. I have replaced it with 'org.apache.spark.util.random.XORShiftRandom', and now using multiple processors makes the Pi calculations to run much faster. Below is my code, which is a modified version of SparkPi example from Spark distribution:

// scalastyle:off println
package org.apache.spark.examples

import org.apache.spark.util.random.XORShiftRandom

import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi").setMaster(args(0))
    val spark = new SparkContext(conf)
    val slices = if (args.length > 1) args(1).toInt else 2
    val n = math.min(100000000L * slices, Int.MaxValue).toInt // avoid overflow
    val rand = new XORShiftRandom()

    val count = spark.parallelize(1 until n, slices).map { i =>
        val x = rand.nextDouble * 2 - 1
        val y = rand.nextDouble * 2 - 1
        if (x*x + y*y < 1) 1 else 0
      }.reduce(_ + _)

    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
// scalastyle:on println

When I run the program above using one core with parameters 'local[1] 16' it takes about 60 seconds on my laptop. Same program using 8 cores ('local[*] 16') it takes 17 seconds.