Question

I'm trying to use the takeSample() function in Spark. Its parameters are the data, the number of samples to take, and the seed, but I don't want to use a seed: I want a different result every time, and I can't figure out how to do that. I tried using System.nanoTime as the seed value, but it gave an error, I think because the data type didn't match. Is there a function similar to takeSample() that can be used without a seed? Or is there some other way to call takeSample() so that I get a different output every time?


Solution

System.nanoTime returns a Long, but the seed expected by takeSample is an Int. Hence, takeSample(..., System.nanoTime.toInt) should work. The toInt conversion keeps only the low 32 bits of the value, which is fine for a seed.
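A minimal sketch of the conversion, in plain Scala with no Spark dependency. The object and method names (SeedFromClock, narrow, intSeed) are made up for illustration, and the commented takeSample call assumes an RDD named rdd and the older signature that takes an Int seed.

```scala
object SeedFromClock {
  // Long -> Int narrowing: toInt keeps only the low 32 bits.
  def narrow(n: Long): Int = n.toInt

  // A fresh Int seed derived from the clock on every call.
  def intSeed(): Int = narrow(System.nanoTime)

  def main(args: Array[String]): Unit = {
    val seed = intSeed()
    println(s"seed = $seed")
    // With a hypothetical RDD named rdd, the call would then look like:
    //   rdd.takeSample(false, 5, seed)
  }
}
```

Because only the low bits survive, two seeds taken a few nanoseconds apart still differ, which is all that is needed here.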

OTHER TIPS

System.nanoTime returns Long, whereas takeSample expects an Int.
You can pass scala.util.Random.nextInt as the seed value to the takeSample function instead.

As of Spark version 1.0.0, the seed parameter is optional. See https://issues.apache.org/jira/browse/SPARK-1438.
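A rough sketch of what omitting the seed looks like, assuming Spark 1.0.0 or later with spark-core on the classpath and a local master. The object name, helper name, app name, and sample data are placeholders chosen for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeSampleNoSeed {
  // Draw n elements without replacement, letting Spark choose the seed.
  def sample(sc: SparkContext, n: Int): Array[Int] = {
    val rdd = sc.parallelize(1 to 100)
    rdd.takeSample(withReplacement = false, num = n)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("take-sample-no-seed") // placeholder app name
      .setMaster("local[*]")             // assumption: run locally
    val sc = new SparkContext(conf)
    try {
      // Each call gets its own default seed, so the two samples will usually differ.
      println(sample(sc, 5).mkString(", "))
      println(sample(sc, 5).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

On versions where the seed is optional, this avoids the Long-to-Int conversion entirely.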

Licensed under: CC-BY-SA with attribution