Question

I'm trying to use the takeSample() function in Spark. Its parameters are the data, the number of samples to take, and the seed, but I don't want to fix the seed: I want a different result every time, and I can't figure out how to do that. I tried passing System.nanoTime as the seed value, but it gave an error, I think because the data type didn't match. Is there a function similar to takeSample() that can be used without a seed? Or is there another way to call takeSample() so that I get a different output every time?

Solution

System.nanoTime returns a Long, but the seed expected by takeSample is an Int. Hence, takeSample(..., System.nanoTime.toInt) should work.
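
To make the conversion concrete, here is a minimal sketch in plain Scala (no Spark needed) of what .toInt does to the nanosecond counter. The object and method names (SeedFromClock, toIntSeed, freshSeed) are made up for illustration, and the takeSample call in the comment assumes an RDD named rdd is in scope.

```scala
object SeedFromClock {
  // Long.toInt keeps only the low 32 bits, so the result is a valid
  // (possibly negative) Int that can serve as a seed.
  def toIntSeed(nanos: Long): Int = nanos.toInt

  // Hypothetical helper: derive a fresh Int seed from the current clock reading.
  def freshSeed(): Int = toIntSeed(System.nanoTime)

  def main(args: Array[String]): Unit = {
    val a = freshSeed()
    Thread.sleep(1) // let the clock advance between the two readings
    val b = freshSeed()
    println(s"seed a = $a, seed b = $b")
    // With an RDD in scope (assumed name `rdd`), the call from the answer becomes:
    //   rdd.takeSample(false, 10, freshSeed())
  }
}
```

Because only the low 32 bits survive, two readings taken an exact multiple of 2^32 nanoseconds apart (about 4.3 seconds) would give the same seed. That is unlikely to matter in practice, but it is the trade-off of truncating.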

Other answers

System.nanoTime returns Long, whereas takeSample expects an Int.
You can feed scala.util.Random.nextInt as a seed value to the takeSample function.

As of Spark version 1.0.0, the seed parameter is optional. See https://issues.apache.org/jira/browse/SPARK-1438.
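
As a rough sketch of what that looks like on Spark 1.0.0 or later, the example below calls takeSample twice without a seed. It assumes spark-core is on the classpath and runs against a local master. The object name, app name, and sample data are invented for the illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeSampleNoSeed {
  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for this sketch.
    val conf = new SparkConf().setAppName("takeSample-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 1000)
      // No seed argument: Spark falls back to its own randomly chosen default,
      // so the two samples will generally differ from each other and between runs.
      val first = rdd.takeSample(withReplacement = false, num = 5)
      val second = rdd.takeSample(withReplacement = false, num = 5)
      println(first.mkString(", "))
      println(second.mkString(", "))
    } finally {
      sc.stop() // always release the local Spark context
    }
  }
}
```

In those versions the seed parameter is, as far as I know, declared as a Long, so System.nanoTime can also be passed directly without .toInt if an explicit seed is still wanted.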

Licensed under: CC-BY-SA with attribution