Question

There is a data mining competition website called numer.ai.

Presumably, a hedge fund behind the website makes use of the predictions that people submit. People in the top 100 places continuously earn money until the next dataset is released and the competition resets.

What I don't understand is this: websites like Kaggle guard against overfitting by keeping a public and a private leaderboard. The private leaderboard is revealed only at the end of the competition, and only then are prizes handed out.

Numerai says under rules that it uses the same approach. Quoting:

If models overfit the public leaderboard and perform poorly on the private leaderboard, those users will suffer penalties potentially eradicating all gains. This discourages overfitting.

What do they mean by "penalties" being applied? I know someone who is making money on that website by continuously playing the public leaderboard. Does this mean that, if he is overfitting and does not withdraw his bitcoin, he is in danger of losing what he has earned?

Solution

I found the answer in the comments section of their blog:

Earnings listed on the public-score leaderboard are potential winnings. Actual winnings are determined by the private-score leaderboard. At the time of withdrawal and when new datasets are released, your actual winnings will be revealed to you.
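
To see why winnings shown on the public leaderboard are only potential, here is a minimal sketch of what happens when someone keeps submitting until the public score looks good. It is a toy simulation and not Numerai's actual scoring: the function names, the row counts, and the use of plain accuracy as the metric are all assumptions made for illustration. Every "model" is a coin flip, so none of them has real skill.

```python
import random

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def simulate(n_submissions=200, n_public=100, n_private=1000, seed=0):
    """Hypothetical setup: score many skill-free submissions on both splits."""
    rng = random.Random(seed)
    public_labels = [rng.randint(0, 1) for _ in range(n_public)]
    private_labels = [rng.randint(0, 1) for _ in range(n_private)]
    scores = []
    for _ in range(n_submissions):
        # Each "model" is a pure coin flip, so it has no real skill.
        public_preds = [rng.randint(0, 1) for _ in range(n_public)]
        private_preds = [rng.randint(0, 1) for _ in range(n_private)]
        scores.append((accuracy(public_preds, public_labels),
                       accuracy(private_preds, private_labels)))
    # Keep whichever submission looks best on the public leaderboard.
    best = max(scores, key=lambda s: s[0])
    return best, scores

best, scores = simulate()
print(f"public: {best[0]:.3f}  private: {best[1]:.3f}")
```

The selected submission's public score is the maximum over many lucky draws, so it tends to sit well above 50%, while its private score tends to stay close to 50% because the selection never saw the private rows. That gap is the reason a payout tied to the private score can differ from what the public leaderboard suggests.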

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange