Question

I've started reading more about statistical learning theory, specifically this paper, and I cannot understand the following part:

It turns out the conditions required to render empirical risk minimization consistent involve restricting the set of admissible functions. The main insight of VC (Vapnik-Chervonenkis) theory is that the consistency of empirical risk minimization is determined by the worst case behavior over all functions f ∈ F that the learning machine could choose. We will see that instead of the standard law of large numbers introduced above, this worst case corresponds to a version of the law of large numbers which is uniform over all functions in F.

The part that is in bold is what I do not get, specifically: "Consistency of ERM is determined by the worst case behavior over all functions f ∈ F that the learning machine could choose." What exactly is meant by worst-case behavior? I tried looking into VC theory and could not find the answer.


Solution

VC theory is concerned with bounding the gap between the empirical risk and the true (expected) risk. The "worst case" is therefore the function in F for which the difference between these two risks is largest: if even that function's empirical risk stays close to its true risk, then empirical risk minimization is consistent.
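In symbols (standard notation, not quoted from the paper): write $R(f) = \mathbb{E}[L(f(X), Y)]$ for the true risk and $R_{\mathrm{emp}}(f) = \frac{1}{n}\sum_{i=1}^{n} L(f(x_i), y_i)$ for the empirical risk on an i.i.d. sample of size $n$. The standard law of large numbers says $R_{\mathrm{emp}}(f) \to R(f)$ for each single fixed $f$. Consistency of ERM instead requires the uniform version,

$$\sup_{f \in \mathcal{F}} \big| R_{\mathrm{emp}}(f) - R(f) \big| \xrightarrow{\;P\;} 0 \quad \text{as } n \to \infty,$$

which is exactly a statement about the worst-case function. The ERM solution is itself picked using the sample, so it tends to be the function whose empirical risk is most optimistically below its true risk; only the uniform (worst-case) bound guarantees that even that function's empirical risk is close to its true risk.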

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange