How to distinguish between different values of a hyperparameter in communication?

https://datascience.stackexchange.com/questions/80045

13-12-2020

Question

In machine learning, a hyperparameter is a parameter whose value is used to control the learning process.

If we go by the definition of parameter in "What's the difference between an argument and a parameter?", then a hyperparameter is not itself a value. Rather, it is the specification for a value (for example, "this value is a learning rate" or "this value is a dropout rate"). This also seems to agree with the way people use "hyperparameter" in communication.

I often want to distinguish between particular values of a hyperparameter. Going by the question I linked, I could say something like "hyperargument". For example, the hyperargument 0.001 is different from the hyperargument 0.01, even though both belong to the same hyperparameter (say, learning rate). This is what I do for variable names in my own codebase. For example, if I had this dict:

hyperarguments = {
    "learning_rate": 0.001,
    "dropout": 0.25,
}

I would iterate over it like this:

for hyperparameter, hyperargument in hyperarguments.items():
    ...

Is there a more common terminology to emphasize this distinction of values?

Solution

TL;DR: You have already answered your own question with your quote from Wikipedia: the common term is "value".

The Stack Overflow question you linked discusses the terminology from a software development perspective. But the word "parameter" is used differently in software development and in machine learning. This is how Wikipedia defines a parameter from a software development perspective:

In computer programming, a parameter or a formal argument, is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments (often called actual arguments or actual parameters) with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

Here is an example:

from sklearn.ensemble import RandomForestRegressor

n_estimators = 100
model = RandomForestRegressor(n_estimators=n_estimators)

From a software development perspective, n_estimators is a parameter, 100 is the parameter's value, and n_estimators=n_estimators is the argument.
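The software-side distinction can be made concrete with Python's standard `inspect` module. This is only a sketch: `make_model` is a hypothetical stand-in for a model constructor, not scikit-learn's API, and its parameter names and defaults are assumptions chosen for illustration.

```python
import inspect


def make_model(n_estimators=10, max_depth=None):
    """Hypothetical stand-in for a model constructor (not a real library API)."""
    return {"n_estimators": n_estimators, "max_depth": max_depth}


signature = inspect.signature(make_model)

# Parameters are the names declared in the function definition.
parameter_names = list(signature.parameters)

# Arguments are what one particular call supplies for those names.
bound = signature.bind(n_estimators=100)
arguments = dict(bound.arguments)

print(parameter_names)
print(arguments)
```

The parameter list is fixed by the definition, while the bound arguments change from call to call. That is the software usage of the two words.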

However, in machine learning terminology you'd consider n_estimators a hyperparameter and 100 its value. Going back to your examples, I would just do something like this:

hyperparameter_dict = {
    "learning_rate": 0.001,
    "dropout": 0.25,
}

and

for hyperparameter, value in hyperparameter_dict.items():
    ...
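The same naming carries over when several values of one hyperparameter are in play, for example in a small grid search. The following is a minimal sketch using only the standard library; the name `hyperparameter_grid` and the candidate values are illustrative assumptions, not part of the question.

```python
from itertools import product

# Each hyperparameter maps to the candidate values to try (illustrative only).
hyperparameter_grid = {
    "learning_rate": [0.001, 0.01],
    "dropout": [0.25, 0.5],
}

hyperparameters = list(hyperparameter_grid)

# One configuration = one value chosen for every hyperparameter.
configurations = [
    dict(zip(hyperparameters, values))
    for values in product(*hyperparameter_grid.values())
]

for configuration in configurations:
    for hyperparameter, value in configuration.items():
        print(f"{hyperparameter}={value}")
```

Here "learning rate 0.001" and "learning rate 0.01" are simply two values of the same hyperparameter, so no extra term is needed to tell them apart.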

Introducing new terminology that is not common in machine learning (like "hyperarguments") would only cause confusion and would not make your code or communication any clearer.

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange