How to formulate the reward of an RL agent with two objectives
02-11-2019
Question
I have started learning reinforcement learning and am trying to apply it to my use case. I am developing an RL agent that should maintain temperature at a particular value and minimize the energy consumption of equipment by choosing among the actions available to it.
I am trying to formulate a reward function for it.
energy and temp_act can be measured.

import numpy as np

energy_coeff = -10   # weight on energy consumption (negative: penalty)
temp_coeff = -10     # weight on temperature deviation (negative: penalty)
temp_penalty = np.abs(temp_setpoint - temp_act)  # deviation from setpoint
reward = energy_coeff * energy + temp_coeff * temp_penalty
This is the reward function I am using, but intuitively I feel it could be better, because the absolute values of energy and temp_penalty are on different scales. How do I take the scaling problem into account when structuring a reward?
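One common way to handle the scaling issue is to normalize each term to a shared range before weighting. The sketch below assumes hypothetical bounds (`ENERGY_MAX`, `TEMP_ERR_MAX`) that would in practice come from the equipment's specs or observed data; the function name and weights are illustrative, not from the original post.

```python
import numpy as np

# Hypothetical bounds -- these are assumptions, not measured values;
# in practice they would come from equipment specs or logged data.
ENERGY_MAX = 5.0      # assumed maximum energy use per step
TEMP_ERR_MAX = 10.0   # assumed largest plausible setpoint deviation

def reward(energy, temp_act, temp_setpoint, w_energy=0.5, w_temp=0.5):
    """Weighted sum of two penalties, each normalized to [0, 1]."""
    energy_norm = np.clip(energy / ENERGY_MAX, 0.0, 1.0)
    temp_norm = np.clip(abs(temp_setpoint - temp_act) / TEMP_ERR_MAX, 0.0, 1.0)
    # Both terms now share a scale, so the weights alone set the trade-off.
    return -(w_energy * energy_norm + w_temp * temp_norm)

print(reward(energy=2.5, temp_act=22.0, temp_setpoint=21.0))
```

With both terms in [0, 1], the weights `w_energy` and `w_temp` directly express the relative importance of the two objectives instead of being confounded with their units.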
No correct solution
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange