Question

I have started learning reinforcement learning and am trying to apply it to my use case. I am developing an RL agent that maintains the temperature at a particular setpoint while minimizing the energy consumption of the equipment, by choosing among the actions available to it.

I am trying to formulate a reward function for it.

Both energy and temp_act can be measured.

energy_coeff = -10
temp_coeff = -10

temp_penalty = np.abs(temp_setpoint - temp_act)

reward = energy_coeff * energy + temp_coeff * temp_penalty

This is the reward function I am using, but intuitively I feel it could be better, because the absolute values of energy and temp_penalty are on different scales. How do I take this scaling problem into account when structuring a reward?
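To make the scaling concern concrete, here is a minimal sketch of one possible approach: divide each term by a rough estimate of its range so both land in about [0, 1] before weighting. The constants ENERGY_MAX and TEMP_TOL, the function name scaled_reward, and the two weights are assumptions made up for illustration; they are not from my setup and would need to be chosen from real measurements.

```python
import numpy as np

# Hypothetical scales, assumed for illustration only.
ENERGY_MAX = 5.0  # largest energy reading expected in one step
TEMP_TOL = 2.0    # temperature error beyond which the penalty saturates


def scaled_reward(energy, temp_act, temp_setpoint,
                  energy_weight=0.5, temp_weight=0.5):
    """Return a reward in [-1, 0] when the two weights sum to 1."""
    # Bring each term to a comparable [0, 1] range before weighting.
    energy_term = np.clip(energy / ENERGY_MAX, 0.0, 1.0)
    temp_term = np.clip(np.abs(temp_setpoint - temp_act) / TEMP_TOL, 0.0, 1.0)
    # Both terms are costs, so the reward is their negated weighted sum.
    return -(energy_weight * energy_term + temp_weight * temp_term)
```

With this form the weights express only the relative importance of energy versus temperature tracking, instead of also having to compensate for the units of each measurement.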

No accepted answer

Licensed under: CC-BY-SA with attribution