How to formulate the reward of an RL agent with two objectives

https://datascience.stackexchange.com/questions/60303

reinforcement-learning
q-learning
monte-carlo
dqn
discounted-reward

02-11-2019

Question

I have started learning reinforcement learning and am trying to apply it to my use case. I am developing an RL agent that should maintain the temperature at a particular value while minimizing the energy consumption of the equipment, by choosing among the actions available to it.

I am trying to formulate a reward function for it.

The values energy and temp_act can be measured.

import numpy as np

energy_coeff = -10
temp_coeff = -10

temp_penalty = np.abs(temp_setpoint - temp_act)

reward = energy_coeff * energy + temp_coeff * temp_penalty

This is the reward function I am using, but intuitively I feel it could be better, because the absolute values of energy and temp_penalty are on different scales. How do I take this scaling problem into account when structuring the reward?
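To make the scaling concern concrete, here is a minimal sketch of one possible approach: divide each term by an assumed maximum so both fall in [0, 1] before weighting them. The constants ENERGY_MAX and TEMP_ERR_MAX, the function name scaled_reward, and the default weights are hypothetical and not part of the original post; they would have to be chosen from the actual equipment and measurements.

```python
import numpy as np

# Hypothetical normalization constants (assumptions, not from the post).
ENERGY_MAX = 5.0      # assumed largest energy reading per step
TEMP_ERR_MAX = 10.0   # assumed largest temperature deviation of interest


def scaled_reward(energy, temp_act, temp_setpoint,
                  energy_weight=0.5, temp_weight=0.5):
    """Weighted sum of two penalties, each first mapped into [0, 1]."""
    # Normalize each raw quantity by its assumed maximum and clip outliers.
    energy_term = np.clip(energy / ENERGY_MAX, 0.0, 1.0)
    temp_term = np.clip(np.abs(temp_setpoint - temp_act) / TEMP_ERR_MAX, 0.0, 1.0)
    # Both terms are penalties, so the reward is their negated weighted sum.
    return -(energy_weight * energy_term + temp_weight * temp_term)


# Example call with made-up readings.
print(scaled_reward(energy=2.5, temp_act=21.0, temp_setpoint=22.0))
```

With both terms on the same [0, 1] scale, the two weights express the relative priority of the objectives directly, and the reward stays within [-1, 0] as long as the weights sum to 1.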

No accepted answer

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange