Question

I am working on the power management of a system. The objectives I want to minimize are power consumption and average latency. I combine them into a single objective function, a linearly weighted sum of the two objectives:

C = w * P_avg + (1 - w) * L_avg,      where w is in (0, 1)

I am using Q-learning to find a Pareto-optimal trade-off curve by varying the weight w, which sets different preferences between power consumption and average latency. I do obtain a Pareto-optimal curve. My objective now is to specify a constraint (e.g., on the average latency L_avg) and then find the value of w that meets it. Mine is an online algorithm, so the tuning of w should also take place online.

Could anyone provide hints or suggestions in this regard?

Solution

There is a branch of the reinforcement learning community that works on multi-objective reinforcement learning.

The idea is to [1]:

assign a family of agents to each objective. The solutions obtained by the agents in one family are compared with the solutions obtained by the agents from the rest of the families. A negotiation mechanism is used to find compromise solutions satisfying all the objectives.

There is also a paper that might be of interest to you:

Multi-objective optimization by reinforcement learning for power system dispatch and voltage stability.

I did not find a public URL for it, though.
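
For the online tuning of w itself, one simple option is a feedback rule that lowers w (less emphasis on power) whenever the running average latency exceeds the target, and raises it otherwise. This is not taken from the paper above; it is only a minimal sketch. The class name `WeightTuner`, its parameters, and the step sizes are hypothetical, and it assumes that power and latency are normalized to comparable scales and that a smaller w actually leads the Q-learning agent to lower latency.

```python
class WeightTuner:
    """Hypothetical helper: nudges w so the average latency tracks a target."""

    def __init__(self, latency_target, w_init=0.5, step=0.01,
                 smoothing=0.1, w_min=0.01, w_max=0.99):
        self.latency_target = latency_target
        self.w = w_init
        self.step = step            # how strongly w reacts to a violation
        self.smoothing = smoothing  # factor of the exponential moving average
        self.w_min = w_min          # keep w strictly inside (0, 1)
        self.w_max = w_max
        self.latency_avg = None     # running estimate of L_avg

    def update(self, latency_sample):
        """Feed one latency measurement and return the adjusted w."""
        if self.latency_avg is None:
            self.latency_avg = latency_sample
        else:
            self.latency_avg += self.smoothing * (latency_sample - self.latency_avg)
        # Positive violation: latency too high, so shift weight away from power.
        violation = self.latency_avg - self.latency_target
        self.w -= self.step * violation
        self.w = min(self.w_max, max(self.w_min, self.w))
        return self.w

    def cost(self, power, latency):
        """Scalarized cost C = w * P + (1 - w) * L from the question."""
        return self.w * power + (1.0 - self.w) * latency
```

In use, `update` would be called once per control interval with the measured latency, and `cost` (negated) would serve as the reward for the next Q-learning step. Because the reward changes whenever w moves, the step size should probably be small relative to the agent's learning rate so that the Q-values have time to follow.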

Licensed under: CC-BY-SA with attribution