Question

When policy optimization is discussed, people usually refer to the following picture, in which it is linked to DFO/Evolution and Policy Gradients.

[Picture]

I would like to know whether it is correct to say: policy optimization learns policies that choose better actions with higher probability?

Also, where does Proximal Policy Optimization (PPO) fit in the picture?
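To make the first question concrete, here is a minimal sketch of what "better actions with higher probability" could mean in the simplest case. It assumes a hypothetical 3-armed bandit with fixed rewards and a softmax policy, updated by gradient ascent on the expected reward using the exact policy gradient (no sampling). The reward values, learning rate, step count, and function names are all assumptions made for illustration; this is not PPO and not taken from the picture.

```python
import math


def softmax(logits):
    """Turn logits into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, rewards, lr):
    """One gradient-ascent step on the expected reward J = sum_a pi_a * r_a.

    For a softmax policy, dJ/dlogit_a = pi_a * (r_a - J), so actions with
    above-average reward get their logit (and probability) pushed up.
    """
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    grads = [p * (r - expected_reward) for p, r in zip(probs, rewards)]
    return [x + lr * g for x, g in zip(logits, grads)]


# Hypothetical rewards: action 1 is the best one.
rewards = [0.2, 1.0, 0.5]
logits = [0.0, 0.0, 0.0]  # start from a uniform policy

initial_probs = softmax(logits)
for _ in range(500):
    logits = policy_gradient_step(logits, rewards, lr=0.5)
final_probs = softmax(logits)

print("initial policy:", initial_probs)
print("final policy:  ", final_probs)
```

Under these assumptions the policy starts uniform and ends up putting most of its probability on the highest-reward action, which is the behaviour the statement describes. Real policy-gradient methods estimate this gradient from sampled trajectories rather than computing it exactly.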

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange