
When training a reinforcement learning model with PPO, how do I evaluate the model's quality and judge whether it has converged?

2022-02-02 15:41:06 CSDN Q & A

Recently we have been training a model using reinforcement learning with PPOPolicy. The final output is shown in the figure. How can I judge whether the model has converged, and which metrics should I use to judge how good or bad the model is?

[Figure: final training output]

Copyright notice
Author: CSDN Q & A. Please include the original link when reprinting, thank you.
https://en.primo.wiki/2022/02/202202021541043367.html
