1 Matching Annotations
  1. Jul 2023
    1. Paper that introduced the PPO algorithm. PPO is, in a way, a response to the TRPO algorithm, trying to use the core idea but implement a more efficient and simpler algorithm.

      TRPO defines the problem as a straight optimization problem, no learning is actually involved.