Hypothesis

2 Matching Annotations

Jul 2023
proceedings.mlr.press

Deterministic Policy Gradient Algorithms

1. mark.crowley 10 Jul 2023
  
  in Public
  
  This paper introduced the Deterministic Policy Gradient (DPG) algorithm.
  
  DPG reinforcement-learning

URL

proceedings.mlr.press/v32/silver14.pdf
arxiv.org

Continuous control with deep reinforcement learning

1. mark.crowley 10 Jul 2023
  
  in Public
  
  This paper introduces the DDPG algorithm, which builds on the existing DPG algorithm from classic RL theory. The main idea is to learn a deterministic, or nearly deterministic, policy for situations where the environment is very sensitive to suboptimal actions and one action usually dominates in each state. DDPG showed good performance, but this family of methods could not beat algorithms such as PPO until the ideas in SAC were added. SAC adds an entropy term to the objective; strictly speaking this term rewards, rather than penalizes, randomness in the policy at each state, so SAC's policy is stochastic. With this addition, the off-policy actor-critic approach that DDPG started performs well.
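
  As a rough illustration of the deterministic policy gradient idea in this note, here is a minimal sketch. It assumes a linear policy and a known quadratic critic, neither of which comes from the paper; the names `dpg_step` and `target_w` are hypothetical. DDPG itself learns both the actor and the critic with neural networks and adds a replay buffer and target networks.

```python
import numpy as np

def dpg_step(theta, states, target_w, lr=0.1):
    """One gradient-ascent step on a deterministic linear policy.

    Policy (assumed linear):  mu(s) = s @ theta, a single scalar action.
    Critic (assumed known):   Q(s, a) = -(a - s @ target_w) ** 2.
    """
    actions = states @ theta                      # a = mu_theta(s), no sampling
    dq_da = -2.0 * (actions - states @ target_w)  # dQ/da evaluated at a = mu(s)
    # Chain rule: grad_theta J = E[grad_theta mu(s) * dQ/da]; here grad_theta mu(s) = s.
    grad = (states * dq_da[:, None]).mean(axis=0)
    return theta + lr * grad

rng = np.random.default_rng(0)
states = rng.normal(size=(256, 3))      # a fixed batch of toy states
target_w = np.array([0.5, -1.0, 2.0])   # defines the best action in each state
theta = np.zeros(3)
for _ in range(200):
    theta = dpg_step(theta, states, target_w)
```

  The update never samples an action: the policy output is pushed directly along the critic's action gradient, which is what separates this from stochastic policy gradient methods such as PPO.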
  
  ddpg reinforcement-learning SAC DPG PPO

URL

arxiv.org/pdf/1509.02971