2 Matching Annotations
  1. Oct 2023
    1. (Chen, NeurIPS, 2021) Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, and Mordatch. "Decision Transformer: Reinforcement Learning via Sequence Modeling". arXiv preprint arXiv:2106.01345v2, June 2021.

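      Quickly became a very influential paper, with a new idea for learning generative models of action prediction: supervised sequence modeling over (return-to-go, state, action) tokens taken from demonstration trajectories, not SARSA-style temporal-difference updates. There is no explicit optimization of actions or rewards; instead, the target return is given as an input.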
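
A minimal sketch of the "target reward is an input" idea, assuming undiscounted returns-to-go as in the paper. The function names (`returns_to_go`, `build_sequence`, `next_return_to_go`) and the tuple-based token format are hypothetical, chosen for illustration only; this does not reproduce the authors' code or include the transformer itself.

```python
def returns_to_go(rewards):
    """Suffix sums of rewards: rtg[t] = r[t] + r[t+1] + ... + r[T-1]."""
    rtg = []
    total = 0.0
    # Walk the trajectory backwards, accumulating the remaining return.
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]


def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) per timestep.

    Illustrative token format: a sequence model would be trained to
    predict each action token from the tokens that precede it.
    """
    tokens = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


def next_return_to_go(current_target, reward):
    """At test time, lower the desired return by the reward just received."""
    return current_target - reward
```

For example, `returns_to_go([1.0, 0.0, 2.0])` gives `[3.0, 2.0, 2.0]`. At evaluation time the first return-to-go is not computed from data but chosen by the user as the desired return, then updated with `next_return_to_go` after each step.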

  2. Nov 2022
    1. “The metaphor is that the machine understands what I’m saying and so I’m going to interpret the machine’s responses in that context.”

      Interesting metaphor for why humans are happy to trust outputs from generative models.