4 Matching Annotations
  1. Nov 2022
    1. Extractive summarization may be regarded as a contextual bandit as follows. Each document is a context, and each ordered subset of a document's sentences is a different action

      We can represent extractive summarization as a contextual bandit by treating each document as the context and each ordered subset of its sentences as an action the agent could take.

    2. A bandit is a decision-making formalization in which an agent repeatedly chooses one of several actions, and receives a reward based on this choice.

      Definition of a contextual bandit: an agent that repeatedly chooses one of several actions and receives a reward based on this choice (a minimal sketch of this setup follows below).
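
      A minimal sketch of this framing, not taken from the annotated paper: the document is the context, each ordered subset of its sentences is an action, and a toy word-overlap reward plus an epsilon-greedy agent stand in for whatever reward and learning method the paper actually uses.

      ```python
      # Hypothetical sketch: extractive summarization as a contextual bandit.
      # The reward function and epsilon-greedy agent are illustrative assumptions.
      import random
      from itertools import permutations

      def candidate_summaries(sentences, k=2):
          """Each ordered subset of k sentences is one action."""
          return list(permutations(sentences, k))

      def reward(summary, reference):
          """Toy reward: fraction of reference words covered by the summary."""
          summary_words = set(" ".join(summary).lower().split())
          reference_words = set(reference.lower().split())
          return len(summary_words & reference_words) / len(reference_words)

      def epsilon_greedy(values, epsilon=0.1):
          """Explore with probability epsilon, otherwise exploit the best estimate."""
          if random.random() < epsilon:
              return random.randrange(len(values))
          return max(range(len(values)), key=lambda i: values[i])

      # One document = one context; its candidate sentence orderings are the actions.
      document = [
          "Bandit agents trade off exploration and exploitation.",
          "Extractive summaries reuse sentences from the source document.",
          "A reward signal tells the agent which sentences to keep.",
      ]
      reference = "extractive summaries reuse sentences and a reward signal guides the agent"

      actions = candidate_summaries(document, k=2)
      values = [0.0] * len(actions)   # running reward estimate per action
      counts = [0] * len(actions)

      for _ in range(500):            # repeated choices with bandit feedback
          i = epsilon_greedy(values)
          r = reward(actions[i], reference)
          counts[i] += 1
          values[i] += (r - values[i]) / counts[i]   # incremental mean update

      best = max(range(len(actions)), key=lambda i: values[i])
      print("Best-valued summary:", " ".join(actions[best]))
      ```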

  2. Oct 2020
    1. Most people seem to follow one of two strategies - and these strategies come under the umbrella of tree-traversal algorithms in computer science.

      Deciding whether to go deep into one topic or to explore more topics can be seen as a choice between two tree-traversal algorithms: depth-first and breadth-first search, as sketched below.

      This also reminds me of the Explore-Exploit problem in machine learning, which I believe is related to the Multi-Armed Bandit Problem.
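
      A minimal sketch of the two strategies on an invented topic tree (the topics and structure are assumptions for illustration): depth-first reading follows one branch to the bottom before backing out, while breadth-first reading surveys every topic at one level before going deeper.

      ```python
      # Hypothetical topic tree: which subtopics each topic links to.
      from collections import deque

      topics = {
          "machine learning": ["bandits", "deep learning"],
          "bandits": ["explore-exploit", "contextual bandits"],
          "deep learning": ["transformers"],
          "explore-exploit": [],
          "contextual bandits": [],
          "transformers": [],
      }

      def depth_first(root):
          """Go deep into one topic before backing out (stack-based DFS)."""
          order, stack = [], [root]
          while stack:
              topic = stack.pop()
              order.append(topic)
              stack.extend(reversed(topics.get(topic, [])))
          return order

      def breadth_first(root):
          """Survey all topics at one level before going deeper (queue-based BFS)."""
          order, queue = [], deque([root])
          while queue:
              topic = queue.popleft()
              order.append(topic)
              queue.extend(topics.get(topic, []))
          return order

      print("Depth-first:  ", depth_first("machine learning"))
      print("Breadth-first:", breadth_first("machine learning"))
      ```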