3 Matching Annotations
  1. Last 7 days
    1. we conduct a systematic evaluation of multiple reasoning models across three scenarios: (1) problems augmented with lengthy, irrelevant context; (2) multi-turn conversational settings with independent tasks; and (3) problems presented as a subtask within a complex task.

      三个测试场景的设计极具现实针对性:场景一对应「RAG 检索塞入大量背景文档」,场景二对应「多轮对话历史积累」,场景三对应「Agent 工作流中的子任务分解」。这三个场景恰好覆盖了当前 AI 产品的主流部署模式——这篇论文实际上是在说:我们正在大规模生产的所有 AI 产品,都可能在不知情的情况下运行着推理能力受损的模型。

  2. Mar 2019
    1. This is associated with the e-learning development tool "Articulate Storyline." There are frequent blog posts and they are not limited to or exclusive to the Articulate products. Posts are brief and not all of the content will be new, but there are worthwhile tips to be had and they combine theory (not to the extent that an academic would) with practice. rating 3/5

  3. Jun 2015