6 Matching Annotations
  1. Last 7 days
    1. the robustness of these reasoning behaviors remains underexplored

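      "The robustness of these reasoning behaviors remains underexplored": this sentence amounts to a declaration of a collective blind spot across the entire field of reasoning-model research. Over the past two years, test-time compute, long chain-of-thought (CoT), and o1/R1-style reasoning models have attracted enormous attention, yet almost all evaluations have been run in "isolated problem" settings. In real agent deployments, the most basic reliability question, whether reasoning depth can be maintained, only began to be studied systematically with this paper.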

  2. Nov 2021
  3. Sep 2021
    1. At the time of the beginning of the research, very little had been written on middle-aged women; collectively as social scientists we knew next to nothing about the middle years of adult life. We were critical of what little literature existed and were skeptical of widely held assumptions about women of this age.

      The social science literature was missing the experience of middle-aged women. Interrogate empty nest syndrome.

  4. Aug 2021
  5. Jun 2021
  6. May 2021