6 Matching Annotations
  1. Last 7 days
    1. reasoning models tend to produce much shorter reasoning traces (up to 50%) for the same problem under different context conditions compared to the traces produced when the problem is presented in isolation.

      令人震惊的发现:同一道题,仅仅因为周围塞入了无关上下文,推理模型的思考链长度就缩短了最多 50%——而题目本身一字未改。这意味着我们以为在评估模型「解题能力」,实际上评估的是「在特定上下文包装下的解题能力」。所有在孤立问题上测得的推理 benchmark,都可能严重高估了模型在真实 Agent 场景中的实际推理深度。

  2. Dec 2021
    1. Nature Portfolio. (2021, December 8). Estimates of COVID-19 vaccine uptake in the US based on large surveys that are used to guide policy-making decisions tend to overestimate the number of vaccinated individuals, according to research published in @Nature. Https://go.nature.com/3EBQPOh https://t.co/rSoclzWIdg [Tweet]. @NaturePortfolio. https://twitter.com/NaturePortfolio/status/1468633979364560899

  3. Jun 2021
  4. May 2021
    1. Dr Ellie Murray. (2021, May 7). I’m seeing a lot of “these people are over-estimating risk” chatter that doesn’t acknowledge that the probability you die if you get covid is always less than the probability anyone dies if you get covid. It’s not “over-estimation” to consider community impacts. [Tweet]. @EpiEllie. https://twitter.com/EpiEllie/status/1390792624777334797

  5. Mar 2021