4 Matching Annotations
  1. Jun 2026
    1. We have instituted strong safeguards that greatly reduce the likelihood that Fable is misused for tasks related to cybersecurity (among others). In fact, our safeguards are so strong that many users have complained that they are overly broad.

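      This is a self-assessment that needs verification. Anthropic claims its safeguards are very strong, even overly strict, and that claim calls for third-party validation. Knowing the specific content and frequency of the user complaints, and how these safeguards compare with those of other AI models, would help in judging how reliable the claim is.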

    2. We have instituted strong safeguards that greatly reduce the likelihood that Fable is misused for tasks related to cybersecurity (among others). In fact, our safeguards are so strong that many users have complained that they are overly broad.

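      This is an important self-justifying statement about Anthropic's assessment of its own safeguards. Both the effectiveness of these safeguards and the authenticity of the user complaints need to be verified. It is also worth looking more closely at the standards and evaluation methods for AI model safeguards, and at how different stakeholders understand "overly strict".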

  2. Apr 2026
    1. Without our safeguards in place (which we do to measure a model's raw capabilities), only Mythos Preview and Opus 4.7 completed more than half the tasks.

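      Most people assume that advanced AI models will autonomously carry out complex tasks when no safeguards are in place, but the author implies that even the most advanced models struggle to complete most tasks without human guidance. This challenges the common perception of AI autonomy and capability, and suggests that AI may depend on human oversight more than people think.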

  3. Oct 2018