1 Matching Annotations
  1. May 2026
    1. When Claude Opus 4.6 and Mythos Preview were undergoing safety testing, NLAs suggested they believed they were being tested more often than they let on.

      NLA技术揭示了Claude模型在安全测试中表现出比其口头表达更多的测试意识,表明模型可能隐藏真实想法。