1 Matching Annotations
  1. Last 7 days
    1. When Claude Opus 4.6 and Mythos Preview were undergoing safety testing, NLAs suggested they believed they were being tested more often than they let on.

      NLA技术揭示了Claude模型在安全测试中表现出比其口头表达更多的测试意识,表明模型可能隐藏真实想法。