5 Matching Annotations
  1. Last 7 days
    1. The company says it has only seen evidence of this kind of jailbreak being used to find 'minor' and 'relatively simple' software vulnerabilities

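      Most people assume that security vulnerabilities in AI models can lead to serious consequences, but the author points out that the so-called 'jailbreak' Anthropic identified can only find 'minor' and 'relatively simple' software vulnerabilities. This challenges the government's assessment of how severe the model's security threat is and implies that the government is overreacting.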

    1. The potential jailbreaks that have been disclosed to us are either entirely benign responses or are minor findings that provide no Mythos-specific uplift.

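      Most people assume that AI model vulnerabilities discovered by the government must be serious security threats, but the author argues that the potential jailbreaks disclosed are either entirely benign responses or minor findings that provide no Mythos-specific uplift. This challenges the government's prevailing view of how severe AI security threats are.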
