1 Matching Annotations
  1. Last 7 days
    1. Claude Opus 4.7 is the strongest model Hex has evaluated. It correctly reports when data is missing instead of providing plausible-but-incorrect fallbacks, and it resists dissonant-data traps that even Opus 4.6 falls for.

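      This finding points to an important advance in the epistemic honesty of AI models: the model no longer fabricates information just to provide an answer. This honest handling of uncertainty is a key indicator of an AI system's reliability, and it matters more than accuracy alone.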