3 Matching Annotations
  1. Last 7 days
    1. AI solutions were graded by the official judges, using the same criteria as were applied to human solutions.

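      This description indicates that AI solutions at the 2025 IMO were graded by the same criteria applied to human contestants, an important shift in how AI is evaluated. This data point shows how an existing professional assessment system can be used to build more rigorous benchmarks.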

  2. Sep 2020
  3. Jul 2020