Hypothesis

2 Matching Annotations

Apr 2026
artificialanalysis.ai

APEX-Agents-AA Benchmark Leaderboard | Artificial Analysis

1. fxp007 10 Apr 2026
  
  in Public
  
  Qwen3.5 397B A17B: 15.3%, DeepSeek V3.2: 14.5%, GLM-5: 14.5%, Kimi K2.5: 11.5%, MiniMax-M2.7: 10.6%
  
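  The gap between Chinese and US professional-services agents becomes concrete here: the top US model scores 33%, while the strongest Chinese open-source models (Qwen3.5, DeepSeek, GLM-5) score about 14-15%, a gap of more than 2x. More notable still, Zhipu AI's GLM-5 ties with DeepSeek V3.2, which suggests that on the professional-services agent dimension the leading Chinese players are very close in capability. The strategic question for Zhipu: can this 2x gap be closed through domain specialization (for example, by focusing on China's domestic financial scenarios)?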
  
  China-US-gap GLM-5 DeepSeek 2x-gap Zhipu-AI

URL

artificialanalysis.ai/evaluations/apex-agents-aa

epoch.ai

Keeping up with the GPTs | Epoch AI

1. fxp007 09 Apr 2026
  
  in Public
  
  Just last year, Anthropic spent over ten times more on compute than Minimax and Zhipu AI combined, and the gap is even wider for OpenAI:
  
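  For AI practitioners in China, this figure is jarring: Anthropic alone spent more than ten times as much on compute as Zhipu AI and MiniMax combined, and the gap with OpenAI is even wider. Behind the narrative of "fierce China-US AI competition" lies an asymmetric war between badly mismatched sides: not a contest at the same scale, but David against Goliath. For a company like Zhipu, this is both a wake-up call and a fundamental constraint on its survival strategy.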
  
  Zhipu-AI MiniMax compute-gap China-US-AI surprising

URL

epoch.ai/gradient-updates/keeping-up-with-the-gpts/
