2 Matching Annotations
  1. Last 7 days
    1. A gameplay clip is not merely pixels. It is pixels plus choices.

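      An extremely incisive summary of the next data bottleneck for embodied intelligence. Language models are trained on internet text but lack an understanding of cause and effect in the physical world. Gameplay video captures the complete "perception-decision-feedback" loop, and this kind of action-labeled data may become the key pretraining foundation that lets the next generation of large models break through on generality.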

  2. May 2026
    1. Gemini Robotics Perceive, reason, use tools and interact

      Listing 'use tools' alongside core cognitive functions like 'perceive' and 'reason' points to an architectural focus on embodied AI. It suggests the model is being designed with a direct path to physical agency, a non-obvious but critical distinction from models that only perceive and reason.