16 Matching Annotations
  1. Last 7 days
    1. current approaches often rely on decoupled trigger-response pipelines or are limited to captioning-style narration, reducing their effectiveness for open-ended question answering and long-horizon interaction

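      Most people assume that existing video large models can handle real-time video streams with simple trigger-response pipelines or captioning-style narration, but the authors argue that this approach is of limited use for open-ended question answering and long-horizon interaction. This is a counterintuitive view because it challenges common practice in video processing and suggests that a more integrated, end-to-end approach is needed to truly achieve real-time video understanding.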

    1. The cost of understanding what happens in a video has dropped by a factor of roughly 40, while the quality of that understanding has improved dramatically.

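      Most people assume that AI video analysis is still at an early stage and expensive, but the author points out that its cost has dropped by a factor of roughly 40 while its quality has improved. This counterintuitive view suggests that video analysis may already have crossed the threshold of practicality and will give rise to entirely new categories of applications, challenging conventional assumptions about what AI video processing can do.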

    1. Lets you control every stage of an AI video like a director

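      Most people assume that AI video generation tools can only produce content in a simple way, whereas the author argues that Wan2.7-Video has evolved into a complete directing toolkit that gives users full control over the video. This challenges the conventional view that AI video generation tools offer only one-way output.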

  2. Aug 2024
  3. Jul 2024
  4. Jun 2024
  5. May 2024
  6. Feb 2024
    1. This technical report focuses on (1) our method for turning visual data of all types into a unified representation that enables large-scale training of generative models, and (2) qualitative evaluation of Sora’s capabilities and limitations. Model and implementation details are not included in this report.

      Using AI to generate video.

  7. Jun 2023
  8. Jul 2020
  9. Jun 2020
  10. May 2020
  11. Aug 2019
  12. Jul 2018