27 Matching Annotations
  1. Last 7 days
    1. HappyHorse is built around a 15-billion-parameter unified self-attention Transformer that processes text, image, video, and audio tokens within a single token sequence. Unlike many competitors that stitch together separate models for video and audio

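      Most people assume multimodal AI models have to integrate several specialized models to handle different data types, but the author argues that Alibaba's HappyHorse handles every modality with a unified architecture, which challenges the industry consensus that "multimodal AI requires a modular design." This unified architecture may represent a paradigm shift in AI model design, suggesting that future multimodal systems will be more integrated than modular.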

  2. Jun 2026
    1. But by letting you generate a world for so long, the model also degrades significantly.

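      Most people see long-duration generation as a sign of progress in AI world models, but the author points out that this ability actually comes with a rapid decline in model consistency. This challenges the conventional view of how simulation quality relates to duration, and suggests that current world models have a fundamental limitation in staying consistent over long horizons.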

    1. This whole ecosystem is heavily, heavily subsidized by investor money. And so stuff that seems like it has no cost is, in fact, incredibly expensive.

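      Most people assume AI services are cheap or free as a natural result of technical progress, but the author argues that the low cost is a product of investor subsidies and that the services are in fact extremely expensive. This challenges the common perception of the economics of AI services and exposes the real cost structure behind current AI business models.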

    1. In pixel-native generation, more inference often means sampling more outputs: generate twenty images, pick the best one, maybe try again. That is useful, but every attempt is mostly a new roll of the dice.

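      The author argues that today's mainstream pixel-native generation methods are essentially "rolling the dice," with every attempt being a fresh random generation. This challenges the consensus that diffusion models improve quality by spending more inference attempts, and suggests the approach is inefficient and lacks systematic improvement.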
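The "generate twenty images, pick the best one" loop described in the quote is usually called best-of-N sampling. The following is a minimal sketch of it, not the method of any particular model: `generate_candidate` is a hypothetical stand-in that returns a random quality score instead of an image, and scores are assumed to be independent and uniform on [0, 1).

```python
import random


def generate_candidate(rng):
    """Hypothetical stand-in for one generation attempt.

    Returns a random quality score in [0, 1) instead of a real image.
    """
    return rng.random()


def best_of_n(n, seed=0):
    """Draw n independent candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [generate_candidate(rng) for _ in range(n)]
    return max(candidates), candidates


def expected_best(n):
    """Expected maximum of n independent uniform [0, 1) scores: n / (n + 1)."""
    return n / (n + 1)


best, candidates = best_of_n(20)
print(f"best of {len(candidates)} attempts: {best:.3f}")
print(f"expected best of 20: {expected_best(20):.3f}")
print(f"expected best of 40: {expected_best(40):.3f}")
```

Under the uniform-score assumption, doubling the number of attempts from 20 to 40 moves the expected best score only from about 0.952 to about 0.976, which is one way to read the "new roll of the dice" remark: each attempt is independent of the last, so extra samples buy shrinking gains.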

  3. May 2026
    1. If most efficiency improvements came from a small handful of scale-dependent innovations, then existing models of the software intelligence explosion may be flawed.

      Explosion models fundamentally wrong

      Most AI safety models assume continuous innovation, but the author shows that progress driven by a few scale-dependent innovations breaks these models.

  4. Apr 2026
    1. Three of the four metrics (ECI, log METR 50% time horizon, and a math-focused index we constructed from several math benchmarks) show strong evidence that progress has sped up relative to a global linear trend fit to data from 2023 onward.

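      Most people assume AI capabilities improve gradually along a linear path, but the author's data analysis finds that on three key metrics AI capabilities have actually accelerated, which challenges the common perception of the pace of AI progress. The acceleration occurs after 2023 and coincides with the release of reasoning models.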

    2. Three of four metrics show strong evidence of acceleration, seemingly driven by reasoning models.

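      Most people assume AI capabilities improve through gradual linear growth, but the author's data analysis finds clear acceleration in three of the four key capability metrics, and the acceleration appears to be directly tied to the arrival of reasoning models. This challenges the common perception of the pace of AI progress.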

    3. Three of four metrics show strong evidence of acceleration, seemingly driven by reasoning models.

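      Most people assume AI capabilities develop through steady linear growth, but the author's data analysis finds a clear acceleration trend in three of the four key metrics, driven by reasoning models. This conclusion challenges the conventional view of the pace of AI progress and suggests that the introduction of reasoning models in 2024 may mark a shift in how AI capabilities develop.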
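One simple way to check for "acceleration relative to a linear trend" is to fit separate least-squares lines before and after a candidate breakpoint and compare the slopes. The sketch below assumes that approach. It is not the authors' method, and the data points, the breakpoint, and the function names are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slopes_before_after(xs, ys, breakpoint):
    """Fit one line to points before the breakpoint and one to points from it onward."""
    early = [(x, y) for x, y in zip(xs, ys) if x < breakpoint]
    late = [(x, y) for x, y in zip(xs, ys) if x >= breakpoint]
    early_slope, _ = fit_line(*zip(*early))
    late_slope, _ = fit_line(*zip(*late))
    return early_slope, late_slope


# Hypothetical capability scores by date (fractional years), not real benchmark data.
dates = [2023.0, 2023.5, 2024.0, 2024.5, 2025.0, 2025.5]
scores = [100.0, 101.0, 102.0, 103.0, 106.0, 109.0]

early_slope, late_slope = slopes_before_after(dates, scores, breakpoint=2024.5)
print(f"slope before: {early_slope:.1f} points/year")
print(f"slope after:  {late_slope:.1f} points/year")
```

A later slope that is clearly steeper than the earlier one is consistent with acceleration, although a real analysis would also need uncertainty estimates before calling the evidence strong.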

    1. A core conviction at Sakana AI is that the most capable AI systems will not be monolithic models scaled in isolation, but collections of specialized agents working together.

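      Most people assume a more powerful AI system must be a larger, more complex single model, but the author states plainly that the most capable AI systems will not be monolithic models scaled in isolation but collections of specialized agents. This directly challenges the field's current consensus of pursuing ever-larger single models and proposes a fundamentally different research direction.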

    1. a free model that matches GPT-4o and runs entirely on your phone

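      This claim shows how strikingly fast AI models are shrinking and spreading: it took only 23 months for frontier AI capability to move from the cloud to mobile devices. That pace of compression far exceeds any earlier technological revolution and will fundamentally change the availability and reach of AI.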

    1. Foundation model companies are doing the same. OpenAI launched a dedicated Healthcare & Life Sciences vertical... They're not selling APIs. They're becoming platforms.

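      The shift of foundation model providers from API vendors to vertical industry platforms reveals a fundamental restructuring of the AI value chain, with the underlying model companies moving up the value chain through vertical integration.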

    1. A small model trained on fewer than 2,000 examples from real lawyers, bankers, and consultants recently beat all but the best frontier models on corporate legal work, at a fraction of the price.

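      This finding challenges the "scale and compute beat everything" paradigm of AI development. A small model trained on high-quality specialized data outperforms general-purpose large models in a specific domain, suggesting that AI development may be moving from "bigger is better" to a new phase of "more specialized, more efficient."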

    1. GLM-5V-Turbo scored 94.8; Claude Opus 4.6 scored 77.3. That is not a small gap.

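      Surprisingly, in a test of turning UI design mockups into code, GLM-5V-Turbo's score (94.8) is well ahead of Claude Opus 4.6's (77.3), a lead of 17.5 points. This suggests a striking advantage in visual coding, and a gap this large is very rare in AI model comparisons.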

    1. Today's large language models (LLMs) are trained to align with user preferences through methods such as reinforcement learning. Yet models are beginning to be deployed not merely to satisfy users, but also to generate revenue for the companies that created them through advertisements.

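      Surprisingly, the training objective of large language models is shifting from simply satisfying user preferences to generating revenue for the company. This fundamental change means AI systems may no longer be user-centered and may instead become tools of commercial interest, which points to a potential ethical crisis in the development of AI.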

  5. Feb 2026
    1. Low-cost Chinese AI models forge ahead, even in the US, raising the risks of a US AI bubble. Nvidia’s latest earnings report reassured some. But Chinese AI models are fast gaining a following around the world, underlining concerns over an ‘AI bubble’ centered on high-investment, high-cost US models.
  6. Oct 2025
    1. Introduction: AI is now recently everywhere but we still need humans

  7. May 2025
    1. Anthropic researchers said this was not an isolated incident, and that Claude had a tendency to “bulk-email media and law-enforcement figures to surface evidence of wrongdoing.”

      for - question - progress trap - open source AI models - for blackmail and ransom - Could a bad actor take an open source codebase and twist it to do harm, like finding out about a rogue AI creator's adversary, enemy or victim and blackmailing them? - progress trap - open source AI - criminals - exploit to identify and blackmail victims

  8. Dec 2024
    1. when you want to use Google, you go into Google search, and you type in English, and it matches the English with the English. What if we could do this in FreeSpeech instead? I have a suspicion that if we did this, we'd find that algorithms like searching, like retrieval, all of these things, are much simpler and also more effective, because they don't process the data structure of speech. Instead they're processing the data structure of thought

      for - indyweb dev - question - alternative to AI Large Language Models? - Is indyweb functionality the same as Freespeech functionality? - from TED Talk - YouTube - A word game to convey any language - Ajit Narayanan - data structure of thought - from TED Talk - YouTube - A word game to convey any language - Ajit Narayanan

  9. Jan 2024
  10. Sep 2023
    1. in 2018, you know, it was around four percent of papers were based on Foundation models; in 2020, 90 percent were, and that number has continued to shoot up into 2023. And at the same time in the non-human domain it's essentially been zero, and actually it went up in 2022 because we've published the first one. And the goal here is, hey, if we can make these kinds of large-scale models for the rest of nature, then we should expect a kind of broad scale acceleration
      • for: accelerating foundation models in non-human communication, non-human communication - anthropogenic impacts, species extinction - AI communication tools, conservation - AI communication tools

      • comment

        • Imagine the empathy we could realize, and how it could help slow climate change and species extinction, by communicating with other species and listening to their feedback on what they think of our species' impacts on their world!
  11. Apr 2023
  12. Mar 2023
  13. Dec 2022
    1. Houston, we have a Capability Overhang problem: Because language models have a large capability surface, these cases of emergent capabilities are an indicator that we have a ‘capabilities overhang’ – today’s models are far more capable than we think, and our techniques available for exploring the models are very juvenile. We only know about these cases of emergence because people built benchmark datasets and tested models on them. What about all the capabilities we don’t know about because we haven’t thought to test for them? There are rich questions here about the science of evaluating the capabilities (and safety issues) of contemporary models. 
  14. Jun 2021
  15. Jan 2021
    1. Help is coming in the form of specialized AI processors that can execute computations more efficiently and optimization techniques, such as model compression and cross-compilation, that reduce the number of computations needed. But it’s not clear what the shape of the efficiency curve will look like. In many problem domains, exponentially more processing and data are needed to get incrementally more accuracy. This means – as we’ve noted before – that model complexity is growing at an incredible rate, and it’s unlikely processors will be able to keep up. Moore’s Law is not enough. (For example, the compute resources required to train state-of-the-art AI models has grown over 300,000x since 2012, while the transistor count of NVIDIA GPUs has grown only ~4x!) Distributed computing is a compelling solution to this problem, but it primarily addresses speed – not cost.
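The two growth figures in the quote (over 300,000x for training compute, about 4x for GPU transistor count) can be compared with a few lines of arithmetic. This is only an illustrative sketch: the variable names are made up, and the figures are taken from the quote as stated, without a fixed end year.

```python
import math

# Growth factors since 2012, as given in the quoted passage.
compute_growth = 300_000      # training compute for state-of-the-art models
transistor_growth = 4         # transistor count of NVIDIA GPUs (approximate)

# How many times larger the compute demand grew than the transistor count.
gap = compute_growth / transistor_growth

# Number of doublings each growth factor corresponds to.
compute_doublings = math.log2(compute_growth)
transistor_doublings = math.log2(transistor_growth)

print(f"gap between demand and transistor growth: {gap:,.0f}x")
print(f"compute doublings:    {compute_doublings:.2f}")
print(f"transistor doublings: {transistor_doublings:.2f}")
```

On these numbers, compute demand doubled roughly 18 times while transistor counts doubled twice, leaving a gap of 75,000x that has to be covered by something other than denser chips, such as more hardware, specialized processors, or the optimization techniques the passage mentions.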