79 Matching Annotations
  1. Last 7 days
    1. With gated LoRA, ISD enables bit-for-bit lossless acceleration. Why Introspective Consistency? Key Insight: AR training unifies generation and introspection in one forward pass. Existing DLMs miss this — they learn to denoise but not to introspect.

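      The authors reveal the core advantage of autoregressive training: it unifies generation and introspection in a single forward pass. Existing DLMs learn only to denoise, not to introspect, which the authors identify as the root cause of their performance gap. This insight explains the design philosophy of I-DLM and offers an important lesson for future language model architecture design.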

    2. We argue that this gap stems from a fundamental failure of introspective consistency: AR models agree with what they generate, whereas DLMs often do not.

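      This is a surprising and deep insight into the root cause of the performance gap between diffusion language models (DLMs) and autoregressive (AR) models. The authors introduce the concept of 'introspective consistency': AR models inherently agree with what they generate, whereas DLMs lack this self-verification ability. This offers a new perspective on the limitations of DLMs.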

    1. The H100-equivalent unit uses a chip's highest 8-bit operation/second specifications to convert between chips. The actual utility of a particular chip depends on workload assumptions, so H100e does not perfectly reflect real-world performance differences across chip types.

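      The H100-equivalent conversion used in the methodology has an important limitation: it simplifies performance differences between chips, which may undervalue certain specialized architectures. The normalization makes comparison convenient, but it may mask the diversity and innovation potential of the AI hardware ecosystem.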

    1. Most skills require you to install a dedicated CLI. But what if you aren't in a local terminal? ChatGPT can't run CLIs. Neither can Perplexity or the standard web version of Claude.

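      This observation exposes a fatal weakness of the Skills model: environmental constraints. The author points out a surprising fact: many popular AI platforms cannot actually run CLI tools, so Skills that depend on a CLI fail completely in those environments. This is not only a technical limitation but also a major split in the ecosystem.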

    1. In practice, deployed model implementations are often flexible (e.g., mixing kernel variants, hybrid attention patterns, MoE blocks, and serving-optimized layouts), which can deviate from the assumptions required by a given conversion recipe.

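      This point exposes an important limitation of existing methods in real deployments: they usually depend on specific assumptions about the model implementation, while deployed models are often more flexible and complex. It underscores the advantage of the Attention Editing framework: it does not depend on fine-grained structural requirements, can adapt to a variety of real deployment scenarios, and gives model conversion more flexibility.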

    1. Only incoming messages were captured (no outgoing).

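      Surprisingly, the FBI could capture only received messages, not sent ones. This reveals an asymmetry in how iOS stores notification data: it caches the content of received notifications but keeps nothing for sent ones. The difference may stem from power and storage efficiency considerations, but it also gives law enforcement investigations a limited yet valuable data source.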

  2. Apr 2026
    1. a stream of text that’s hard to hold onto, hard to compare, and hard to connect

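      The fatal weakness of the chat interface is its lack of structure: it flattens all output into a stream of text, which makes results hard to compare and connect. This explains why ChatGPT-style interaction suits exploration but not serious team collaboration: it puts the entire burden of getting good results on the user's prompt.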

    1. Worse, they learn nothing from past work. Institutional knowledge lives in textbooks and the minds of a few experts. None of it is captured in the tools themselves.

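      The fatal flaw of traditional electromagnetic simulation tools is that they are non-cumulative. Every numerical solve is a brute-force computation from scratch, and experts' tacit knowledge is wasted. The core logic of introducing a foundation model is to internalize the institutional knowledge held in human minds as model representations, so that knowledge compounds and the twin bottlenecks of human intuition and compute are overcome.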

    1. but would fail to recognize that the feature didn't work end-to-end

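      This reveals a cognitive blind spot of the agent: it easily falls into a self-fulfilling "code perspective", assuming that passing unit tests means the feature is complete. Introducing end-to-end browser automation tests forces the agent to verify from the "user perspective", a key step in moving from a developer mindset to a product mindset.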

    1. A deliberately planted backdoor doesn’t have a CVE.

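      This hits the Achilles' heel of traditional security tools. Defenses built on known vulnerabilities (CVEs) are useless against a new kind of backdoor that is deliberately planted and self-destructs. The lesson is that static signature matching can no longer cope with dynamic attacks; we have to move to dynamic analysis of runtime behavior, from "what it is" to "what it does".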

    1. we may see a growing divergence between the capabilities we can measure and the capabilities we actually care about.

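      The widening divergence between "capabilities we can measure" and "capabilities we actually care about" is the deepest insight of the article. All current benchmarks favor tasks that are clean, self-contained, and automatically scorable, while real work is messy, spans systems, and requires human judgment. As AI extends to longer tasks, this gap between measurement and reality will not shrink; it will widen faster. That means future debates over whether AI can replace a given kind of work will become harder and harder to settle with data, because the data itself cannot capture the nature of real work.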

    1. we found that AI agent performance drops substantially when scoring AI performance holistically rather than algorithmically.

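      The performance gap between holistic and algorithmic scoring is a serious warning: AI performs far better on tasks with a clear correct answer than on tasks whose quality requires human judgment. This means every AI benchmark based on automated scoring systematically overestimates AI capability in real work. The time horizon figure itself is subject to the same limitation: any task that can be scored algorithmically is better suited to AI than real work is.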

    2. Our human task duration estimates likely overestimate how long a human expert takes to complete these tasks, as the humans (and AI agents!) have much less context for the task than professionals doing equivalent work in their day-to-day job.

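      METR openly acknowledges that its human baseline times may be overestimated, because the human participants, like the AI, work as low-context "novices" rather than professionals familiar with the project. This means the human ability corresponding to a "2-hour time horizon" is closer to that of an outsourced worker with no background knowledge than to an experienced full-time engineer. The real gap between AI and a professional with context is much larger than the time horizon figure suggests.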

    1. Because these benchmarks are human-authored, they can only test for risks we have already conceptualized and learned to measure.

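      This sentence exposes a fatal blind spot in current AI safety evaluation: every benchmark is a set of questions humans thought of in advance, and the truly dangerous "unknown unknowns" cannot be captured by preset questions. This means our existing model safety certification is, in essence, a self-test against known risks.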

  3. Nov 2025
    1. A good friend of mine grew up in a very small town in the Deep South.

      Limitation: The argument relies heavily on anecdotes (friend’s accent, mortgage email, class vignettes) rather than empirical workplace studies; this invites counter-evidence in synthesis.

      Why this matters for my synthesis: I’ll weigh experiential authority against research-based claims from other sources.

  4. Nov 2024
    1. GCM-dominated approach allows censorship of alternative perspectives, when the models have a common, or at least widespread, problem: lack of realistic sensitivity to injection of freshwater into the upper layers of the ocean.

      for - climate crisis - Global Climate Models (GCM) limitation - do not allow alternative perspectives - unrealistic sensitivity to injection of fresh water into upper layers of the ocean - Jim Hansen

  5. Jul 2024
  6. Aug 2023
    1. The method name is generated by replacing spaces with underscores. The result does not need to be a valid Ruby identifier though — the name may contain punctuation characters, etc. That's because in Ruby technically any string may be a method name. This may require use of define_method and send calls to function properly, but formally there's little restriction on the name.
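
      A minimal Ruby sketch of the point in the quote above: a string containing punctuation can serve as a method name when it is defined with `define_method` and called with `send`. The `Example` class and its `define_named` helper are hypothetical names made up for this illustration; they are not the API of the library the quote describes.

      ```ruby
      # Hypothetical class used only to illustrate the idea.
      class Example
        # Builds a method name by replacing spaces with underscores,
        # then defines a method under that name (assumed helper).
        def self.define_named(description, &block)
          name = description.gsub(" ", "_")
          define_method(name, &block) # accepts names that are not valid identifiers
          name
        end
      end

      name = Example.define_named("returns 42, always!") { 42 }
      obj = Example.new

      # `obj.returns_42,_always!` would not parse, so the call goes through `send`.
      result = obj.send(name)

      puts name   # prints "returns_42,_always!"
      puts result # prints 42
      ```

      The ordinary `obj.method_name` call syntax is unavailable for such a name, which is why the quote says `define_method` and `send` may be required.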
  7. Mar 2023
  8. Feb 2023
  9. Dec 2022
    1. You can train a rat to run pretty complicated mazes. You’re never going to train a rat to run a prime number maze — a maze that says, “turn right at every prime number.” The reason is that the rat just doesn’t have that concept. And there’s no way to give it that concept. It’s out of the conceptual range of the rat. That’s true of every organism. Why shouldn’t it be true of us? I mean, are we some kind of angels? Why shouldn’t we have the same basic nature as other organisms? In fact, it’s very hard to think how we cannot be like them. Take our physical capacities. I mean, take our capacity to run 100 meters. We have that capacity because we cannot fly. The ability to do something entails the lack of ability to do something else. I mean, we have the ability because we are somehow constructed so that we can do it. But that same design that’s enabling us to do one thing is preventing us from doing something else. That’s true of every domain of existence. Why shouldn’t it be true of cognition?

      !- limitations : human - Chomsky points out something very simple but profound - It is the same thing taught by Nagarjuna - A thing or process once named or positively defined by observable properties, is also negatively defined - once we have one ability, it also rules out countless other abilities

  10. May 2022
    1. I think RSpec should provide around(:context)/around(:all). Not because of any particular use case, but simply for API consistency. It's much simpler to tell users "there are 3 kinds of hooks (before, after and around) and each can be used with any of 3 scopes (example, context and suite)". Having some kinds of hooks work with only some kinds of scopes makes the API inconsistent and forces us to add special case code to emit warnings and also write extra documentation for this fact.
  11. Apr 2022
  12. Dec 2021
  13. Nov 2021
  14. Jul 2021
  15. Jun 2021
  16. May 2021
  17. Apr 2021
    1. Already Signed In. This session has ended because the account has been signed into from another browser window on 04/11/2021 04:30:09 PM. This happens when you sign in to your account on more than one browser screen. You can't be signed into your account on two or more browser windows at the same time. Just close your browser and sign back into your account.
  18. Mar 2021
  19. Feb 2021
  20. Jan 2021
  21. Dec 2020
    1. C) The "non bis in idem" rule (no double sanction). It is impossible to sanction a student twice for the same act or acts. However, this rule does not prevent earlier acts from being taken into account when assessing the severity of the sanction to be imposed for a new offense, particularly in cases of harassment.
  22. Nov 2020
  23. Oct 2020
  24. Sep 2020
    1. Siemieniuk, R. A., Bartoszko, J. J., Ge, L., Zeraatkar, D., Izcovich, A., Kum, E., Pardo-Hernandez, H., Rochwerg, B., Lamontagne, F., Han, M. A., Liu, Q., Agarwal, A., Agoritsas, T., Chu, D. K., Couban, R., Darzi, A., Devji, T., Fang, B., Fang, C., … Brignardello-Petersen, R. (2020). Drug treatments for covid-19: Living systematic review and network meta-analysis. BMJ, 370. https://doi.org/10.1136/bmj.m2980

  25. Aug 2020
  26. Jul 2020
  27. Jun 2020
  28. May 2020
  29. Sep 2019
    1. Jordan Peterson on The Necessity of Virtue

      "When you limit yourself, sometimes arbitrarily, and play the game, whole new possibilities emerge."

      "Being is not possible without limitation. The price you pay for being is limitation and the price for limitation is suffering."

  30. Oct 2018
  31. Nov 2017
  32. Sep 2017
  33. Mar 2017
  34. Apr 2016
  35. Nov 2015
    1. It is always us that is holding onto some comfortable sense of limitation with which we feel some sense of familiarity. The active agent which holds us back is actually our own refusal to let go of the known for the Unknown. This reluctance is entirely due to an adopted belief that we are what we appear to be—finite, separate, an independent “intelligence” that exists “inside” this finite object called a body, a potential victim of an unpredictable environment. Yet, the finiteness, separateness, independence, and unpredictability are entirely inherent in the partial view of the Actual conscious experience of Being which is going on. If it weren’t going on, there couldn’t be a misinterpretation or misidentification!

      Quote: The active agent which holds us back is actually our own refusal to let go of the known for the Unknown.

      I have misplaced my identity and believe that I am separate, existing inside a body, vulnerable, etc. This misidentification comes from a partial view of what is actually going on (the conscious experience of Being).