43 Matching Annotations
  1. May 2026
    1. Filesystem controls were another important architectural choice. We found that offering different file-mount modes helps to granularly control risk; Claude Cowork offers read-only, read-write, and read-write-no-delete.

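      Suggested action: Implement fine-grained filesystem access controls and offer several mount modes (such as read-only, read-write, and read-write-no-delete) to control risk precisely. In enterprise environments, also implement path allowlists managed through MDM settings, and prevent boundary escapes through mechanisms such as symlinks.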
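      A minimal sketch of how the three mount modes and the symlink check could be expressed as a policy function. The names `MountMode`, `ALLOWED_OPS`, and `is_allowed` are hypothetical and are not taken from Claude Cowork; a real product would enforce this at the OS or VM layer, not in application code.

```python
from enum import Enum
from pathlib import Path


class MountMode(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"
    READ_WRITE_NO_DELETE = "read-write-no-delete"


# Operations each mode permits (operation names are assumptions for this sketch).
ALLOWED_OPS = {
    MountMode.READ_ONLY: {"read"},
    MountMode.READ_WRITE: {"read", "write", "delete"},
    MountMode.READ_WRITE_NO_DELETE: {"read", "write"},
}


def is_allowed(op: str, target: str, mount_root: str, mode: MountMode) -> bool:
    """Return True if `op` on `target` is permitted under the given mount."""
    root = Path(mount_root).resolve()
    # resolve() follows symlinks, so a link that points outside the mount
    # resolves to a path that fails the containment check below.
    resolved = Path(target).resolve()
    if resolved != root and root not in resolved.parents:
        return False
    return op in ALLOWED_OPS[mode]
```

      The containment check runs before the mode check, so a path that escapes the mount is refused regardless of mode.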

    2. Remote versus local is more important than it seems. A locally installed tool is auditable. You can read the code, pin the version, and know it won't change under you.

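      Suggested action: Prefer locally installed tools over remote ones, since local tools are easier to audit. Treat any remote tool you must use (such as a hosted MCP server) as an untrusted component, and test it first in an isolated environment with mock data to limit the blast radius of a malicious tool.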

    3. Match isolation strength to the user's capacity for oversight. A developer who can read bash and a knowledge worker who can't are not running the same threat model.

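      Suggested action: Adjust isolation strength to the user's technical ability. Give technical users (such as developers) permission controls that call for expert judgment, and give non-technical users absolute, always-on boundaries. Matching controls to user capability avoids security failures caused by either excessive trust or excessive friction.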

    4. Design for containment at the environment layer first, then steer behavior at the model layer.

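      Suggested action: Design containment at the environment layer first to establish deterministic boundaries, then steer behavior at the model layer. The deterministic boundary at the environment layer is the last line of defense when every probabilistic defense at the model layer fails, which makes it a key strategy for scenarios such as data exfiltration.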

    5. When building containment and defense systems, we apply defenses to three main components: the environment in which the agent runs, the model the agent consults, and the external content the agent can reach.

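      Suggested action: Build layered defenses that cover the runtime environment, the model itself, and external content. Set hard boundaries at the environment layer, steer behavior at the model layer with prompts and classifiers, and restrict tool permissions at the external-content layer. Overlapping defenses handle different kinds of attack vectors.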

    6. Rather than supervising what the agent does, we supervise what it's _able_ to do by enforcing access boundaries through, for example, sandboxes, virtual machines, and egress controls.

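      Suggested action: Enforce environment-layer boundaries for AI agent systems, using sandboxes, virtual machines, and egress controls to limit what the agent can access rather than relying only on behavioral supervision. This limits the damage an agent can do even if model-layer defenses fail.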
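      As an illustration of the egress-control idea, here is a minimal sketch of an allowlist decision. The host names and the `egress_allowed` function are hypothetical; in practice the decision would be enforced by a proxy or firewall outside the agent's process, and this only shows the policy logic.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from managed config.
ALLOWED_HOSTS = frozenset({"api.example.com", "docs.example.com"})


def egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS requests to an exactly matching allowed host."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # hostname strips userinfo and port and is lowercased by urlsplit.
    host = parts.hostname or ""
    # Exact match only: suffix tricks such as api.example.com.evil.net fail.
    return host in allowed_hosts
```

      Exact host matching is a deliberate choice here: it is stricter than suffix or wildcard matching and denies anything not explicitly listed.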

    1. How This 5x Founder Runs His Startup Solo With AI Agents

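      Suggested action: Study how this 5x founder uses AI agents, build your own agent system, automate repetitive tasks, and focus on core strategic decisions so that a one-person team can operate at scale.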

    2. Watch Ryan demo his exact OpenClaw, Codex, and Devin setup that books meetings, runs ads, and ships features while he sleeps

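      Suggested action: Research and test the combination of OpenClaw, Codex, and Devin, and set up automated workflows for meeting scheduling, ad campaigns, and feature development so that AI assistants handle key business tasks outside working hours, giving 24/7 operation.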

    1. Anthropic created MCP to make agent connectivity possible.

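      Suggested action: If you are building an AI application that must integrate with other systems, study and adopt the MCP (Model Context Protocol) standard. It lets your application connect more smoothly to data sources and tools, extends what the agent can do, and improves interoperability.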

    2. Agents are only as capable as the systems they can reach.

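      Suggested action: If you are building an AI agent system, prioritize its connectivity and tool integration. Evaluate which systems and APIs your agent can reach, and make sure it has enough connectors to do its tasks. Designing around connectivity makes the agent considerably more useful in practice.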

    3. Stainless turns an API spec into SDKs across TypeScript, Python, Go, Java, and more.

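      Suggested action: If you are a developer, you can use Stainless to turn your API spec into SDKs for multiple programming languages quickly, which can improve API adoption and developer experience. The aim is a consistent, reliable, native experience for your API in each language.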

    1. Verified skills extend this AI governance to agent capabilities. Runtime controls help govern agent behavior during execution. Verified skills govern capabilities that enter the workflow and become a common way to extend trust agents across coding tools, registries, and enterprise platforms.

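      Suggested action: Treat verified skills as a core part of AI agent governance: control agent behavior at runtime, and also govern the capabilities that enter the workflow. The approach extends across coding tools, registries, and enterprise platforms to establish cross-platform trust.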

    2. For certificate retrieval, supported verification tooling, and example verification commands, see the signing documentation. For example, you can verify a signed skill locally. To do so, follow these steps: (1) Download the NVIDIA Agentic Capabilities root certificate as nv-agent-root-cert.pem. (2) Install an OpenSSF Model Signing (OMS) verifier, such as pip install model-signing. (3) Execute the following command to verify the skill signature

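      Suggested action: Follow the steps in the article: download the NVIDIA Agentic Capabilities root certificate, install an OpenSSF Model Signing verifier, and verify the skill signature with the provided command. This confirms that the skill you downloaded is authentic and untampered, which strengthens trust in AI agent capabilities.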

    3. SkillSpector checks conventional software risks such as vulnerable dependencies, suspicious scripts, dangerous code patterns, credential access, and data exfiltration paths. SkillSpector also checks agent-specific risks, such as hidden instructions, prompt injection, trigger abuse, excessive agency, tool poisoning, and mismatches between a skill's declared purpose, requested access, and bundled behavior.

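      Suggested action: When developing or using AI agent skills, run a security scan with SkillSpector. It checks conventional risks such as dependencies, script patterns, credential access, and data exfiltration paths, as well as agent-specific risks such as hidden instructions, prompt injection, and trigger abuse. This helps identify and mitigate security issues before a skill is deployed.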

    4. To get started with the cuOpt verified skill, for example, follow these steps: 1. Pull the cuOpt verified skill from the catalog: git clone github.com/nvidia/skills && cd skills/skills/cuopt 2. Verify the signature: model_signing verify certificate. --signature skill.oms.sig --certificate-chain nv-agent-root-cert.pem --ignore-unsigned-files 3. Open SKILLCARD.yaml to see ownership, dependencies, license, and verification status.

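      Suggested action: Follow the steps in the article to clone and verify NVIDIA's cuOpt skill, then read the skill card for ownership, dependencies, license, and verification status. This confirms that the skill you use is verified and can be integrated safely into your AI agent workflow.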

    5. NVIDIA-verified agent skills are portable instruction sets that help developers understand, trust, and safely deploy AI agent capabilities by providing transparency, provenance, security scanning, and cryptographic signing.

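      Suggested action: Use NVIDIA-verified agent skills as standard building blocks for AI agent capabilities, and prefer verified skills over unverified ones for transparency and security. These skills are portable across AI agent tools and provide consistent capabilities and security assurances.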

    1. A photo of a scribbled note becomes an interactive to-do list; a paused frame in a travel video becomes a booking link for that cool-looking restaurant.

      These aren't demos—they're previews of how AI will collapse the gap between passive content consumption and active task completion. Every image, video frame, or document becomes a potential action surface. This fundamentally changes what 'content' means.

  2. Apr 2026
    1. this means that existing estimates overstate the returns to software R&D, and makes the software intelligence explosion seem much less likely.

      R&D Returns Overstated

      Accounting for compute bottlenecks suggests that returns to software R&D may be lower than previously estimated, reducing explosion likelihood.

    2. But I think we have enough evidence to think that software progress might really be several times a year, and to make a best guess contextualized with a lot of uncertainty.

      Progress Estimation

      Despite the uncertainty, the evidence suggests software progress of several times per year, with estimates ranging from 2x to 50x annually.

    3. gpt-oss-20b does substantially better than GPT-3 on MMLU, despite using the same amount of training compute.

      Real-World Progress Example

      Comparing models trained with the same compute but with different performance (such as GPT-3 and gpt-oss-20b) gives concrete evidence of software progress.

    4. This means that almost all existing estimates of software progress were misleading.

      Measurement Problems

      Existing software progress estimates are misleading due to data quality improvements and scale-dependence factors not properly accounted for.

    5. these estimates rely on an overly conservative estimate of software progress of 3× per year

      Progress Underestimation

      Existing software intelligence explosion models may use conservative progress estimates, potentially underestimating explosion likelihood.

    6. Synthetic data can help push beyond this — a good example that Millidge raises is the Phi series of models.

      Synthetic Data Impact

      Synthetic data generation techniques like Phi models can dramatically improve efficiency beyond traditional distillation methods.

    7. If doubling cumulative research effort also doubles compute efficiency, then the returns to R&D are 1. If it quadruples, then the returns are 2.

      R&D Returns Measurement

      Returns to AI software R&D measure how research effort translates into compute efficiency gains; a value above 1 is the threshold for a potential explosion.
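      The two cases in the quote fit a simple log ratio: returns = log(efficiency multiplier) / log(effort multiplier). The sketch below assumes that functional form, which is inferred from the quoted examples and is not stated explicitly in the source; the function name `returns_to_rnd` is hypothetical.

```python
import math


def returns_to_rnd(effort_multiplier: float, efficiency_multiplier: float) -> float:
    """Returns to R&D as the ratio of log efficiency gain to log effort gain.

    Assumes efficiency scales as effort ** r, so r = log(eff) / log(effort).
    """
    if effort_multiplier <= 1 or efficiency_multiplier <= 0:
        raise ValueError("effort multiplier must exceed 1 and efficiency must be positive")
    return math.log(efficiency_multiplier) / math.log(effort_multiplier)
```

      Under this form, doubling effort and doubling efficiency gives returns of 1, and doubling effort while quadrupling efficiency gives returns of 2, matching the quote.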

    8. Almost all the evidence points to very fast software progress: each year, the training compute needed to get to the same capability declines several times — possibly even ten times or more.

      Rapid Efficiency Gains

      Software progress enables 2-10x annual compute efficiency gains, though estimates have wide confidence intervals due to data limitations.
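      To make the compounding concrete, here is a small sketch of how the training compute needed for a fixed capability would shrink if the annual gain stayed constant. The constant-rate assumption and the function name `compute_needed` are mine, not the source's; the source stresses that the real rate is highly uncertain.

```python
def compute_needed(initial_compute: float, annual_gain: float, years: float) -> float:
    """Compute required for the same capability after `years` of software progress.

    Assumes a constant multiplicative efficiency gain per year (illustrative only).
    """
    if annual_gain <= 0:
        raise ValueError("annual_gain must be positive")
    return initial_compute / (annual_gain ** years)
```

      For example, at 3x per year the required compute falls to one ninth after two years, and at 10x per year it falls to one hundredth over the same period.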

    9. AI software progress is about reducing the training compute you need to get to the same level of capability, through better algorithms or data.

      Software Progress Definition

      Software progress means reaching the same AI capabilities with less compute through algorithmic or data improvements, which makes it a key driver of efficiency.

    1. context management plus engineering improvements may well push the task horizon to weeks or even months.

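      Suggested action: Combine context management with engineering improvements to extend the task horizon. This can markedly improve a model's ability to handle long-running tasks.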

    2. if a model cannot learn new things while performing a task, it will struggle when the task horizon grows very long.

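      Suggested action: When evaluating continual learning techniques, focus on the model's ability to learn new things over long task sequences. This criterion is closer to real application needs.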

    3. new techniques may initially underperform existing ones but eventually surpass them — a pattern we've seen repeatedly, most recently in the wave of agentic coding progress

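      Suggested action: Accept that new techniques may underperform at first and surpass existing ones later. Setting expectations this way helps with R&D decisions and resource allocation for continual learning.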

    4. We can treat the task horizon that an LLM can reliably handle as a north-star metric for model progress, analogous to transistor density in Moore's Law

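      Suggested action: Adopt the task horizon as the north-star metric for model progress. A quantitative measure like this helps assess the actual effect and progress of continual learning techniques.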

    5. The key reason for the confusion is that people think in terms of methods that each contribute a discrete piece to the system — pretraining, SFT, RL.

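      Suggested action: Avoid treating continual learning as a collection of separate methods, and focus on its unifying goal. This shift in framing reduces conceptual confusion and makes research more efficient.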

    6. I'd view continual learning more as an "arrow" than a "line" — it's the collective effort to push the task horizon that an LLM can reliably handle.

      Arrow vs Line Perspective

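      Suggested action: View continual learning as a collective effort to push the task horizon rather than as a set of discrete methods. This perspective helps in understanding its directional and systemic nature.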

  3. Oct 2025
    1. I've been thinking about this stuff for decades, and I had not broached the topic of platonic patterns until this year. And that's because I think it is now actionable.

      for - quote - platonic patterns are now actionable - Michael Levin - I've been thinking about this stuff for decades, and I had not broached the topic of platonic patterns until this year. - And that's because I think it is now actionable. - question - progress trap - moral questions and alarm bells? playing God? - Michael Levin

  4. Jan 2021
    1. When there are imperfections, we rely on users and our active community to tell us how the software is not working correctly, so we can fix it. The way we do that, and have done for 15 years now, is via bug reports. Discussion is great, but detailed bug reports are better for letting developers know what’s wrong.
  5. Apr 2020
  6. Feb 2020