Hypothesis

9 Matching Annotations

May 2026
jack-clark.net jack-clark.net

Import AI 455: Automating AI Research

1
1. fxp007 15 May 2026
  
  in Public
  
  In 2022, GPT 3.5 could do tasks that might take a person about ~30 seconds. In 2023, this rose to 4 minutes with GPT-4. In 2024, this rose to 40 minutes (o1). In 2025, it reached ~6 hours (GPT 5.2 (High)). In 2026, it has already risen to ~12 hours (Opus 4.6).
  
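  The length of tasks AI systems can complete independently rose from 30 seconds in 2022 to 12 hours in 2026, showing exponential growth in AI's capacity for autonomous work.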
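
As a rough check on the "exponential growth" reading, the figures quoted above can be fitted on a log scale. This is a minimal sketch, assuming one data point per calendar year at even spacing (the 2026 figure is a partial-year value, so the fit is only indicative); the function name `doublings_per_year` is illustrative and not from the source.

```python
import math

# Figures as quoted in the annotated passage:
# year -> human task length in minutes.
horizons_min = {2022: 0.5, 2023: 4, 2024: 40, 2025: 6 * 60, 2026: 12 * 60}

def doublings_per_year(points):
    """Least-squares slope of log2(horizon) against year."""
    xs = list(points)
    ys = [math.log2(points[x]) for x in xs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

rate = doublings_per_year(horizons_min)
doubling_time_months = 12 / rate
print(f"{rate:.2f} doublings/year, doubling time ~{doubling_time_months:.1f} months")
```

Under these assumptions the fit comes out at roughly 2.7 doublings per year, or a doubling time a little over four months.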
  
  capability-scaling time-horizon

Tags

capability-scaling

time-horizon

Annotators

fxp007

URL

jack-clark.net/2026/05/04/import-ai-455-automating-ai-research/
epoch.ai epoch.ai

RIP Classic Reasoning Benchmarks. What's Next? - Epoch AI

1
1. fxp007 07 May 2026
  
  in Public
  
  software engineering tasks which may take humans weeks seem to be within reach for AI systems.
  
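  A time span of weeks suggests AI systems are approaching the ability to handle complex software engineering tasks, which poses a major challenge to traditional short-horizon benchmarks. This data point indicates that benchmarks will need longer evaluation periods.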
  
  data-point software-engineering time-horizon

Tags

software-engineering

data-point

time-horizon

Annotators

fxp007

URL

epoch.ai/gradient-updates/rip-classic-benchmarks
epoch.ai epoch.ai

https://epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job

2
1. fxp007 07 May 2026
  
  in Public
  
  For example, this could bring a five hour (300 minute) time horizon down to a three minute time horizon. But while the time horizons are much shorter, the growth rate is about the same as the METR's main results, with roughly two doublings each year.
  
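  The author notes that time horizons for visual computer-use tasks may be 40-100x shorter than the main results, yet the growth rate is similar, at roughly two doublings per year. This data point exposes how AI capability varies across task domains and the particular difficulty of computer-use tasks, which matters for understanding how uneven the path to AI automation is.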
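
If both curves really do grow at the same rate, a fixed ratio between them translates into a fixed time lag. The sketch below works that out from the quoted numbers (300 minutes versus 3 minutes, two doublings per year); the "lag" framing and the helper name `lag_years` are assumptions added for illustration, not something stated in the source.

```python
import math

def lag_years(gap_factor, doublings_per_year):
    """Years needed to close a multiplicative gap at a given doubling rate."""
    return math.log2(gap_factor) / doublings_per_year

main_horizon_min = 300    # five-hour horizon from the quoted example
visual_horizon_min = 3    # corresponding visual computer-use horizon
rate = 2                  # "roughly two doublings each year"

gap = main_horizon_min / visual_horizon_min   # 100x
lag_100x = lag_years(gap, rate)
lag_40x = lag_years(40, rate)                 # lower end of the 40-100x range

print(f"100x gap: ~{lag_100x:.1f} years behind; 40x gap: ~{lag_40x:.1f} years behind")
```

On this reading, visual computer-use horizons would trail the main results by somewhere between about 2.7 and 3.3 years.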
  
  data-point time-horizon computer-use
2. fxp007 07 May 2026
  
  in Public
  
  By the end of the year, we expect AI to be able to do tasks roughly one day long with a 50% success rate. In comparison, I'd guess that this task would take several days for a person familiar with the paper and is able to play around with the web interface.
  
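  The author cites METR's time-horizon forecast: by the end of 2026, AI should complete day-long tasks with roughly a 50% success rate. This gives a quantitative basis for capability forecasts, but it also shows the gap between AI and humans on complex tasks, implying that AI still has significant room to improve in some areas.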
  
  data-point time-horizon ai-capabilities

Tags

data-point

time-horizon

computer-use

ai-capabilities

Annotators

fxp007

URL

epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job
Apr 2026
sakana.ai sakana.ai

https://sakana.ai/marlin-beta/

1
1. fxp007 09 Apr 2026
  
  in Public
  
  The AI autonomously carries out research for nearly 8 hours and delivers structured summary slides and a comprehensive research report of several dozen pages.
  
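  Eight hours of autonomous research, ending in structured slides plus a full report of several dozen pages: this task length lines up closely with METR's "time horizon" framework, since 8 hours is about the upper limit of what today's top AI agents can complete reliably. Sakana's choice of duration is no accident but capability-calibrated product design; they are building a product that sits just inside the current boundary of AI capability.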
  
  8-hours-research time-horizon product-calibration surprising

Tags

8-hours-research

product-calibration

surprising

time-horizon

Annotators

fxp007

URL

sakana.ai/marlin-beta/
metr.org metr.org

The Org Uplift Game - METR

1
1. fxp007 09 Apr 2026
  
  in Public
  
  three METR researchers played themselves, with their current priorities, but pretending they had access to ~200-hour time horizon AIs – roughly what we expect 12–18 months from now.
  
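  A startling timeline: METR expects AIs with a 200-hour time horizon to arrive within 12-18 months, that is, before the end of 2027. The strongest current models (early 2026) have a time horizon of about 12 hours, so in under two years the complexity of tasks AI can complete independently would rise by roughly 17x. This is not science fiction but METR's exponential extrapolation from measured data, and they are already preparing their organization for that future.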
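
The arithmetic behind the 17x figure, and the doubling time it implies, can be checked directly. This is a small sketch that takes the note's 12-hour starting point and the quoted 200-hour target at face value; the variable names are illustrative.

```python
import math

current_horizon_h = 12    # starting point used in the note
target_horizon_h = 200    # "~200-hour time horizon" from the quoted passage

factor = target_horizon_h / current_horizon_h   # ~16.7x
doublings_needed = math.log2(factor)            # ~4.06 doublings

# Doubling time implied at each end of the "12-18 months" window.
fast_months = 12 / doublings_needed
slow_months = 18 / doublings_needed

print(f"{factor:.1f}x = {doublings_needed:.2f} doublings")
print(f"implied doubling time: {fast_months:.1f} to {slow_months:.1f} months")
```

So the forecast corresponds to a doubling time of roughly three to four and a half months over that window.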
  
  200h-time-horizon timeline-prediction 12-18-months surprising

Tags

timeline-prediction

200h-time-horizon

surprising

12-18-months

Annotators

fxp007

URL

metr.org/notes/2026-03-19-org-uplift-game/
metr.org metr.org

Task-Completion Time Horizons of Frontier AI Models

1
1. fxp007 09 Apr 2026
  
  in Public
  
  The task-completion time horizon is the task duration (measured by human expert completion time) at which an AI agent is predicted to succeed with a given level of reliability.
  
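  Surprisingly, the "time horizon" measures not how long the AI takes but how long a human would need to complete the same task. This design decision reflects a deeper choice in evaluation philosophy: human labor time, not the AI's actual runtime, is the yardstick for task difficulty. A "2-hour time horizon" is therefore a statement about task complexity, not about AI speed. The two are often conflated, and that conflation is one source of public misunderstanding of AI capabilities.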
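
One way to make the definition concrete is to model success probability as a logistic function of the logarithm of human completion time, then solve for the duration at which the predicted success rate equals the chosen reliability. The sketch below does this with made-up coefficients; the functional form, the parameter values, and the function names are all assumptions for illustration, not METR's published fit.

```python
import math

def success_probability(human_minutes, intercept, slope):
    """Predicted success rate for a task that takes a human `human_minutes`."""
    z = intercept - slope * math.log2(human_minutes)
    return 1 / (1 + math.exp(-z))

def time_horizon(reliability, intercept, slope):
    """Human task duration (minutes) at which predicted success equals `reliability`."""
    logit = math.log(reliability / (1 - reliability))
    return 2 ** ((intercept - logit) / slope)

# Hypothetical fitted coefficients, chosen only to give round numbers.
intercept, slope = 4.2, 0.6

h50 = time_horizon(0.5, intercept, slope)   # 50% reliability horizon
h80 = time_horizon(0.8, intercept, slope)   # stricter 80% reliability horizon

print(f"50% horizon: ~{h50:.0f} min, 80% horizon: ~{h80:.0f} min")
```

Note that the input is always human minutes: nothing in the calculation refers to how long the AI itself runs, and a stricter reliability level yields a shorter horizon for the same model.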
  
  time-horizon definition measurement-philosophy surprising

Tags

measurement-philosophy

definition

surprising

time-horizon

Annotators

fxp007

URL

metr.org/time-horizons/
Jul 2018
wendynorris.com wendynorris.com

469_CSCW-Time-Paper_Final_resubmit_EH_IE_EH2_IE_MM

1
1. wendynorris 18 Jul 2018
  
  in Public
  
  appointment. Time chunks open up the possibility for future-oriented temporal manipulation and valuation; they assume that we are able to know, in advance, the duration of tasks and experiences.
  
  How does the idea of time chunks and future-orientation fit with:
  
  Reddy's temporal horizon concept? Zimbardo's future time perspective?
  
  chunkable time temporal logic circumscribed time temporal horizon future time perspective sociotemporality

Tags

chunkable time

future time perspective

circumscribed time

sociotemporality

temporal horizon

temporal logic

Annotators

wendynorris

URL

wendynorris.com/wp-content/uploads/2018/07/Mazmanian-Erickson-Harmon-2015-Circumscribed-time-and-porous-time-Logics-as-a-way-of-studying-temporality.pdf
wendynorris.com wendynorris.com

The Language of Time: Toward a Semiotics of Temporality

1
1. wendynorris 18 Jul 2018
  
  in Public
  
  Timing as a
  
  Could the multiple temporalities that symbolize importance account for a source of tension between always online volunteers and those who show up for random periods of time?
  
  Deployments have fixed time periods for data collection but no scheduling mechanisms for volunteers. Does this create a source of friction when there is no mechanism to signal social intent or meaning?
  
  How does this problem get reflected in Reddy's TRH model or Mazmanian's porous time idea?
  
  How can you manage social coordination of rhythms/horizons when there is no signal to convey intent/commitment?
  
  What part of the SBTF social coordination is spectral, mosaic, rhythmic and/or obligated? And when is it not?
  
  semiotics timing temporal trajectory temporal rhythm temporal horizon porous time sociotemporality

Tags

timing

semiotics

porous time

sociotemporality

temporal trajectory

temporal horizon

temporal rhythm

Annotators

wendynorris

URL

wendynorris.com/wp-content/uploads/2018/07/Zerubavel-1987-The-Language-of-Time-Toward-a-Semiotics-of-Temporality.pdf
