Hypothesis

15 Matching Annotations

Last 7 days
epoch.ai

RIP Classic Reasoning Benchmarks. What's Next? - Epoch AI

1
1. fxp007 07 May 2026
  
  in Public
  
  humans can do this in well under half an hour.
  
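  Humans can complete the IKEA furniture assembly task in under half an hour, while AI systems reach only 40% accuracy. The contrast highlights the significant gap between AI and humans on tasks that require practical, hands-on understanding. The difference in time efficiency also underscores the importance of the time dimension in benchmarking.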
  
  data-point human-baseline time-efficiency

Tags

time-efficiency

data-point

human-baseline

Annotators

fxp007

URL

epoch.ai/gradient-updates/rip-classic-benchmarks
subq.ai

https://subq.ai/introducing-subq

2
1. fxp007 07 May 2026
  
  in Public
  
  SubQ Sparse Attention is 52× faster than FlashAttention in our architecture-level comparison, while requiring 63% less compute.
  
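  SubQ Sparse Attention is 52× faster than FlashAttention and requires 63% less compute. This is a notable performance figure, suggesting that SubQ has achieved a major advance at the architecture level, both increasing speed and substantially cutting compute cost.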
  
  data-point performance efficiency
2. fxp007 07 May 2026
  
  in Public
  
  With a research result at 12 million tokens, SubQ's architecture reduces attention compute by almost 1,000x compared to other frontier models.
  
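  This is a striking performance figure: the SubQ architecture cuts attention compute by nearly 1,000x while supporting a 12-million-token context. The data point is highly persuasive, indicating an orders-of-magnitude gain in compute efficiency over existing frontier models.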
  
  data-point performance efficiency
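
The roughly 1,000x figure can be put in perspective with a back-of-the-envelope count of attention score computations. The sketch below is not SubQ's method, which the quoted excerpt does not describe. It assumes dense attention scores every query-key pair (n² in total) and compares that with a hypothetical sparse scheme in which each query attends to a fixed budget of `k` keys. The function names and the value of `k` are illustrative assumptions.

```python
def dense_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by dense attention: every token attends to every token."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, keys_per_query: int) -> int:
    """Pairs scored when each query attends to at most keys_per_query keys (hypothetical scheme)."""
    return n_tokens * min(keys_per_query, n_tokens)


def reduction_factor(n_tokens: int, keys_per_query: int) -> float:
    """How many times fewer pairs the sparse scheme scores than dense attention."""
    return dense_pairs(n_tokens) / sparse_pairs(n_tokens, keys_per_query)


if __name__ == "__main__":
    n = 12_000_000  # context length from the quoted claim
    k = 12_000      # assumed per-query budget, chosen only for illustration
    print(f"reduction: {reduction_factor(n, k):,.0f}x")
```

Under these assumptions, a 1,000x reduction at 12 million tokens corresponds to each query attending to about 12,000 keys. The reduction also grows with context length, so the same budget would give a much smaller factor at shorter contexts.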

Tags

efficiency

performance

data-point

Annotators

fxp007

URL

subq.ai/introducing-subq
epoch.ai

https://epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job

1
1. fxp007 07 May 2026
  
  in Public
  
  Overall, it usually takes me about two hours to do this task. If only it were as simple as a single copy and paste, life would be so much easier — or so I thought.
  
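  The author usually needs about two hours to complete the article publishing task, while AI performs very poorly on it. This time comparison highlights AI's limitations on seemingly simple tasks and supports Moravec's paradox. However, the author gives no specific figure for how long the AI takes on the task, which leaves the comparison incomplete.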
  
  data-point task-comparison time-efficiency

Tags

time-efficiency

task-comparison

data-point

Annotators

fxp007

URL

epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job
May 2026
huggingface.co

https://huggingface.co/papers/2604.19734

1
1. fxp007 01 May 2026
  
  in Public
  
  By predicting these unified tokens, it effectively leverages diverse human data to achieve state-of-the-art data efficiency and robust out-of-distribution (OOD) generalization.
  
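  This experimental result shows UniT's potential for using human data to achieve efficient and robust generalization, setting a new standard for data efficiency and generalization.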
  
  key-experiment data-efficiency

Tags

key-experiment

data-efficiency

Annotators

fxp007

URL

huggingface.co/papers/2604.19734
anderegg.ca

https://anderegg.ca/2026/04/22/llm-pricing-has-never-made-sense

1
1. fxp007 01 May 2026
  
  in Public
  
  They also have the benefits of running on hardware that’s sipping power most of the time, rather than slurping it down in massive data centres.
  
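  The advantage of local LLMs is that they draw little power most of the time, which may lower operating costs and reduce the need for large data centers.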
  
  energy-efficiency data-center-reduction

Tags

data-center-reduction

energy-efficiency

Annotators

fxp007

URL

anderegg.ca/2026/04/22/llm-pricing-has-never-made-sense
Apr 2026
openai.com

Introducing workspace agents in ChatGPT

1
1. fxp007 30 Apr 2026
  
  in Public
  
  What used to take reps 5-6 hours a week now runs automatically in the background on every deal.
  
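  This is a concrete efficiency figure: workspace agents can automate 5-6 hours of a sales rep's weekly work. That is roughly 12.5%-15% of working time saved each week (assuming a 40-hour week), a significant gain, especially for sales teams.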
  
  data-point efficiency productivity
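
The 12.5%-15% range in the note follows only if the working week is taken to be 40 hours, which the quoted source does not state. A minimal sketch of that arithmetic, with the 40-hour default and the function name as assumptions:

```python
def share_of_week(hours_saved: float, hours_per_week: float = 40.0) -> float:
    """Return hours_saved as a percentage of the working week.

    The 40-hour default is an assumption; the quoted source gives only the 5-6 hour figure.
    """
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    return 100.0 * hours_saved / hours_per_week


if __name__ == "__main__":
    low, high = share_of_week(5), share_of_week(6)
    print(f"{low:.1f}% to {high:.1f}% of a 40-hour week")
```

Passing a different `hours_per_week` shows how sensitive the percentage is to that assumption.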

Tags

efficiency

productivity

data-point

Annotators

fxp007

URL

openai.com/index/introducing-workspace-agents-in-chatgpt/
api-docs.deepseek.com

https://api-docs.deepseek.com/news/news260424

1
1. fxp007 30 Apr 2026
  
  in Public
  
  🔹 **DeepSeek-V4-Flash:** 284B total / 13B active params. Your fast, efficient, and economical choice.
  
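  DeepSeek-V4-Flash is markedly smaller than the Pro version: 284 billion total parameters and 13 billion active. The active-to-total parameter ratio is about 4.6%, slightly higher than the Pro version's. This design lets it deliver faster responses and lower cost while maintaining performance, making it suited to applications that need quick responses.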
  
  data-point model-parameters efficiency
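
The 4.6% figure in the note is the ratio of active to total parameters from the quoted line. A minimal sketch of that calculation follows; the function name is illustrative, and only the Flash figures (13B active, 284B total) come from the quote.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Fraction of a model's parameters that are active per token (both given in billions)."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


if __name__ == "__main__":
    flash = active_fraction(13, 284)  # figures from the quoted DeepSeek-V4-Flash line
    print(f"active share: {flash:.1%}")
```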

Tags

efficiency

model-parameters

data-point

Annotators

fxp007

URL

api-docs.deepseek.com/news/news260424
Jan 2023
hypothes.is

Hypothesis

1
1. haotianl 26 Jan 2023
  
  in Public
  
  The claim that individual learning may depend on the behavior of others highlights the importance of viewing the learning environment as a system involving multiple interacting participants.
  
  Learning context reminds me of personalized learning context theory. Stephen Downes (2010) pointed out that the learning context is a loose collection of learners, tools, resources, and services, and that it is also a new way of using the power of networks. In a personalized learning context, learners are undoubtedly the main participants in teaching and learning activities. We can assume that in a passive process, such as listening to an instructor's points without any learner interaction, it is hard for learners to improve their creativity and learning efficiency. Many designers of online learning environments build discussion forums into the learning system to record learners' interactions with other learners, such as the questions they ask and their responses to others' questions. The system can capture learners' study-related data, then analyze and assess their cognitive levels using algorithms such as the Proficiency Model.

Annotators

haotianl

URL

hypothes.is/groups/85b1vJWn/educ6144-001
Aug 2020
osf.io

Metascience as a scientific social movement

1
1. katietaylor_99 11 Aug 2020
  
  in BehSci
  
  Peterson, David, and Aaron Panofsky. ‘Metascience as a Scientific Social Movement’. Preprint. SocArXiv, 4 August 2020. https://doi.org/10.31235/osf.io/4dsqa.
  
  is:preprint lang:en metascience reproducibility crisis integrity communication science policy strategy statements quantification experimentation improve efficiency efficiency open science science governance preprint data repositories Big Science

Tags

efficiency

preprint

science governance

strategy statements

reproducibility crisis

data repositories

communication

improve efficiency

is:preprint

lang:en

metascience

science policy

quantification

open science

integrity

Big Science

experimentation

Annotators

katietaylor_99

URL

osf.io/preprints/socarxiv/4dsqa/
Sep 2019
lateraleconomics.com.au

Untitled document

1
1. cpsupolicyresearch 24 Sep 2019
  
  in Public
  
  Cost reduction suggestion
  
  there may be ways to reduce costs associated with the development of Census-equivalent statistics, including relying less on the general public to answer questions every five years
  
  ABS Census Efficiency Data 2019 Lateral Economics Cost reduction

Tags

ABS

Cost reduction

Lateral Economics

Census

Data

Efficiency

2019

Annotators

cpsupolicyresearch

URL

lateraleconomics.com.au/wp-content/uploads/LE-Census-Report-ABS-Full-19-Sept.pdf
May 2014
www-group.slac.stanford.edu

Untitled document

1
1. aculich 20 May 2014
  
  in Public
  
  SSPP # 7.2 Power Usage Effectiveness (PUE) (Electronic Maximum annual weighted average PUE of 1.4 by FY15 )
  
  SLAC target PUE of 1.4 by FY15
  
  PUE SLAC data center efficiency data centers

Tags

PUE

data center efficiency

SLAC

data centers

Annotators

aculich

URL

www-group.slac.stanford.edu/fac/docs/SLAC_Sustainability_Plan_FY13.pdf
blogs.berkeley.edu

Untitled document

1
1. aculich 20 May 2014
  
  in Public
  
  Google’s ultra-efficient data centers, with a PUE of 1.12, are beating the PUE curve by miles.
  
  Google's PUE is 1.12
  
  PUE data center efficiency data centers Google

Tags

Google

PUE

data center efficiency

data centers

Annotators

aculich

URL

blogs.berkeley.edu/2014/04/01/how-bit-met-watt/comment-page-1/
www.i2sl.org

Untitled document

1
1. aculich 20 May 2014
  
  in Public
  
  When the project is complete later this year (all done while the existing data center remained in operation!), the data center's annual PUE will drop from 1.5 to 1.2, saving 20 percent of its annual electrical cost.
  
  Warren Hall target efficiency: 1.2 as of 2011
  
  UCBerkeley PUE data center efficiency data centers
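
PUE is total facility energy divided by IT equipment energy, so for a fixed IT load the total energy scales linearly with PUE. A minimal sketch of the arithmetic behind the quoted 20 percent saving, assuming the IT load and the electricity price stay constant (the function names and the 1,000 kWh example load are illustrative):

```python
def facility_energy(it_energy_kwh: float, pue: float) -> float:
    """Total facility energy implied by a PUE value for a given IT equipment load."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


def saving_fraction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in total energy when PUE improves and the IT load is unchanged."""
    return 1.0 - pue_after / pue_before


if __name__ == "__main__":
    it_load = 1_000.0  # hypothetical IT load in kWh, for illustration only
    before = facility_energy(it_load, 1.5)
    after = facility_energy(it_load, 1.2)
    print(f"{before:.0f} kWh -> {after:.0f} kWh, saving {saving_fraction(1.5, 1.2):.0%}")
```

The saving depends only on the ratio of the two PUE values, so the choice of IT load does not affect the percentage.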

Tags

PUE

data center efficiency

data centers

UCBerkeley

Annotators

aculich

URL

i2sl.org/labs21/conference/2011/abstracts/e6_soladay_1.html
www.mghpcc.org

Untitled document

1
1. aculich 20 May 2014
  
  in Public
  
  The MGHPCC is targeting a PUE of less than 1.3. A recent report cites typical data center PUEs at 1.9. This means that our facility can expect to
  
  Target of 1.3 (vs typical data centers around 1.9) PUE
  
  MGHPCC PUE data center efficiency data centers

Tags

PUE

data center efficiency

MGHPCC

data centers

Annotators

aculich

URL

mghpcc.org/about/what-are-the-green-design-aspects-of-the-mghpcc/