Hypothesis

7 Matching Annotations

May 2026
www.anthropic.com

https://www.anthropic.com/research/glasswing-initial-update

1
1. fxp007 22 May 2026
  
  in Public
  
  90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity
  
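  These two percentages (a 90.6% validation rate and a 62.4% confirmed high-severity rate) are essential for assessing how reliable AI models are at detecting security vulnerabilities. The 90.6% validation rate indicates a relatively low false-positive rate, which is a strong result in AI security. However, the 62.4% confirmed high-severity rate means that nearly 40% of the vulnerabilities the AI assessed as high severity turned out to be less severe, which shows that AI severity assessment still has room for improvement.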
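
  A quick arithmetic check on the quoted figures, as a minimal sketch. It assumes both percentages share one denominator and that the 1,094 high- or critical-severity findings are a subset of the 1,587 valid ones; the quote states neither, and the variable names are illustrative only.

```python
# Counts and percentages as quoted in the annotated passage.
valid_true_positives = 1587   # reported as 90.6%
high_or_critical = 1094       # reported as 62.4%

# Assumption: both percentages are taken over the same total,
# which the quoted sentence does not state explicitly.
implied_total_from_valid = valid_true_positives / 0.906
implied_total_from_severe = high_or_critical / 0.624

# Assumption: the high/critical findings are a subset of the valid ones.
severe_share_of_valid = high_or_critical / valid_true_positives

print(round(implied_total_from_valid), round(implied_total_from_severe))
print(f"{severe_share_of_valid:.1%}")
```

  Under those assumptions the two implied totals agree to within rounding, and the high- or critical-severity share of the valid findings comes out higher than the 62.4% headline figure, because it is measured against a smaller base.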
  
  data-point, accuracy-metrics, ai-reliability

Tags

accuracy-metrics

ai-reliability

data-point

Annotators

fxp007

URL

anthropic.com/research/glasswing-initial-update
www.anthropic.com

Natural Language Autoencoders

1
1. fxp007 15 May 2026
  
  in Public
  
  The NLA consists of the AV and AR, which, together, form a round trip: original activation → text explanation → reconstructed activation. We score the NLA on how similar the reconstructed activation is to the original.
  
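  The NLA forms a closed loop from an activation explainer (AV) and an activation reconstructor (AR), and judges the accuracy of an explanation by the quality of the reconstruction. This novel approach offers a new paradigm for interpreting AI internal representations.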
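
  The round trip in the quote can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_av` and `toy_ar` are hypothetical stand-ins for the AV and AR (the real components are models, not string formatting), and cosine similarity is one possible similarity score, not necessarily the one the source uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def toy_av(activation):
    """Hypothetical AV stand-in: turn an activation into lossy text."""
    return " ".join(f"{x:.1f}" for x in activation)

def toy_ar(text):
    """Hypothetical AR stand-in: rebuild an activation from the text."""
    return [float(token) for token in text.split()]

# Round trip: original activation -> text explanation -> reconstructed activation.
original = [0.12, -0.87, 0.45]
explanation = toy_av(original)
reconstructed = toy_ar(explanation)

# Score the round trip by how close the reconstruction is to the original.
score = cosine_similarity(original, reconstructed)
print(explanation, f"{score:.4f}")
```

  The text step here is deliberately lossy (one decimal place), so the score falls just short of 1; a more informative explanation would allow a closer reconstruction.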
  
  AI architecture, reconstruction accuracy

Tags

AI architecture

reconstruction accuracy

Annotators

fxp007

URL

anthropic.com/research/natural-language-autoencoders
arstechnica.com

https://arstechnica.com/staff/2026/04/our-newsroom-ai-policy/

1
1. fxp007 01 May 2026
  
  in Public
  
  We do not publish AI-generated images, audio, or video as authentic documentation of real events.
  
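  This rule states that Ars Technica will not present AI-generated images, audio, or video as evidence of real events, reflecting a commitment to authenticity.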
  
  ai-generated-content, authenticity-standards, media-accuracy

Tags

authenticity-standards

ai-generated-content

media-accuracy

Annotators

fxp007

URL

arstechnica.com/staff/2026/04/our-newsroom-ai-policy/
Apr 2026
www.microsoft.com

https://www.microsoft.com/en-us/research/blog/adele-predicting-and-explaining-ai-performance-across-tasks/

1
1. fxp007 16 Apr 2026
  
  in Public
  
  Using these ability scores, the method predicts performance on new tasks with ~88% accuracy, including for models such as GPT-4o and Llama-3.1.
  
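  Surprisingly, the ADeLe method can predict an AI model's performance on new tasks with about 88% accuracy, including for advanced large models such as GPT-4o and Llama-3.1. This predictive power far exceeds that of traditional evaluation methods and represents a revolutionary breakthrough in AI performance evaluation, letting researchers anticipate more reliably how a model will perform on unseen tasks.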
  
  surprising, ai-performance, prediction-accuracy

Tags

ai-performance

prediction-accuracy

surprising

Annotators

fxp007

URL

microsoft.com/en-us/research/blog/adele-predicting-and-explaining-ai-performance-across-tasks/
reducto.ai

https://reducto.ai/blog/reducto-deep-extract-agent

1
1. fxp007 08 Apr 2026
  
  in Public
  
  We've seen customers go from 10-20% field accuracy with a frontier model to 99-100% just by switching to using Reducto's Deep Extract.
  
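  Most people assume that going from a frontier model to near-perfect accuracy requires a fundamental technical breakthrough or large amounts of training data. But the authors claim that simply switching to Deep Extract raises accuracy from 10-20% to 99-100%. A gain of this size runs counter to the improvement curve the industry usually expects and suggests that existing methods may have fundamental flaws.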
  
  non-consensus, performance-improvement, ai-accuracy

Tags

performance-improvement

ai-accuracy

non-consensus

Annotators

fxp007

URL

reducto.ai/blog/reducto-deep-extract-agent
Mar 2021
twitter.com

Tweet / Twitter

1
1. NatasjaDerbyMcCabe 02 Mar 2021
  
  in BehSci
  
  ReconfigBehSci. (2020, November 9). Session 2: The policy interface followed with a really helpful presentation by Lindsey Pike, from Bristol, and then panel discussion with Mirjam Jenny (Robert Koch Institute), Paulina Lang (UK Cabinet Office), Rachel McCloy (Reading Uni.), and Rene van Bavel (European Commission) [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1325795286065815552
  
  is:tweet, lang:en, policy interface, accuracy, trust, transparency, bias, AI support tools, tension, close science-policy, science-general, public trust, UK, University of Bristol, University of Reading, European Commission

Tags

University of Bristol

European Commission

close science-policy

UK

AI support tools

trust

science-general

accuracy

transparency

public trust

is:tweet

policy interface

lang:en

tension

bias

University of Reading

Annotators

NatasjaDerbyMcCabe

URL

twitter.com/scibeh/status/1325795286065815552
Sep 2020
psyarxiv.com

Unifying recommendation and active learning for information filtering and recommender systems

1
1. katietaylor_99 07 Sep 2020
  
  in BehSci
  
  Yang, Scott Cheng-Hsin, Chirag Rank, Jake Alden Whritner, Olfa Nasraoui, and Patrick Shafto. ‘Unifying Recommendation and Active Learning for Information Filtering and Recommender Systems’. Preprint. PsyArXiv, 25 August 2020. https://doi.org/10.31234/osf.io/jqa83.
  
  is:preprint, lang:en, active learning, information filtering, recommender system, algorithms, Internet, AI, artificial intelligence, machine learning, predictive accuracy, recommendation accuracy, exploration-exploitation tradeoff, parameterized model, cognitive science, computer science, experimental approach

Tags

parameterized model

cognitive science

is:preprint

artificial intelligence

recommender system

recommendation accuracy

AI

information filtering

lang:en

computer science

Internet

machine learning

exploration-exploitation tradeoff

experimental approach

predictive accuracy

active learning

algorithms

Annotators

katietaylor_99

URL

psyarxiv.com/jqa83/
