Hypothesis

2 Matching Annotations

May 2026
arxiv.org

https://arxiv.org/abs/2605.06445

1
1. fxp007 24 May 2026
  
  in Public
  
  Capable configurations lose 30 points on average in assertion pass rates from baseline to fully specified tasks, while some weaker configurations approach zero.
  
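  Most people would probably assume that more capable LLM configurations still perform relatively well under strict constraints, but the study shows that even the best configurations drop by 30 percentage points on average, which challenges our understanding of how well LLMs adapt.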
  
  non-consensus performance-decline llm-robustness

Tags

performance-decline

llm-robustness

non-consensus

Annotators

fxp007

URL

arxiv.org/abs/2605.06445

Aug 2022
www.theatlantic.com

Is Moderna Really Better Than Pfizer—Or Is It Just a Higher Dose?

1
1. jackiekrauss 29 Aug 2022
  
  in BehSci
  
  Gutman, R. (2021, October 28). Is Moderna Really Better Than Pfizer—Or Is It Just a Higher Dose? The Atlantic. https://www.theatlantic.com/health/archive/2021/10/pfizer-moderna-dose-which-vaccine-best/620501/
  
  is:news lang:en Moderna Pfizer-BioNTech higher dose performance vaccine COVID-19 effectiveness decline antibody count hospitalization

Tags

Moderna

COVID-19

decline

performance

is:news

lang:en

higher dose

vaccine

hospitalization

antibody count

Pfizer-BioNTech

effectiveness

Annotators

jackiekrauss

URL

theatlantic.com/health/archive/2021/10/pfizer-moderna-dose-which-vaccine-best/620501/