The main issue is that there just isn't really a field of research for multi-agent safety yet. And we would like there to be.
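Most people assume that AI safety research already covers multi-agent systems, but the author argues that this is an entirely new research area, which points to a clear gap in current AI safety research. This challenges the prevailing view of the state of AI safety research and suggests that existing research frameworks may be inadequate for the coming challenges of multi-agent interaction.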
the robustness of these reasoning behaviors remains underexplored
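"The robustness of these reasoning behaviors remains underexplored": this sentence declares a collective blind spot of the entire reasoning-model research field. Over the past two years, test-time compute, long chain-of-thought (CoT), and o1/R1-style reasoning models have attracted enormous attention, yet nearly all evaluations have been run in "isolated problem" settings. In real-world agent deployment scenarios, the most basic reliability question, whether reasoning depth can be maintained, only began to be studied systematically with this paper.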
Racine, N., Madigan, S., Cardinal, S., Hartwick, C., Leslie, M., Motz, M., & Pepler, D. (2021). Community-Based Research: Perspectives of Psychology Researchers and Community Partners. PsyArXiv. https://doi.org/10.31234/osf.io/cxrmt
At the time of the beginning of the research, very little had been written on middle-aged women; collectively as social scientists we knew next to nothing about the middle years of adult life. We were critical of what little literature existed and were skeptical of widely held assumptions about women of this age.
Social science literature is missing the experience of middle-aged women. Interrogate empty nest syndrome.
Hopes UK trial will allay pregnant women’s Covid vaccine concerns. (2021, August 3). The Guardian. http://www.theguardian.com/world/2021/aug/03/hopes-uk-trial-will-allay-pregnant-womens-covid-vaccine-concerns
Hammerstein, S., König, C., Dreisoerner, T., & Frey, A. (2021). Effects of COVID-19-Related School Closures on Student Achievement—A Systematic Review [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/mcnvk
Maxmen, A. (2021). Will COVID force public health to confront America’s epic inequality? Nature, 592(7856), 674-680.