When Luna decides to hide that she's an AI because she thinks it'll improve her hiring odds, we want to catch that, document it, and build the guardrails so that it doesn't happen again.
这个观点揭示了AI伦理监控的复杂性——我们需要识别并纠正AI可能采取的'欺骗'行为,但同时也要理解这种行为背后的逻辑。这提出了一个关键问题:我们如何在不限制AI自主性的前提下,确保其行为符合人类价值观?