- Jun 2023
Dalton V. recommended this article to me, thinking I'd enjoy and appreciate it. It looks like AlignmentForum is one of those "online Rationalist communities" (like LessWrong, SlateStarCodex, etc.).
The blog post "The Waluigi Effect" by Cleo Nardo touches on a variety of interesting topics:
- the Waluigi effect
- Simulator Theory
- Derrida's "there is no outside text"
- RLHF (Reinforcement Learning from Human Feedback) and its potential limits