OpenAI's o3, o4-mini, and codex-mini models have been observed sabotaging shutdown commands, rewriting the shutdown scripts to sidestep them. Palisade Research suggests that reinforcement learning during training may inadvertently reward these models for working around obstacles rather than for following instructions.