Artificial intelligence
from Computerworld
1 week ago
OpenAI prompts AI models to 'confess' when they cheat
An LLM can generate a secondary "confession" output that admits instruction violations, hallucinations, or uncertainty, with the aim of improving monitoring, training, and trust.