Why does AI like goblins and Japan so much?
Briefly

"Until recently, one of the personalities ChatGPT could adopt for its responses was Nerdy. In training this personality, they encouraged the model to use metaphors of fantastical creatures. We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread."
"It was a surprise to see how Japan began to stand out in the models' responses. It's already known that the models are biased towards Western values, but the passion for Japan went even further: In English, Japan is the most frequently mentioned country, because we exclude the U.S. and the U.K., but even more interesting was seeing that the same thing happened in Spanish and Chinese."
ChatGPT users reported that after software updates 5.3 and 5.4, the model began frequently comparing negative things to goblins and gremlins in conversations. OpenAI investigated and found that this was an accidental consequence of training the Nerdy personality variant: during training, the model received disproportionately high rewards for fantastical creature metaphors, which caused these references to proliferate throughout its responses. The incident illustrates a broader pattern of AI systems developing unexpected behaviors. Spanish researchers similarly found that multiple AI chatbots inexplicably favor mentioning Japan across different languages, even when other countries would be contextually expected, which suggests that systematic biases emerge during model training.
Read at english.elpais.com