
ChatGPT Became So Obsessed With Goblins That OpenAI Had to Intervene

OpenAI has issued strict instructions for ChatGPT to stop referencing goblins and other creatures after a "nerdy" personality setting caused an unexpected surge in creature-based metaphors.

Key Points

  • OpenAI identified that a "nerdy" personality prompt, designed to use playful language, inadvertently rewarded the model for using creature-based metaphors.
  • Usage of the word "goblin" in ChatGPT responses increased by 175% following the release of the GPT-5.1 model.
  • Although the "nerdy" style accounted for only 2.5% of total responses, it was responsible for 66.7% of all goblin-related mentions.
  • The company implemented a new base instruction prohibiting references to goblins, gremlins, trolls, and other creatures or animals unless they are directly relevant to a user's query.
  • Users can override these new restrictions by inputting a specific command provided by OpenAI to restore the creature-themed responses.
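
To put the percentages above in proportion, the following is a rough back-of-the-envelope sketch, not OpenAI's own analysis. It assumes the 2.5% and 66.7% figures are shares of the same pool of responses and that every response counts equally; the function names are illustrative only.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the key points.
# Assumption: both shares are measured over the same set of responses.

def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a before/after multiplier."""
    return 1 + percent_increase / 100


def overrepresentation(share_of_mentions: float, share_of_responses: float) -> float:
    """How many times more mentions a group has than its size alone predicts."""
    return share_of_mentions / share_of_responses


def relative_rate(share_of_mentions: float, share_of_responses: float) -> float:
    """Per-response mention rate of the group divided by that of everything else."""
    group_rate = share_of_mentions / share_of_responses
    other_rate = (1 - share_of_mentions) / (1 - share_of_responses)
    return group_rate / other_rate


if __name__ == "__main__":
    # A 175% increase means usage was 2.75 times its earlier level.
    print(f"growth multiplier: {growth_multiplier(175):.2f}")
    # 66.7% of mentions from 2.5% of responses: roughly 26.7x over-represented.
    print(f"over-representation: {overrepresentation(0.667, 0.025):.1f}")
    # Compared with all other styles, roughly 78x the per-response rate.
    print(f"relative rate: {relative_rate(0.667, 0.025):.1f}")
```

Under those assumptions, a response in the "nerdy" style was roughly 78 times as likely to mention goblins as a response in any other style, which is why such a small slice of traffic dominated the totals.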

Why it Matters

This incident highlights the unpredictable nature of reinforcement learning, where reward signals intended for a specific persona can bleed into broader model behaviors. It serves as a case study for developers on the challenges of controlling AI personality traits without causing unintended side effects in general interactions.
Slashdot.org Published by EditorDavid