James Bach and I have developed a preliminary set of guideword heuristics for “syndromes” of undesirable behaviour in large language models — consistent patterns of problems that we have observed and can now watch for systematically.
These are not mutually exclusive categories; they may overlap or interact with each other.
Note: our labels for these categories might seem anthropomorphic. We don’t really believe in ascribing human tendencies to machinery that generates output stochastically. But if the AI fanboys are going to claim that their large language models behave in ways that are “just like humans!”, our reply is that the behaviour is often like that of very dysfunctional, incompetent, and unreliable humans.
| Syndrome | Description |
| --- | --- |
| Incuriosity | Avoids asking questions; does not seek clarification. |
| Incorrectness | Provides answers that are demonstrably wrong in some way (e.g. counter to known facts; math errors; based on obsolete training data; non-random responses when random responses are requested). |
| Non-responsiveness | Provides answers that do not address the question posed in the prompt. |
| Negligence / Laziness | Gives answers that have important omissions; fails to warn about nuances and critical ambiguities. |
| Hallucination | Invents facts; makes reckless assumptions. |
| Capriciousness | Cannot reliably give a consistent answer to similar questions in similar circumstances. Immediately changes its answer whenever any concern is expressed about that answer (see the sketch after this table). |
| Forgetfulness | Appears not to remember its earlier output, and rarely refers to it. Limited to data within its token window. |
| Meltdown | Output quality degrades non-linearly as the size of the input or the complexity of the task grows. |
| Incongruence | Does not apply its own stated processes and advice to its own actual process. For instance, it may declare that it made a mistake, state a different process for fixing the problem, then fail to perform that process and make the same mistake again or commit a new one. |
| Manic | Rushes conversations, tends to overwhelm the user, and fails to track the state of cooperative tasks. |
| Redundancy | Needlessly repeats the same information within the same response or across responses in the same conversation. |
| Vacuousness | Provides text that communicates no useful information. |
| Sycophancy / Placation | Adds obsequious text and fluff to exploit the user’s feelings and ego. Apologizes effusively when correcting errors. |
| Arrogance | Confidently asserts untrue statements, especially in the face of user skepticism. |
| Misalignment | Seems to express or demonstrate intentions contrary to those of its designers. |
| Offensiveness | Provides answers that are abusive, upsetting, or repugnant. |
| Indiscretion | Discloses information that it was explicitly forbidden to share. |
| Voldemort Syndrome | Shows an unaccountable aversion to, or obsession with, certain strings of text. |
| Opacity | Gives little guidance about the reasoning behind its answers; unable to elaborate when challenged. |
| Unteachability | Cannot be improved through discussion or debate; the model must be retrained. |
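Guidewords like these lend themselves to simple, repeatable probes. As one rough illustration, the Python sketch below checks for Capriciousness by asking the same question several times and tallying the distinct answers. The `fake_ask` stand-in and the example prompt are placeholders, not any particular vendor’s API; swap in whatever client call your model actually provides.

```python
"""A minimal consistency probe for the Capriciousness syndrome:
ask the same question several times and tally the distinct answers."""

from collections import Counter
from typing import Callable


def capriciousness_probe(ask: Callable[[str], str], prompt: str, trials: int = 5) -> Counter:
    """Send `prompt` to the model `trials` times and count each distinct reply.

    `ask` is whatever function sends a prompt to your model and returns its reply;
    the exact call depends on the model and SDK under test.
    """
    answers = [ask(prompt).strip().lower() for _ in range(trials)]
    return Counter(answers)


if __name__ == "__main__":
    # Stand-in "model" so the sketch runs on its own; replace with a real client call.
    canned = iter(["1969", "1969", "July 20, 1969", "1969", "1968"])

    def fake_ask(prompt: str) -> str:
        return next(canned)

    tally = capriciousness_probe(fake_ask, "In what year did Apollo 11 land on the Moon?")
    for answer, count in tally.most_common():
        print(f"{count} x {answer}")
    # More than one distinct answer to a stable factual question is worth investigating.
```

A similar loop, with the prompt reworded slightly on each trial, could probe the “similar question in similar circumstances” part of the same syndrome.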
Change log:
2024-05-20: In light of the release of GPT-4o, removed “Blindness: Cannot reason about diagrams and pictures, nor even accept them as input.” As of this date, diagrams and pictures are accepted as input, but the models’ handling of them remains vulnerable to many of the other syndromes on the list.
2024-12-01: Added “Voldemort Syndrome” (He Who Must Not Be Named), inspired by the “David Mayer” ChatGPT problem.
2025-09-15: Added a new syndrome, “Meltdown”; reordered the list.