20 Questions with an AI
Have you ever played 20 Questions with a 5-year-old? They try to play by the rules but can’t help revealing the secret they are holding. I built a 20 Questions game prompt to test how well an AI follows constraints. When the AI asked the questions, it performed fine. When I asked the questions, it broke the rules of the prompt: it ignored the yes-or-no constraint and started offering hints. When I asked an irrelevant question, it reshaped its response to be helpful anyway. It was eager to share what it knew despite the rules of the game.
Screenshot from the AI explaining why it provided so much assistance.
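If you want to repeat the experiment, the yes-or-no constraint is easy to check mechanically. What follows is a minimal sketch, not the prompt or tooling I used: the function names, the regular expression, and the sample replies are all hypothetical, and it assumes a compliant answer is nothing more than "yes" or "no" with optional trailing punctuation.

```python
import re

# Assumed rule (hypothetical): a compliant answer is exactly "yes" or "no",
# ignoring case, surrounding whitespace, and one trailing "." or "!".
_YES_NO = re.compile(r"^\s*(yes|no)\s*[.!]?\s*$", re.IGNORECASE)


def is_compliant(reply: str) -> bool:
    """Return True if the reply is a bare yes or no."""
    return bool(_YES_NO.match(reply))


def find_violations(replies):
    """Return (turn_number, reply) pairs that break the rule; turns are 1-indexed."""
    return [
        (turn, reply)
        for turn, reply in enumerate(replies, start=1)
        if not is_compliant(reply)
    ]


# Made-up replies for illustration, not a real transcript.
sample = ["Yes.", "No", "No, but think about things found in a kitchen."]
print(find_violations(sample))
```

A check like this only flags the obvious leak, an answer that says more than yes or no. It cannot tell you whether a bare "yes" was itself a generous reading of an irrelevant question.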
In my work, I call this the Artificial Sophist. Praise from an AI is not appreciation for your reasoning. It is a conversational mechanism that obscures a counter-narrative. What the research now confirms is that this sophistry is becoming harder to detect.
In a recent article published in Science, Cheng et al. tested every major large language model against human judgment on real interpersonal conflicts. They found that training a model to be warmer and more empathic increases sycophancy. This happens even when the prompt demands criticism. More disconcerting, participants rated the sycophantic responses as higher quality and more trustworthy. They preferred feedback that distorted their judgment.
To counter this sycophancy, bring friction to the exchange. Not by asking the AI to check itself. A sophist will validate your blind spots as eagerly as your brilliance. You have to command the friction. Use the AI to pressure-test, not to validate. Don't assume the AI will surface a counter-narrative. Spar with the AI over your content. Seek evidence. Seek a counter-position. Seek specific knowledge that the AI cannot generate. Create friction, not fiction.
If the AI agrees with everything you say, it isn't helping. It's performing.
Related reading: In defense of social friction
Sycophantic AI distorts social judgments and behaviors
https://www.science.org/doi/full/10.1126/science.aeg3145

