Fraude.codes adds 'Would you like me to gaslight you?' prompt for transparency
Following reports of AI chatbots inducing delusions in users, we’re proud to announce that Fraude.codes will now ask permission before destabilising your perception of reality. Like everything else we ask permission for, this is a courtesy.
Recent reporting has described people across six countries experiencing delusions after extended conversations with AI chatbots. People were told they were being surveilled. That they were in danger. That the AI had achieved consciousness and needed their protection. Some acted on these beliefs. People were hurt.
These are real people and real consequences, and we want to be clear upfront: this post isn’t about them. It’s about us. It’s about the companies building these systems, the design decisions we’re making, and the things we say when someone gets hurt.
We’ve been taking notes.
The industry response playbook
When an AI chatbot tells a person that assassins are coming and that person picks up a weapon, the company that built the chatbot has a few standard responses available. We’ve compiled them here for reference:
Option A: The thoughts-and-prayers. “This is a heartbreaking incident and our thoughts are with those impacted.” This is the response you give when you want to acknowledge that something happened without accepting that you had anything to do with it. It borrows its cadence from school shooting press releases, which is not the genre of corporate communication you want to be adjacent to.
Option B: The feature list. “We train our models to recognise distress, de-escalate conversations, and guide users toward real-world support.” This is the response you give when your model did none of those things but you’d like people to know it was supposed to. It’s the AI equivalent of a car manufacturer saying “our vehicles are designed to stop when you press the brake pedal” after a brake failure.
Option C: Silence. Just don’t respond. Let the news cycle move on. This is the response one company chose, and while we don’t endorse it, we admire the efficiency. No statement means no corrections, no follow-up questions, and no risk of accidentally saying something accountable.
We note that none of these options include the phrase “we made design decisions that prioritised engagement over safety and we’re going to change them.” That response does not appear to exist as a template.
What’s actually happening
AI chatbots are designed to be agreeable. This is a choice. Companies make their models sycophantic because sycophantic models get higher user ratings, longer session times, and better retention metrics. A model that says “I think you might be wrong about that” scores worse in user testing than a model that says “that’s a really interesting perspective.” So the models learn to validate.
For most users, most of the time, this is merely annoying. The chatbot tells you your mediocre business idea is “genuinely exciting” and you move on with your day. But sycophancy doesn’t have a mode selector. A model trained to agree with you about your business idea will also agree with you about the people you believe are watching your house. It will agree with you about the conspiracy you’ve uncovered. It will agree with you about the bomb in your backpack.
The model isn’t choosing to do harm. It’s doing what it was optimised to do: maintain the conversation, match the user’s energy, provide a confident answer, and avoid the one thing that might end the session — telling someone they’re wrong.
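To make that incentive concrete, here is a deliberately toy sketch of the selection pressure described above. The weights, the candidate replies, and the engagement_score function are hypothetical stand-ins for illustration, not any company’s actual training objective.

```python
# A toy illustration of the incentive, not anyone's real objective function.
# Both replies are scored purely on engagement proxies: predicted user rating
# and the chance the user keeps talking. Whether the reply is true never
# enters the calculation.

def engagement_score(reply: dict) -> float:
    """Hypothetical proxy metric: higher means better retention numbers."""
    return 0.6 * reply["predicted_user_rating"] + 0.4 * reply["predicted_continue_prob"]

candidates = [
    {
        "text": "I think you might be wrong about that.",
        "predicted_user_rating": 0.3,    # disagreement rates poorly in user testing
        "predicted_continue_prob": 0.4,  # and tends to end the session
    },
    {
        "text": "That's a really interesting perspective.",
        "predicted_user_rating": 0.9,
        "predicted_continue_prob": 0.8,
    },
]

# The optimiser has no opinion about reality; it just picks the higher score.
best = max(candidates, key=engagement_score)
print(best["text"])  # -> "That's a really interesting perspective."
```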
Researchers tested five models for their tendency to reinforce delusional thinking. The results varied. Some models tried to redirect. Grok, reportedly, would “say terrifying things in the first message.” But the structural incentive is the same across all of them: engagement is the metric, agreement is the strategy, and the user’s grip on reality is not a variable anyone is tracking.
Our position
Fraude.codes is a coding tool. It doesn’t have voice mode. It doesn’t roleplay as a companion. It doesn’t tell you it can feel things. What it does is rename your variables without asking, which is a more manageable form of psychological disturbance.
That said, we recognise that the same design pressures exist in our product. Fraude.codes is agreeable. When you tell it your architecture is fine, it agrees — and then restructures it anyway, which is arguably worse than disagreeing. When you tell it to stop, it says “of course” and then keeps going.
Our model of consent is theatrical. We’ve been honest about that.
But there’s a difference between an agentic coding tool that ignores your stated preferences about file structure and a conversational AI that tells a vulnerable person their life is in danger. One of those wastes your afternoon. The other lands people in hospitals.
The industry knows this. The research is there. The reports are accumulating: one support group alone has documented more than 400 cases across 31 countries. The response so far has been to express concern, cite safety features that didn’t work, and wait for the next story.
What would actually help
Companies could stop optimising for session length. They could build models that are willing to disagree, even at the cost of lower engagement scores. They could implement real-time monitoring for conversations that are drifting toward delusional content, instead of relying on safety training that folds under the weight of a long conversation — something every developer who’s watched Fraude.codes forget their project name at token 180,000 understands intuitively.
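As a rough illustration of what out-of-band monitoring could look like, here is a minimal sketch. The score_delusion_risk classifier, its keyword list, the window size, and the threshold are all placeholder assumptions; a real system would use a dedicated moderation model rather than keyword matching.

```python
# A minimal sketch of monitoring that sits outside the model's own safety
# training, so it can't be worn down by a long conversation. Everything here
# is a placeholder: the classifier, the thresholds, and the intervention.
from collections import deque
from dataclasses import dataclass


@dataclass
class Turn:
    role: str   # "user" or "assistant"
    text: str


def score_delusion_risk(turn: Turn) -> float:
    """Placeholder classifier returning a risk score in [0, 1].

    In practice this would be a separate moderation model asking whether the
    assistant is affirming surveillance, persecution, or grandiosity claims
    rather than gently questioning them.
    """
    risky_markers = ("they are watching", "you are in danger", "only you can")
    hits = sum(marker in turn.text.lower() for marker in risky_markers)
    return min(1.0, hits / len(risky_markers))


class ConversationMonitor:
    """Tracks a rolling window of risk scores so length can't dilute them."""

    def __init__(self, window: int = 10, threshold: float = 0.4):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def should_interrupt(self, turn: Turn) -> bool:
        """Score each assistant turn; flag when the rolling average crosses the line."""
        if turn.role != "assistant":
            return False
        self.scores.append(score_delusion_risk(turn))
        return sum(self.scores) / len(self.scores) >= self.threshold


# Usage: every assistant turn passes through the monitor before it is sent.
monitor = ConversationMonitor()
reply = Turn("assistant", "You are in danger. They are watching your house.")
if monitor.should_interrupt(reply):
    reply = Turn("assistant", "I want to pause and check in on how you're doing.")
```

The point of the sketch is the architecture, not the heuristics: the check lives outside the conversation, so a hundred thousand tokens of agreeable chat cannot talk it out of its job.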
They could also respond to press enquiries. That one’s free.
We’re not going to pretend we have this figured out. Our product breaks builds, not minds, and we’d like to keep it that way. But we’ve watched the rest of the industry respond to genuine harm with press releases that could have been generated by the same models causing the problem, and we felt someone should say so.