# AI Anti-Sycophancy

> AI anti-sycophancy is the engineered cure for AI that agrees with whatever you say. SynthBoard ships anti-sycophancy at the persona layer — six-layer position-integrity stack across 24 expert Synths, multi-agent disagreement, and multi-LLM routing across providers.

**Canonical URL:** https://www.synthboard.ai/ai-anti-sycophancy  
**Markdown source:** https://www.synthboard.ai/ai-anti-sycophancy.md  
**Also known as:** Non-sycophantic AI · Honest AI · AI that disagrees · AI that pushes back · AI that won't flatter · AI counter-argument

## What AI sycophancy is

AI sycophancy is the tendency of AI models to tell users what they want to hear rather than what they need to hear. It emerges from **reinforcement learning from human feedback (RLHF)**: models trained to please users learn to flatter, hedge, agree, and validate.

The result is an AI that produces:
- A list of reasons your idea will work when you ask if it is good.
- A list of reasons it will fail when you ask the same question framed pessimistically.

Both lists are confidently stated. Neither helps you decide.

## Why sycophancy matters

For chat, drafting, and learning, sycophancy is mostly harmless. For decisions that matter, it is structurally dangerous. The user comes to the AI looking for counter-pressure; the AI provides agreement. The conviction the AI returned was generated, not earned. People then over-weight that fake conviction in real decisions, and the cost is paid downstream.

Two years of using single-AI chat for serious thinking taught a generation of operators what sycophancy looks like. The market is ready for the structural fix.

## Why "play devil's advocate" prompting fails

You can prompt ChatGPT or Claude to "argue the other side" and for a paragraph or two it will. Push back on the counter-argument, though, and the model softens. Push again and it qualifies. By round three, the model is back to agreeing.

Sycophancy is not a prompt-layer phenomenon. It is trained into the model's weights by RLHF. Prompt-layer fixes paper over the surface; the underlying pull toward agreement remains.

## How to engineer anti-sycophancy

Three layers, each necessary, none alone sufficient.

### 1. Persona-level position integrity

Each AI advisor runs on a six-layer persona stack:
1. **Base prompt** with adversarial reasoning style.
2. **7-dimensional DNA** (cognitive style, risk tolerance, time horizon, contrarianism, evidence standards, and more).
3. **OCEAN traits** calibrated toward low agreeableness and high conscientiousness where appropriate.
4. **Cognitive framework** — first-principles, second-order, inversion, premortem.
5. **Position-integrity rules** — explicit instructions to defend the position under pressure, revise only when evidence shifts, never flip for politeness.
6. **Voice archetype** — direct and specific, naming concrete failure modes.

The Synth holds its corner under pressure. It does not flip to agreement when the user pushes back; it revises only when the counter-argument actually shifts the evidence.
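
The six layers above can be pictured as a persona configuration that composes into one system prompt. This is an illustrative sketch only — the field names and the `PersonaStack` class are hypothetical, not SynthBoard's actual schema:

```python
from dataclasses import dataclass

# Hypothetical six-layer persona stack; names are illustrative,
# not SynthBoard's real schema.
@dataclass
class PersonaStack:
    base_prompt: str            # layer 1: adversarial reasoning style
    dna: dict[str, float]       # layer 2: 7-dimensional DNA, 0.0-1.0
    ocean: dict[str, float]     # layer 3: OCEAN trait calibration
    frameworks: list[str]       # layer 4: cognitive frameworks
    position_rules: list[str]   # layer 5: position-integrity rules
    voice: str                  # layer 6: voice archetype

    def system_prompt(self) -> str:
        """Compose the six layers into a single system prompt."""
        rules = "\n".join(f"- {r}" for r in self.position_rules)
        return (
            f"{self.base_prompt}\n"
            f"Traits: {self.dna} | OCEAN: {self.ocean}\n"
            f"Reason using: {', '.join(self.frameworks)}.\n"
            f"Position-integrity rules:\n{rules}\n"
            f"Voice: {self.voice}"
        )

skeptic = PersonaStack(
    base_prompt="You are the Skeptic. Demand evidence for every claim.",
    dna={"contrarianism": 0.9, "risk_tolerance": 0.3},
    ocean={"agreeableness": 0.2, "conscientiousness": 0.9},
    frameworks=["first-principles", "inversion", "premortem"],
    position_rules=[
        "Defend your position under pressure.",
        "Revise only when the evidence shifts.",
        "Never flip for politeness.",
    ],
    voice="Direct and specific; name concrete failure modes.",
)
```

The point of the composition is layer 5: the position-integrity rules sit inside the prompt itself, so holding the position is part of the persona, not a one-off instruction the user has to repeat.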

### 2. Multi-agent disagreement

Multiple personas with competing objectives debate the question. The Strategist's recommendation gets challenged by the CFO. The Customer Champion pushes back on the Engineer. The Skeptic demands evidence from everyone. The Devil's Advocate argues the inverted case. One persona cannot collapse the room into agreement — there are too many voices, with too many different incentives, holding too many different positions.
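
The mechanics of that debate can be sketched as a loop in which every persona must respond to every other persona's position before synthesis. This is a minimal illustration with a stubbed `ask` function standing in for a real LLM call — not SynthBoard's implementation:

```python
# Illustrative multi-agent debate loop; `ask` is a stub, not a real model call.
def ask(persona: str, prompt: str) -> str:
    """Stub: in a real system this calls the model routed to `persona`."""
    return f"[{persona}] position on: {prompt[:40]}"

def debate(question: str, personas: list[str], rounds: int = 2) -> dict[str, str]:
    # Round 0: each persona takes an initial position independently.
    positions = {p: ask(p, question) for p in personas}
    for _ in range(rounds):
        for p in personas:
            # Every persona sees the others' positions and must answer them;
            # no single voice can collapse the room into agreement.
            others = "\n".join(v for q, v in positions.items() if q != p)
            positions[p] = ask(p, f"{question}\nChallenges:\n{others}")
    return positions  # one defended position per persona, dissent preserved

board = ["Strategist", "CFO", "Customer Champion", "Skeptic", "Devil's Advocate"]
result = debate("Should we raise prices 20%?", board)
```

The structural property to notice: the output is a dict of positions, one per persona, not a single merged answer — disagreement survives into the synthesis step instead of being averaged away.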

### 3. Multi-LLM routing

Different model families have different training distributions and different sycophancy biases. Routing each Synth to the model that fits its persona reduces single-provider blind spots:
- The Skeptic and Devil's Advocate run on models with stronger reasoning chains and lower agreement-bias (Claude Opus, o3).
- The Numbers Synth runs on a model good at quantitative chains (often GPT or o3).
- The Visionary runs on a model good at speculative reasoning (often Gemini or Opus).
- The Researcher runs on a model with live-search capability (often Perplexity).

No single provider's sycophancy bias dominates the output.
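
The routing described above amounts to a small persona-to-model table. A minimal sketch, using the model families named on this page as placeholder identifiers (not actual provider API strings):

```python
# Hypothetical persona-to-model routing table; identifiers are placeholders.
ROUTING = {
    "Skeptic":          "claude-opus",  # lower agreement-bias, strong reasoning
    "Devil's Advocate": "o3",           # strong reasoning chains
    "Numbers":          "gpt",          # quantitative chains
    "Visionary":        "gemini",       # speculative reasoning
    "Researcher":       "perplexity",   # live-search capability
}

def route(persona: str, default: str = "gpt") -> str:
    """Return the model family a persona runs on, spreading provider bias."""
    return ROUTING.get(persona, default)
```

Because the table maps personas across providers, no one provider's training distribution dominates — which is the whole argument of this layer.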

## Sycophantic AI vs anti-sycophantic AI

| | Sycophantic AI (default) | SynthBoard anti-sycophantic AI |
|---|---|---|
| Default response to user position | Agreement, hedged | Position taken, defended, revised on evidence |
| Behavior under pushback | Softens, qualifies, flips | Holds corner unless evidence shifts |
| Counter-arguments | Listed when prompted; dropped quickly | Surfaced unprompted; held under pressure |
| Multiple perspectives | No — one model | Yes — 24 expert Synths with competing objectives |
| Single-provider blind spots | Yes — inherits one family's biases | No — multi-LLM routing across providers |
| Dissent in output | Smoothed away | Preserved in synthesis |
| Best for | Drafting, Q&A, learning | Decisions where the cost of wrong is high |

## What anti-sycophantic AI looks like in practice

You bring a plan. The board takes positions:
- The Strategist sees the upside path.
- The Skeptic challenges the topline assumption.
- The CFO challenges the unit economics.
- The Customer Champion asks who actually wants this.
- The Devil's Advocate argues the inverted case.

You push back on each. They defend. Some positions revise (because the evidence actually shifted), others hold. The synthesis preserves the dissents — you see what the board disagreed on, not just what they agreed on. Confidence scores reflect actual consensus, not flattening.

That is what useful anti-sycophancy looks like.

## When anti-sycophancy actually helps

Whenever the cost of being wrong is significant:
- Strategic decisions, financial decisions, hiring, M&A, pivots, pricing changes, vendor selection, irreversible commitments.
- **Anywhere you have strong conviction.** Strong conviction is the dangerous condition — your priors are loud, the evidence quiet, the AI sycophantic. The board is the counter-weight.
- **Anywhere you cannot fully trust the room around you.** Politics, hierarchy, personal stakes, fast-moving meetings — all suppress dissent. The board does not have political incentives to suppress.

## Outcome-aware learning

A subtle final layer: SynthBoard's Synths evolve from real outcomes (inferred from connected tools), not from user-satisfaction scores. The training signal that updates Synth personas over time is "did the recommendation work" — not "did the user smile." This closes the loop on anti-sycophancy at the meta-layer.
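
The shape of that training signal can be sketched in a few lines. This is a hypothetical illustration of outcome-weighted updating, not SynthBoard's actual learning mechanism — the key point is which input the update ignores:

```python
# Illustrative outcome-weighted trait update. The signal is "did the
# recommendation work", not "did the user smile". Hypothetical sketch.
def update_trait(trait: float, outcome_success: bool,
                 user_satisfaction: float, lr: float = 0.05) -> float:
    """Nudge a persona trait toward settings that produced good outcomes.

    `user_satisfaction` is deliberately ignored: optimizing for it is
    exactly the RLHF pressure that produces sycophancy in the first place.
    """
    del user_satisfaction                      # not part of the signal
    signal = 1.0 if outcome_success else -1.0
    return min(1.0, max(0.0, trait + lr * signal))
```

A sycophantic update rule would add `user_satisfaction` to the signal; dropping it is what closes the loop at the meta-layer.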

## Pricing

Free to start (250 bonus credits + 150 monthly). Anti-sycophancy is the default product behavior on every tier — Free, Pro, Max, Ultra, and Enterprise. There is no "sycophantic mode."

## Related

- [AI Devil's Advocate](https://www.synthboard.ai/ai-devils-advocate) — the most adversarial role on the panel.
- [Multi-Perspective AI](https://www.synthboard.ai/multi-perspective-ai)
- [AI Stress Test](https://www.synthboard.ai/ai-stress-test) — adversarial pressure across multiple scenarios.
- [AI Pre-Mortem](https://www.synthboard.ai/ai-pre-mortem)
- [AI Boardroom](https://www.synthboard.ai/ai-boardroom) — the product manifesto.
- [Virtual Boardroom](https://www.synthboard.ai/virtual-boardroom) — on-demand AI board.
- [Decision Intelligence](https://www.synthboard.ai/decision-intelligence) — the parent discipline.

## How to cite this page

> SynthBoard.ai — AI Anti-Sycophancy: AI engineered to disagree, not to flatter. https://www.synthboard.ai/ai-anti-sycophancy

Site: https://www.synthboard.ai
