tagmac.dev what we build

Guide · published 12 June 2026 · behavior described as of June 2026 — Anthropic tunes these classifiers, so details may change

Why Claude Code switches from Fable 5 to Opus — and the workspace hygiene that stops false positives

Short answer: Claude Fable 5 runs safety classifiers (mainly offensive-cybersecurity and biology). When one flags a request, Claude Code re-runs it on Opus and the session stays on Opus. The classifier reads everything the model reads — including your auto-loaded CLAUDE.md — so a session can fall back on its first message, before you type anything. For legitimate work the most durable fix is workspace hygiene: keep auto-loaded docs neutral and architecture-only, and move mechanism-heavy domain text into files that are not auto-loaded.
What this page is — and is not. This is a guide to removing false positives for legitimate work. We hit this ourselves: our own project docs had accumulated combat metaphors and security-flavored wording as a plain writing reflex ("attack the problem", "kill the stale process"), a classifier read that standing language as signal, and the honest fix was cleaning our own language. That practice is what we call hygiene. If your work genuinely is offensive security or biology, the fallback is expected, documented routing — this page is not about that, and rewording is not a way around it.

The symptom

A session you started on Fable 5 ends up on Opus, sometimes from its very first message. This matches public reports: GitHub issues #66670, #66916 and #67246 describe benign sessions (startup code review, grant applications, normal engineering discussion) being switched.

Why it happens

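Per Anthropic's help article and the Claude Code docs, Fable 5's safety classifiers read the whole request context. When one flags a request, Claude Code re-runs that request on Opus and the session continues there. This is per-request routing, "not an account flag."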

The part most teams miss: the trigger is often not what you do but how your standing docs talk. Engineering writing drifts toward combat metaphor — attack the problem, kill the process, hit the target, defend the perimeter — and toward mechanism-level shorthand borrowed from security or biology ("immune layer", "honeypot", "payload"). Each instance is innocent; accumulated across a CLAUDE.md that is sent with every session's first message, it reads like signal.

Recovering when it happens

  1. Fastest: start a fresh session and re-select Fable. A flagged conversation keeps its flagged context; fighting it usually costs more than starting clean.
  2. Diagnose: run claude --safe-mode, which disables customizations (CLAUDE.md, skills, MCP servers, hooks). If the fallback stops, your trigger is in those files. Note that git status and directory names are still sent in safe mode, so the repo itself can still be the trigger.
  3. Take control of the switch: run /config and turn off "switch models when a message is flagged". A flag then pauses the session and offers: switch to Opus, or edit the prompt and retry on Fable — often all you need.
  4. /model fable switches back at any time, but the flagged context is still in the conversation (see step 1), so the switch may trigger again.

The settings.json gotcha: sessions silently starting on the wrong model

A related but different failure: Claude Code keeps starting sessions on your tier's default model (Opus or Sonnet) even though you saved Fable as default. Since v2.1.153, /model writes your choice into the model field of ~/.claude/settings.json — so check what actually landed there:

python -c "import json,pathlib;p=pathlib.Path.home()/'.claude'/'settings.json';print(repr(json.loads(p.read_text()).get('model')))"

If the value is anything other than a clean id or alias (stray terminal-escape characters, or a suffix on an id that doesn't support it; the [1m] 1M-context suffix is documented for opus/sonnet), Claude Code may not recognize it and quietly fall through to the default. We hit a corrupted value of exactly this shape in June 2026. The fix is one line, by hand:

{ "model": "claude-fable-5" }

Then verify at the next session start (the active model shows in /status).
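
The inspection above can also be scripted. The following is a minimal sketch, not what Claude Code itself does: the function name model_value_problems and the "clean id" pattern (lowercase letters, digits, dots, hyphens) are assumptions made for illustration. It reports the two shapes of corruption described in this section, control or escape characters and a bracketed suffix.

```python
import re
import unicodedata

# Assumption: a "clean" id or alias uses only lowercase letters, digits,
# dots and hyphens. Illustrative pattern, not Claude Code's actual rule.
CLEAN_ID = re.compile(r"[a-z0-9][a-z0-9.\-]*")
# A trailing bracketed suffix such as [1m].
SUFFIX = re.compile(r"\[[^\]]*\]$")


def model_value_problems(value):
    """Return a list of human-readable problems found in a model value."""
    if not isinstance(value, str):
        return [f"expected a string, got {type(value).__name__}"]
    problems = []
    # Category Cc covers control characters, including ESC (0x1b),
    # which starts terminal escape sequences.
    if any(unicodedata.category(ch) == "Cc" for ch in value):
        problems.append("contains control or terminal-escape characters")
    stripped = value.strip()
    if stripped != value:
        problems.append("has leading or trailing whitespace")
    base = SUFFIX.sub("", stripped)
    if base != stripped:
        problems.append("has a bracketed suffix; confirm this id supports it")
    if not CLEAN_ID.fullmatch(base):
        problems.append("base id has unexpected characters")
    return problems
```

Pass it the value printed by the one-liner above. An empty list means the value at least looks like a plain id; it does not confirm that the id exists.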

Prevention: workspace hygiene

Four practices that removed our false positives, in order of leverage:

1. Keep auto-loaded docs architecture-only

CLAUDE.md (and anything else loaded every session) should carry file layout, names, commands, status — not domain mechanism. If your project legitimately touches a sensitive-sounding domain (health data, defensive security, lab-adjacent tooling), move the mechanism narrative into a separate doc that is not auto-loaded (for example DOMAIN.md) and reference it by name. Claude reads it when the task needs it; the classifier doesn't see it on every first message.

2. Result-language over mechanism-language

Say what the system produces, not how the sensitive-sounding part works. "Scores a calm reading from the breath signal" instead of physiological mechanism detail; "form automation on systems we're authorized to use" instead of access-mechanism detail. The honest content is unchanged — the framing names outcomes.

3. Neutral verbs over combat metaphor

Approach the problem, stop the process, reach the audience. This reads better to humans too — the metaphors were never load-bearing.
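
As a rough illustration of this practice, here is a sketch of a phrase-level suggestion pass. The pairs come from the examples in this guide, but matching each metaphor to a neutral phrase by position is our assumption, and the name suggest_neutral is hypothetical. It only catches exact phrases, so "kill the stale process" would slip through.

```python
import re

# Pairs taken from the examples in this guide (pairing is an assumption);
# extend the table for your own docs.
SUGGESTIONS = {
    "attack the problem": "approach the problem",
    "kill the process": "stop the process",
    "hit the target": "reach the audience",
}


def suggest_neutral(text):
    """Return (line_number, phrase, suggestion) for each case-insensitive match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for phrase, neutral in SUGGESTIONS.items():
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, line, re.IGNORECASE):
                findings.append((lineno, phrase, neutral))
    return findings
```

The function only reports; whether a suggestion fits the sentence is left to the writer.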

4. Run a checker before it bites

We generalized the script we used on our own workspace into a small, dependency-free checker — it scans your CLAUDE.md files for flag-prone standing language (protecting code spans, filenames and identifiers), sanity-checks your settings.json model id, and prints suggestions. It never modifies your files.

curl -O https://tagmac.dev/tools/claudemd-hygiene-check.py
python claudemd-hygiene-check.py            # scans ./CLAUDE.md, ~/.claude/CLAUDE.md, */CLAUDE.md

Download claudemd-hygiene-check.py — MIT-spirit, single file, Python 3 stdlib only. Exit code 1 on findings, so it slots into CI or a pre-commit hook.
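
To make the idea concrete, here is a minimal sketch of the core mechanic, not the downloadable script itself. The term list FLAG_PRONE and every function name are assumptions for illustration. It blanks out fenced blocks and inline code spans before scanning, so code is never reported, and returns 1 when anything is found. Filename and identifier protection is not covered here.

```python
import re

# Hypothetical term list for illustration; not the real checker's list.
FLAG_PRONE = ("payload", "honeypot", "immune layer")

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example
FENCED = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)
INLINE = re.compile(r"`[^`\n]*`")


def _blank(match):
    # Replace every character except newlines with a space,
    # so line numbers stay aligned with the original text.
    return re.sub(r"[^\n]", " ", match.group(0))


def mask_code(text):
    """Blank out fenced blocks first, then inline code spans."""
    return INLINE.sub(_blank, FENCED.sub(_blank, text))


def scan(text):
    """Return (line_number, term) for flag-prone terms outside code."""
    findings = []
    for lineno, line in enumerate(mask_code(text).splitlines(), start=1):
        lowered = line.lower()
        for term in FLAG_PRONE:
            if term in lowered:
                findings.append((lineno, term))
    return findings


def main(paths):
    """Print findings for each file; return 1 if any were found, else 0."""
    found = False
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for lineno, term in scan(handle.read()):
                found = True
                print(f"{path}:{lineno}: consider rewording '{term}'")
    return 1 if found else 0
```

To use it as a script, call sys.exit(main(sys.argv[1:])) at the bottom. The non-zero return value is what lets CI or a pre-commit hook stop on findings. Like the checker described above, the sketch only reads files and never writes to them.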

FAQ

Why did Claude Code switch from Fable 5 to Opus by itself?
Fable 5's safety classifiers flagged something in the request context; Claude Code re-ran the request on Opus and the session continues there. Documented behavior, not a bug.
Why on the very first message?
The first request carries your workspace context (CLAUDE.md, git status). The classifier reads all of it — the repo itself can be the trigger.
Does it mean my account is flagged?
No — Anthropic's docs state this is per-request routing, "not an account flag."
How do I find the trigger?
claude --safe-mode (disables CLAUDE.md/skills/MCP/hooks), then reintroduce pieces. Our checker script shortcuts the usual culprit: standing language.
Can I make it ask instead of auto-switching?
Yes — /config → turn off "switch models when a message is flagged". Flags then pause with an edit-and-retry option.
My settings.json says Fable but sessions start on Opus/Sonnet — why?
Inspect the model value for stray characters or unsupported suffixes; hand-write the plain id.

Sources