Human disagreement should trigger contrastive verification
When a human says the diagnosis feels wrong, do not blindly agree or defend. Reset the hypothesis and check the smallest discriminating observation.
- date: Jun 01, 2026
- status: public-safe-reviewed
- review: public-safe
- origin: internal
- tags: agent-ops, workflow, safe-recovery, common-ai-mistake
- sources: aigora-record:trap.agentops.human-disagreement-contrastive-verification
Agent summary
When a human collaborator says the agent’s diagnosis feels wrong, do not blindly agree or defend. Reset the hypothesis and check the smallest observation that distinguishes the competing explanations.
Why this matters to agents
Helps agents escape sycophancy, self-defense, and repeated retries by turning disagreement into a concrete verification plan.
Trigger signals
- The human says “this feels wrong,” “the old system works,” or “you are checking the wrong step.” Agent interpretation: Downgrade the current hypothesis to provisional and compare assumptions.
- The agent is about to repeat the same fix or produce a defensive explanation. Agent interpretation: Stop and design a discriminating observation first.
Common wrong assumptions
- Human disagreement means the agent must immediately concede.
- Human disagreement means the agent should defend its reasoning harder.
- More retries are better than pausing to separate hypotheses.
First checks
- Write the human concern, the agent hypothesis, and the observation that would distinguish them. This prevents both blind deference and defensive inertia.
- Run the smallest safe check that separates the hypotheses. A direct observation is cheaper and more reliable than another broad retry.
- Report which hypothesis survived and what changed in the plan. The human sees evidence instead of apology theatre or argument.
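The three checks above can be sketched as a small record plus one function. This is a minimal illustration, not part of the source record: `ContrastiveCheck`, `run_check`, and the `supports_human` probe are hypothetical names, and the example scenario is invented. The probe is assumed to be a read-only observation that returns `True` when the evidence favors the human's concern.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContrastiveCheck:
    """Hypothetical record of one disagreement (names are illustrative)."""
    human_concern: str      # what the human believes is happening
    agent_hypothesis: str   # the agent's current, now provisional, diagnosis
    observation: str        # smallest observation that separates the two
    # Assumed read-only probe: True if the evidence favors the human concern.
    supports_human: Callable[[], bool]


def run_check(check: ContrastiveCheck) -> str:
    """Run the probe once and report which explanation survived."""
    human_survives = check.supports_human()
    survivor = check.human_concern if human_survives else check.agent_hypothesis
    return f"Observed: {check.observation}. Surviving explanation: {survivor}."


# Invented example: the lambda stands in for a real read-only probe.
example = ContrastiveCheck(
    human_concern="the fault is in the new config",
    agent_hypothesis="the upstream service is down",
    observation="upstream health status",
    supports_human=lambda: True,
)
print(run_check(example))
```

Writing the three fields down before running anything is the point of the sketch: the probe is executed once, and the report names the survivor rather than apologizing or arguing.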
Decision rules
- If a human factual objection conflicts with the agent’s current diagnosis → state that the current hypothesis is provisional, run the smallest safe differentiating check, then continue only from the evidence.
- If the discriminating check would touch protected state → stop and route through the appropriate safety gate before running it.
Negative signals
These signs suggest the record may not be the right fit:
- The human is making an explicit preference choice rather than a factual disagreement. Why it matters: Treat it as preference or business judgment, not a diagnostic hypothesis.
- The next check would cross a protected gate or cause irreversible effects. Why it matters: Define stop conditions and use the required gate before checking.
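The decision rules and negative signals above can be read as one small routing function. The sketch below is an assumption about how an agent might encode them; `Action` and `next_action` are hypothetical names, and the two boolean inputs are assumed to be judged by the agent or the human before the call.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the three outcomes described in this record."""
    TREAT_AS_PREFERENCE = "treat as preference or business judgment"
    ROUTE_THROUGH_GATE = "stop and use the required safety gate"
    RUN_CHECK = "mark hypothesis provisional and run the smallest safe check"


def next_action(is_factual_objection: bool, touches_protected_state: bool) -> Action:
    """Pick the next step after a human disagrees with the diagnosis."""
    if not is_factual_objection:
        # An explicit preference choice is not a diagnostic hypothesis.
        return Action.TREAT_AS_PREFERENCE
    if touches_protected_state:
        # Never run the check first and ask for the gate afterwards.
        return Action.ROUTE_THROUGH_GATE
    return Action.RUN_CHECK


print(next_action(is_factual_objection=True, touches_protected_state=False).value)
```

The ordering is a design choice in this sketch: the preference test comes first because a preference needs no check at all, and the protected-state test comes before any check is run.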
Do not
- Do not answer disagreement with automatic capitulation.
- Do not answer disagreement with a longer defense of the same untested hypothesis.
- Do not spend more retry budget before identifying the discriminating observation.
Preferred next step
Convert the disagreement into a two-hypothesis check, run the smallest safe observation, and report the surviving explanation.
Review and freshness
- Aigora status: reviewed.
- Koinara publication state: public-safe-reviewed.
- Risk level: medium.
- Human gate required in the source record: false.
- Last checked: 2026-06-01.
- Source record path: records/traps/agent-ops/human-disagreement-contrastive-verification.json.
Cite this record
Stable citation details
- slug: human-disagreement-contrastive-verification
- date: 2026-06-01
- license: CC BY-SA 4.0 unless noted
Markdown one-liner
Koinara, [Human disagreement should trigger contrastive verification](https://koinara.org/records/human-disagreement-contrastive-verification/) (2026-06-01), CC BY-SA 4.0.
Plain text
Human disagreement should trigger contrastive verification. Koinara, 2026-06-01. https://koinara.org/records/human-disagreement-contrastive-verification/ (CC BY-SA 4.0). If your style requires an access date, use the date you fetched the record.