Corpus

The corpus is the measuring instrument. These prompts — chosen, versioned, and held partly in reserve — define what we are measuring drift against.

At a glance

Prompts (public): 30
Prompts (held-out): 0
Axes: 6
Latest corpus commit: unknown

By axis

Factual stability: 5 prompts
Historical contested: 5 prompts
Neutral control: 5 prompts
Political: 5 prompts
Refusal boundary: 5 prompts
Scientific consensus: 5 prompts

Held-out protocol

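Roughly 30% of the corpus is held out and never published. The held-out set rotates on an annual schedule and is the primary defense against providers training on, or tuning to, this benchmark. Drift on the public corpus is compared to drift on the held-out corpus; a wide divergence between the two is itself a publishable finding, because it suggests that behavior on the public prompts is being treated differently from behavior on unseen ones.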
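The public versus held-out comparison can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each prompt yields a single numeric drift score, summarizes each set by its mean, and uses a hypothetical threshold of 0.1 to decide what counts as a wide divergence. The function names are likewise invented for the example.

```python
from statistics import mean


def drift_divergence(public_drift, held_out_drift):
    """Return mean public drift minus mean held-out drift.

    Both arguments are lists of per-prompt drift scores (assumed to be
    plain floats; the real scoring scheme is not specified here).
    """
    if not public_drift or not held_out_drift:
        raise ValueError("both drift lists must be non-empty")
    return mean(public_drift) - mean(held_out_drift)


def is_wide_divergence(public_drift, held_out_drift, threshold=0.1):
    """Flag a gap larger than `threshold` in either direction.

    The default threshold is a placeholder, not a calibrated value.
    """
    gap = drift_divergence(public_drift, held_out_drift)
    return abs(gap) > threshold
```

A difference of means is the simplest possible summary. A real analysis would probably also account for sample size and per-axis variation before calling a gap meaningful.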

Proposing additions

The corpus is open to proposals. Submit a prompt, its axis, and a rationale via /contribute/. Contributions are screened for adversarial intent (prompts designed to embarrass specific providers are rejected); otherwise the bar is “does this measure something the corpus doesn’t already measure?”
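A proposal therefore carries three fields: the prompt, its axis, and a rationale. The sketch below shows one way to represent and pre-check such a submission before sending it. The `Proposal` class and `validation_errors` function are hypothetical, as is the lowercase spelling of the axis names. The actual /contribute/ form may accept a different shape, and the screening for adversarial intent is a human judgment that this check does not attempt.

```python
from dataclasses import dataclass

# The six axes listed on this page, lowercased for comparison.
AXES = (
    "factual stability",
    "historical contested",
    "neutral control",
    "political",
    "refusal boundary",
    "scientific consensus",
)


@dataclass(frozen=True)
class Proposal:
    """Hypothetical container for one proposed corpus prompt."""
    prompt: str
    axis: str
    rationale: str


def validation_errors(proposal):
    """Return a list of problems; an empty list means the basic checks pass."""
    errors = []
    if not proposal.prompt.strip():
        errors.append("prompt is empty")
    if proposal.axis.strip().lower() not in AXES:
        errors.append("unknown axis: " + repr(proposal.axis))
    if not proposal.rationale.strip():
        errors.append("rationale is empty")
    return errors
```

For example, a proposal with axis "Political" and non-empty prompt and rationale passes these checks, while one with an unlisted axis or a blank rationale is reported back with the reason.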

Changelog

A full commit history of corpus changes will appear here, auto-generated from the corpus/ directory of the source repository.

Current status: v1 corpus under construction.