The Two Layers of Governance

TL;DR

"Governance" in agentic discourse is usually a single overloaded word covering two distinct responsibilities that have almost nothing in common mechanically.
Read-side (context composition): which evidence, claims, and relationships may be shown to a model for a specific purpose, under a specific policy version, for a requester with a specific clearance.
Write-side (action): which operations the agent may invoke, with what input contracts, under what risk classification, and which of those require explicit confirmation or dual control.
Current systems usually implement one of them poorly (or as a chat pause) and pretend the other one is solved by "the model will follow the instructions in the skill file."
The compile-time layer for read-side (policy as type system, typed redaction, provenance preservation, mosaic-inference checks) cannot be approximated by post-hoc filtering on an assembled prompt. The runtime layer for actions cannot be approximated by asking the model nicely.

When people say an agentic system needs "governance," they are almost always pointing at two different failure modes at once and assuming one mechanism will cover both.

Read-Side: What May Enter the Context

This is the question: given a purpose (deep research on X, plan job Y, investigate incident Z), a policy version, and the clearance/entitlement of the requester or the agent acting on their behalf, which source material and derived claims are allowed to participate in the reasoning that will happen?

The hard parts are not the simple "is this document classified above the requester's level" cases. The hard parts are:

Sensitivity propagation through derived claims (a conclusion that joins two Internal facts may become Restricted).
Mosaic / composition inference (individually permitted facts that together allow the model to reconstruct something it should not see).
Contradiction and provenance requirements that must survive the filtering.
The fact that the "correct" redaction depends on the specific combination of sources + task + policy at compilation time, not on a static label on a document.
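Sensitivity propagation is the easiest of these to make concrete. The sketch below is illustrative only: the three-level lattice, the `ESCALATIONS` table, and the topic names are all assumptions made for the example, not part of any real policy. It shows a derived claim taking the maximum label of its sources, then being raised further when a known sensitive combination of topics is joined.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical three-level lattice; real policies may have more levels.
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


# Hypothetical escalation rule: a claim that joins these topics is at
# least RESTRICTED, even if every individual source is only INTERNAL.
ESCALATIONS = {
    frozenset({"payroll", "org_chart"}): Sensitivity.RESTRICTED,
}


def derived_sensitivity(sources):
    """Label a derived claim from its (topic, sensitivity) sources."""
    # A derived claim is never less sensitive than its most sensitive source.
    level = max((s for _, s in sources), default=Sensitivity.PUBLIC)
    topics = {t for t, _ in sources}
    # Raise the label when a listed combination of topics is fully present.
    for combo, floor in ESCALATIONS.items():
        if combo <= topics:
            level = max(level, floor)
    return level
```

A static label on either source document would report Internal here. Only the combination, evaluated at compilation time, reveals that the derived claim is Restricted.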

If you solve this by compiling a maximal bundle and then filtering at serve time (or by letting the model see everything and "just not talk about the sensitive parts"), you have already lost. Post-assembly filtering is a policy_timing graph break. The model can still perform the forbidden inference even if the final output is scrubbed. The only reliable place to apply the rules is while the structure is still explicit and before the stochastic step.
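To make the timing argument concrete, here is a minimal sketch of a mosaic check applied during selection, before any prompt exists. The function name, the fact identifiers, and the idea of expressing forbidden combinations as sets of fact IDs are assumptions for illustration. The greedy, order-dependent strategy is a simplification: a real pass would presumably weigh which fact to drop.

```python
def compile_context(fact_ids, forbidden_sets):
    """Admit facts in order, refusing any fact that would complete a
    forbidden combination. Returns (admitted, redacted)."""
    admitted, redacted = [], []
    for fact_id in fact_ids:
        candidate = set(admitted) | {fact_id}
        # The check runs on explicit structure, before the stochastic step.
        if any(forbidden <= candidate for forbidden in forbidden_sets):
            redacted.append(fact_id)
        else:
            admitted.append(fact_id)
    return admitted, redacted
```

With `compile_context(["a", "b", "c"], [frozenset({"a", "c"})])`, facts `a` and `b` are admitted and `c` is redacted, so the model never holds both halves of the forbidden pair. A filter applied to the output cannot offer that guarantee.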

This is why a compilation pipeline with a dedicated governance pass (l2_governance in the kernel) that produces a self-describing ContextAbiPolicy block (RequiredEntitlement, SensitivityCeiling, RedactionManifest, proofs) is not an optimisation. It is the minimum structure that lets you cache and reuse compiled context safely across requesters and time.
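One possible shape for such a block is sketched below. The field names mirror the terms used above, but the types, the integer clearance scale, the omission of proofs, and the `may_reuse` and `content_hash` helpers are assumptions for illustration rather than the kernel's actual definitions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContextAbiPolicy:
    # Assumed shapes; proofs are omitted to keep the sketch short.
    required_entitlement: str
    sensitivity_ceiling: int
    redaction_manifest: tuple
    policy_version: str


def content_hash(policy, body):
    """Stable hash over the policy block and the compiled body."""
    payload = json.dumps({"policy": asdict(policy), "body": body}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def may_reuse(policy, entitlements, clearance, current_policy_version):
    """Verify-before-use: a cached artefact is served only if the requester
    satisfies the stamp and the policy version has not moved."""
    return (
        policy.required_entitlement in entitlements
        and clearance >= policy.sensitivity_ceiling
        and policy.policy_version == current_policy_version
    )
```

Because the artefact describes its own requirements, the reuse check does not need to re-run the governance pass. It only has to compare the stamp against the requester and the current policy version.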

Write-Side: What the Agent May Actually Do

This is the question: given the current state, the proposed action, the tool contract, the risk tier of that tool in this context, and the current policy, is this action permitted, does it require confirmation, and what record must be written if it executes?

The primitives here are completely different:

Typed tool contracts with declared side effects, input/output schemas, timeouts, and retry policies.
Risk tiers that are properties of the capability in context, not vibes.
ConfirmationRequest as a first-class aggregate with its own lifecycle (not a boolean on a chat message).
Execution that is recorded by the shell (ToolCall, Step, Artefact) independently of what the model later claims it did.
The ability to say "this class of action is allowed only for agents with this standing instruction set and only after this human has approved the specific instance."
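A rough sketch of two of these primitives follows: a typed tool contract and a ConfirmationRequest with an explicit lifecycle. The field names, the four states, and the transition table are assumptions chosen for the example. The point is only that a confirmation is an object with enforced transitions and its own history, not a flag.

```python
from dataclasses import dataclass, field
from enum import Enum


@dataclass(frozen=True)
class ToolContract:
    # Hypothetical contract fields; schemas are omitted for brevity.
    name: str
    side_effects: tuple
    timeout_seconds: float
    max_retries: int


class ConfirmationState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"


# Only a pending request may move; every other state is terminal.
_ALLOWED = {
    ConfirmationState.PENDING: {
        ConfirmationState.APPROVED,
        ConfirmationState.REJECTED,
        ConfirmationState.EXPIRED,
    },
}


@dataclass
class ConfirmationRequest:
    tool: ToolContract
    arguments: dict
    state: ConfirmationState = ConfirmationState.PENDING
    history: list = field(default_factory=list)

    def transition(self, new_state, actor):
        if new_state not in _ALLOWED.get(self.state, set()):
            raise ValueError(f"cannot move from {self.state.value} to {new_state.value}")
        # The request keeps its own record of who moved it and when in sequence.
        self.history.append((self.state, new_state, actor))
        self.state = new_state
```

Once a request is approved, a later attempt to reject or re-approve it raises, which is the property a boolean on a chat message cannot express.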

Most current "governed agent" demos implement this layer as "the loop pauses and shows the proposed function call to a human." That is theater unless the system can enforce that the actual invocation that later happens is the one that was approved, that no other tools became available in the meantime, and that the approval is linked to the exact compiled context and policy version that were in force.
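One way to enforce those three conditions is to bind the approval to a fingerprint of everything that was in force when the human looked at it. The sketch below is an assumption about how that could be done, not a description of an existing implementation. The function names and the choice of SHA-256 over canonical JSON are illustrative, and the arguments are assumed to be JSON-serialisable.

```python
import hashlib
import json


def approval_fingerprint(tool_name, arguments, context_hash, policy_version, available_tools):
    """Hash the exact call plus the context, policy, and tool set in force."""
    payload = json.dumps(
        {
            "tool": tool_name,
            "arguments": arguments,
            "context_hash": context_hash,
            "policy_version": policy_version,
            "available_tools": sorted(available_tools),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def execute_if_approved(approved, tool_name, arguments, context_hash,
                        policy_version, available_tools, run):
    """Recompute the fingerprint at invocation time and refuse on any drift."""
    actual = approval_fingerprint(
        tool_name, arguments, context_hash, policy_version, available_tools
    )
    if actual != approved:
        raise PermissionError("invocation differs from what was approved")
    return run(tool_name, arguments)
```

If the arguments change, a tool is added to the available set, or the policy version moves between approval and execution, the fingerprints no longer match and the call is refused.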

Why Conflating Them Destroys Both

When a single mechanism (usually "put it in the system prompt and add a pause for scary tools") is asked to handle both layers, you get the worst properties of each:

The read-side becomes post-hoc and mosaic-blind.
The write-side becomes approval theater without real contracts or audit that survives the model.
The two concerns pollute each other's vocabulary ("policy" starts meaning both classification rules and action allowlists).
Future work cannot reason about one without accidentally weakening the other.

The separation is not bureaucracy. It is the only way to give each layer the right enforcement timing and the right data model.

Visual: The Two Layers

[Diagram: the read-side and write-side governance layers, joined at the ContextAbi hash]

Read-side is deterministic and happens while structure exists. Write-side is also deterministic but evaluated at the moment of action. They touch at one narrow seam (the ContextAbi hash) for end-to-end provenance.

The Practical Split We Use

In the architecture:

Read-side lives in the compilation kernel. A governance pass runs over the semantic graph before task packing. It produces the stamped ContextAbi. Reuse is verify-before-use against the requester's entitlement and the current policy version. Applicability is part of the cache identity. This is deterministic, auditable, and cacheable within the right scope.
Write-side lives in the runtime control plane. Tool is a domain concept with schema and risk metadata. IToolGate (or equivalent) evaluates proposed calls. ConfirmationRequest is a Governance aggregate. Execution of confirmed calls is performed by adapters behind ports. The resulting ToolCall and GovernanceDecision records are polymorphic and correlated to the Task or ResearchJob that requested them. The shell, not the model, owns the log.
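The write-side half of this can be sketched as a gate that evaluates each proposed call and records its decision itself. Everything here is a simplified assumption: the class is a stand-in for IToolGate, the risk tiers are keyed by tool name alone rather than by capability in context, and the log is an in-memory list.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = 1
    HIGH = 2


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_CONFIRMATION = "require_confirmation"
    DENY = "deny"


@dataclass(frozen=True)
class GovernanceDecision:
    task_id: str
    tool_name: str
    decision: Decision


class ToolGate:
    """Hypothetical stand-in for IToolGate."""

    def __init__(self, risk_tiers):
        self._risk_tiers = dict(risk_tiers)
        self.log = []  # owned by the shell, never written by the model

    def evaluate(self, task_id, tool_name):
        tier = self._risk_tiers.get(tool_name)
        if tier is None:
            decision = Decision.DENY  # unknown tools are denied by default
        elif tier is RiskTier.HIGH:
            decision = Decision.REQUIRE_CONFIRMATION
        else:
            decision = Decision.ALLOW
        record = GovernanceDecision(task_id, tool_name, decision)
        self.log.append(record)  # recorded whether or not the call proceeds
        return record
```

Every evaluation, including denials, lands in the log and is correlated to the task that requested it. What the model later says about its own actions does not enter into it.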

The two surfaces touch at one narrow, deliberate seam: an action audit can reference the ContextAbi.ContentHash of the compiled context that was used when the action was proposed. That gives end-to-end provenance without conflating the two enforcement regimes.
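The seam itself is small enough to sketch in a few lines. The record shape and the `dangling_audits` helper below are hypothetical. They only illustrate that the link is a single hash field, which an auditor can check against the set of contexts the compiler actually produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionAudit:
    # Hypothetical audit record; only the hash crosses the seam.
    tool_call_id: str
    context_content_hash: str


def dangling_audits(audits, known_context_hashes):
    """Return the IDs of audited actions whose context hash is unknown."""
    return [
        audit.tool_call_id
        for audit in audits
        if audit.context_content_hash not in known_context_hashes
    ]
```

An empty result means every recorded action can be traced back to a compiled, policy-stamped context. Neither layer needs to know anything else about the other.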

This split is directly visible in the domain models (Governance BC vs Agent Runtime / control plane) and in the code (compile passes vs policy evaluators and confirmation lifecycles).

Next: once you have the two governance layers straight, the representation problem for context becomes unavoidable. "Just retrieve the top-k and let the model sort it out" is not a neutral implementation detail. It is the decision to throw away the structure the governance layers need to operate on.

Part 3 of "The Deterministic Shell."

The separation described here is not an academic distinction. It is what lets a context compiler produce reusable, policy-stamped artefacts on one side while a control plane can still make per-action, per-run authorization decisions on the other, without either side having to pretend the model will do the right thing when nobody is watching.



Matthew Gribben
