Compositional Closure: The Property That Makes Trust Guarantees Hold Forever
Here is the most important property in CCF's mathematical foundation. It fits in one line:
If M_1 and M_2 are doubly stochastic, then M_1 * M_2 is doubly stochastic.
The set of doubly stochastic matrices is closed under multiplication. Apply a valid trust transfer, then another, then another. A million times. The result is still a valid trust transfer. The safety guarantee at tick one is identical to the safety guarantee at tick ten million.
This property is called compositional closure, and it is why CCF's trust guarantees do not degrade over time.
The One-Line Proof
Let M_1 and M_2 be n-by-n doubly stochastic matrices. We need to show M_1 M_2 is doubly stochastic: non-negative, row sums equal 1, column sums equal 1.
Non-negativity. The product of non-negative matrices is non-negative. Each entry of M_1 M_2 is a sum of products of non-negative terms.
Row sums. Let 1 denote the column vector of all ones.
(M_1 M_2) 1 = M_1 (M_2 1) = M_1 1 = 1
M_2 is doubly stochastic, so M_2 * 1 = 1 (row sums of M_2 are 1). Then M_1 * 1 = 1 (row sums of M_1 are 1). Therefore the row sums of the product are 1.
Column sums. Let 1^T denote the row vector of all ones.
1^T (M_1 M_2) = (1^T M_1) M_2 = 1^T M_2 = 1^T
M_1 is doubly stochastic, so 1^T M_1 = 1^T (column sums of M_1 are 1). Then 1^T M_2 = 1^T (column sums of M_2 are 1). Therefore the column sums of the product are 1.
QED. Three one-line steps. The proof is so short because it is an algebraic identity -- it follows directly from the definition of matrix multiplication and the doubly stochastic constraint.
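The closure property is easy to check numerically. The following is a minimal sketch in plain Python (no dependencies); the two matrices are illustrative, not taken from any CCF deployment.

```python
# Numerical sanity check for closure: multiply two doubly stochastic
# matrices and confirm the product still has unit row and column sums.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_doubly_stochastic(m, tol=1e-12):
    """Check non-negativity and unit row/column sums within tolerance."""
    n = len(m)
    if any(x < -tol for row in m for x in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) < tol for row in m)
    cols_ok = all(abs(sum(m[i][j] for i in range(n)) - 1.0) < tol
                  for j in range(n))
    return rows_ok and cols_ok

m1 = [[0.5, 0.3, 0.2],
      [0.2, 0.5, 0.3],
      [0.3, 0.2, 0.5]]
m2 = [[0.6, 0.2, 0.2],
      [0.2, 0.6, 0.2],
      [0.2, 0.2, 0.6]]

product = matmul(m1, m2)
assert is_doubly_stochastic(m1) and is_doubly_stochastic(m2)
assert is_doubly_stochastic(product)  # closure: the product is still valid
```

The same check passes for any chain of products, which is exactly what the proof guarantees.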
Why This Is Remarkable
Most safety properties degrade over time. Here is an incomplete list of systems where repeated application erodes guarantees.
RLHF alignment. A reward model is trained to approximate human preferences. The model is fine-tuned against that reward model. After training, the model's alignment is an approximation of the original preferences. Now compose two RLHF-trained models -- one generates context, the other generates a response. The alignment of the composed system is an approximation of an approximation. Each layer of composition adds approximation error. After enough layers, the connection to the original human preferences is tenuous.
RLHF: error(model_n) ~ O(n * epsilon)
CCF: error(M^n) = 0 (exact, by closure)
Content filters. A filter catches 99.5% of harmful outputs. Two independent filters in series catch 99.9975%. But an adversary who chains prompts across conversations -- each individually below the filter threshold -- can accumulate context that bypasses the filter. The filter's effectiveness degrades under composition because it evaluates each conversation from a clean slate, while the adversary's accumulated context persists. The adversary's trust-building composes even though the safety mechanism does not.
Rate limits. A rate limiter allows 100 requests per minute. An adversary distributes requests across 10 accounts. The rate limit is per-account, not per-adversary. Composition (across accounts) defeats the safety mechanism because the mechanism is not closed under the adversary's composition operation.
In each case, the safety mechanism lives in a different algebraic structure than the threat. The mechanism does not compose in the same way the threat does. Repeated application of the threat operation eventually finds a gap.
CCF's guarantee composes exactly because the trust transfer mechanism and the threat model live in the same algebraic structure: matrix multiplication over the Birkhoff polytope. The adversary's composition of trust transfers is itself a trust transfer. It is doubly stochastic. The guarantee holds.
Spectral Consequences
Compositional closure has a direct spectral consequence that strengthens the safety guarantee.
The spectral radius of a doubly stochastic matrix is exactly 1. The dominant eigenvalue is 1, with eigenvector 1 (the uniform vector). All other eigenvalues have modulus at most 1.
For the product of k doubly stochastic matrices:
rho(M_1 M_2 ... M_k) <= 1
The spectral radius of the product is at most 1. Trust transfer remains non-expansive regardless of how many transfers are composed.
But the stronger result involves the subdominant eigenvalues. For a primitive doubly stochastic matrix M (one that is aperiodic and irreducible), the subdominant eigenvalue lambda_2 satisfies |lambda_2| < 1. The product M^k has subdominant eigenvalue lambda_2^k, which converges to 0 as k grows.
lim_{k->inf} M^k = (1/n) * 1 * 1^T
As k grows, the powers of a primitive doubly stochastic matrix converge to the uniform mixing matrix. Every context eventually has the same coherence.
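The convergence is fast enough to observe directly. Here is a small sketch using an illustrative primitive doubly stochastic matrix; any such matrix shows the same behaviour.

```python
# Demonstration of the ergodic limit: repeated self-application of a
# primitive doubly stochastic matrix drives every entry toward 1/n.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(m, k):
    """Compute m^k by repeated multiplication (k >= 1)."""
    result = m
    for _ in range(k - 1):
        result = matmul(result, m)
    return result

m = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

n = len(m)
p50 = matrix_power(m, 50)

# Maximum deviation of any entry of M^50 from the uniform value 1/n.
max_dev = max(abs(p50[i][j] - 1.0 / n) for i in range(n) for j in range(n))
assert max_dev < 1e-9  # M^50 is numerically indistinguishable from (1/n) 1 1^T
```

For this matrix the subdominant eigenvalue has modulus around 0.26, so fifty applications already push the deviation far below floating-point noise.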
This is the ergodic theorem for doubly stochastic matrices. In CCF's context, it means that if the same mixing matrix is applied repeatedly without new interactions, trust eventually equalises across all contexts. The system forgets where trust was earned and distributes it uniformly.
This might sound like a problem, but it is not. In practice, the mixing matrix changes at every tick (because the affinity matrix is recomputed from current sensor readings), and new interactions add fresh coherence to specific contexts. The mixing matrix redistributes while interactions re-concentrate. The steady state is a balance between redistribution and earning, not a collapse to uniformity.
The spectral convergence also means that an adversary who attempts to concentrate trust through repeated mixing is fighting the algebraic structure. Each mixing step moves the coherence vector closer to uniform, not further from it. Trust concentration requires earning, not transferring.
The Infinite Horizon Guarantee
Consider a robot deployed in a home for 10 years. It processes interactions at 200 Hz. That is 63 billion ticks. At each tick, the mixing matrix is applied to the coherence vector.
After 63 billion applications of doubly stochastic mixing:
- Row sums are still 1. (By closure, proved above.)
- Column sums are still 1. (By closure, proved above.)
- Total system coherence is still conserved. (By the conservation proof.)
- No context has coherence exceeding the system maximum. (By spectral norm bound.)
- The minimum gate still bounds output by the weakest trust signal. (By the forced convergence theorem.)
Every guarantee that held at tick 1 holds at tick 63 billion. Not approximately. Not within some error bound that has been growing for a decade. Exactly.
The only source of degradation is floating-point precision. After many multiplications, rounding errors accumulate. In f32 arithmetic, the worst-case error after k multiplications of an n-by-n matrix is approximately:
||error||_inf ~ k * n * epsilon_f32
where epsilon_f32 = 2^-23 ~ 1.19 * 10^-7. After 63 billion multiplications with n = 64 contexts:
||error||_inf ~ 6.3 * 10^10 * 64 * 1.19 * 10^-7 ~ 4.8 * 10^5
This exceeds 1.0, which means naive repeated multiplication would eventually produce numerical garbage. This is why CCF re-projects through Sinkhorn-Knopp periodically -- typically every deliberative cycle (once per second or once per minute). The re-projection snaps the matrix back onto the Birkhoff polytope, resetting accumulated floating-point error to below 10^-8.
The mathematical guarantee is exact. The numerical implementation requires periodic re-projection. This is a standard technique in computational geometry: when working within a constrained set, project back onto the constraint surface periodically to prevent drift. The constraint surface is the Birkhoff polytope. The projection is Sinkhorn-Knopp. The period is chosen so that accumulated error never exceeds the tolerance.
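The re-projection step can be sketched in a few lines. This is a minimal Sinkhorn-Knopp implementation in plain Python for illustration; the matrix values are invented "drifted" inputs, and the actual ccf-core implementation may differ in detail.

```python
# Minimal Sinkhorn-Knopp sketch: alternately normalise rows and columns
# of a non-negative matrix until both sets of sums are 1 within tolerance.

def sinkhorn_knopp(m, tol=1e-10, max_iters=1000):
    """Project a non-negative matrix onto the Birkhoff polytope (approx.)."""
    n = len(m)
    m = [row[:] for row in m]  # work on a copy
    for _ in range(max_iters):
        # Normalise each row to sum to 1.
        for i in range(n):
            s = sum(m[i])
            m[i] = [x / s for x in m[i]]
        # Normalise each column to sum to 1.
        for j in range(n):
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
        # Stop once row sums have settled back to 1 as well.
        if all(abs(sum(row) - 1.0) < tol for row in m):
            return m
    return m

# A drifted matrix: non-negative, but sums are slightly off, as after
# accumulated floating-point error.
drifted = [[0.52, 0.31, 0.18],
           [0.19, 0.48, 0.33],
           [0.30, 0.21, 0.50]]

projected = sinkhorn_knopp(drifted)
assert all(abs(sum(row) - 1.0) < 1e-9 for row in projected)
assert all(abs(sum(projected[i][j] for i in range(3)) - 1.0) < 1e-9
           for j in range(3))
```

For strictly positive matrices like this one, convergence is geometric, so a handful of iterations restores double stochasticity to well below the stated tolerance.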
Contrast with RLHF
This comparison deserves expansion because it clarifies what compositional closure buys you that training-time alignment does not.
RLHF produces a policy pi that maximises a learned reward function R. The alignment guarantee is:
E[R(pi(x))] >= E[R(pi_ref(x))] - epsilon
The policy is at least as good as the reference policy, minus some approximation error epsilon. This holds at deployment time. But after deployment, the world changes. The distribution of inputs shifts. Users discover prompts that were not in the training distribution. The reward model's approximation degrades on out-of-distribution inputs.
There is no compositional closure. If you compose two RLHF-aligned models (one generates a plan, the other executes it), the alignment guarantee of the composition is not derivable from the guarantees of the components. The error terms do not compose cleanly.
RLHF composition:
epsilon_composed <= epsilon_1 + epsilon_2 + interaction_term
(interaction_term is unknown and potentially unbounded)
CCF composition:
M_composed = M_1 * M_2
(doubly stochastic by closure, conservation exact)
The CCF guarantee composes because the algebra composes. The RLHF guarantee does not compose because approximation error accumulates. This is not a criticism of RLHF -- it is a statement about the algebraic structure of the two approaches. RLHF operates in the space of probability distributions over token sequences, where composition is approximate. CCF operates in the space of doubly stochastic matrices, where composition is exact.
Connection to Anthropic's Alignment Architecture
Anthropic's recent work on Mythos-class models -- systems with capabilities that may approach or exceed human-level in specific domains -- brings the composition problem into sharp focus. A Mythos-class model that composes safely with external tools, other models, and human oversight requires alignment guarantees that hold under composition. Each tool call, each model-to-model handoff, each human-in-the-loop checkpoint is a composition operation.
If the alignment guarantee degrades under composition, a sufficiently long chain of safe operations can produce an unsafe outcome. This is the alignment composition problem, and it is unsolved in the current paradigm.
Doubly stochastic trust mixing offers a structural solution. If each handoff between components is mediated by a doubly stochastic trust transfer, the compositional guarantee holds. The trust available to the downstream component is bounded by the trust earned by the upstream component, with conservation enforced by the matrix structure.
Whether this structure can be applied to language model alignment directly is an active research question. The CCF architecture demonstrates the principle on embodied systems. The mathematical structure is domain-independent. The Birkhoff polytope does not care whether the contexts are robot environments or language model capability domains.
The Commercial Implication
Compositional closure is not just a mathematical curiosity. It is the property that makes long-term deployment of autonomous systems insurable.
An insurance underwriter evaluating a household robot needs to answer: "Will the safety guarantee still hold in year 5?" For systems based on learned alignment, the answer is uncertain. The training distribution may not cover the situations that arise after years of deployment. The reward model may have drifted. The safety filters may have been updated in ways that interact unpredictably with the original training.
For a system based on doubly stochastic trust mixing with compositional closure, the answer is provable. The guarantee at year 5 is algebraically identical to the guarantee at day 1. The proof fits on a napkin. The property is testable at any point by checking that the mixing matrix is doubly stochastic (row sums = 1, column sums = 1), which is a constant-time operation.
This is why CCF is not an alternative to RLHF. It is a layer that wraps around any behavioural system -- RLHF-trained or otherwise -- and constrains its output through trust that can only be earned. The underlying model can be as capable as you like. The doubly stochastic envelope ensures that capability is gated by trust, and that the gating composes correctly forever.
The code is open for evaluation in the ccf-core crate on crates.io. The patent covers the specific application of compositional doubly stochastic trust mixing to autonomous behavioural systems.
— Colm Byrne, Founder — Flout Labs, Galway, Ireland
Patent pending. US Provisional 63/988,438.
FAQ
Is compositional closure unique to doubly stochastic matrices?
No. Many algebraic structures are closed under their natural operation. The integers are closed under addition. Rotation matrices are closed under multiplication. What makes doubly stochastic closure special for trust is that it simultaneously preserves non-negativity (trust is not negative), row-sum conservation (trust does not leave a context in excess), and column-sum conservation (trust does not arrive in excess). No other common matrix class provides all three under multiplication. Orthogonal matrices preserve norms but allow negative entries. Stochastic matrices (row sums = 1 only) are closed under multiplication but do not conserve column sums, allowing trust amplification at destinations.
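The failure mode of merely row-stochastic transfers is easy to exhibit. The sketch below assumes the convention suggested by the text (entry M[i][j] is the fraction of trust moving from context i to context j, applied as c -> c M); the vectors and matrix are illustrative.

```python
# Why "row sums = 1" alone is not enough: a merely row-stochastic
# transfer can funnel all trust into a single destination context.

def apply_transfer(c, m):
    """Row vector times matrix: new coherence per destination context."""
    n = len(m)
    return [sum(c[i] * m[i][j] for i in range(n)) for j in range(n)]

# Row-stochastic only: every row sums to 1, but column 0 sums to 3.
funnel = [[1.0, 0.0, 0.0],
          [1.0, 0.0, 0.0],
          [1.0, 0.0, 0.0]]

coherence = [0.25, 0.25, 0.5]
after = apply_transfer(coherence, funnel)

assert after == [1.0, 0.0, 0.0]       # all trust amplified into context 0
assert sum(after) == sum(coherence)   # total conserved, but concentrated

# Adding the column-sum constraint (doubly stochastic) caps every column
# at 1, so no destination can receive more than its fair bound.
```

Total mass is conserved either way; it is the column-sum constraint that forbids this kind of amplification at a destination.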
Does the proof assume the matrices are square?
Yes. Doubly stochastic matrices are necessarily square -- they map n contexts to n contexts. This is appropriate for trust transfer, which redistributes trust among a fixed set of active contexts. If a new context appears (the robot enters a new room), the matrix dimension increases and all new entries start at the default (zero coherence, uniform small affinity). If a context is evicted (LRU), its accumulated coherence is recorded and the matrix dimension decreases.
How does periodic Sinkhorn re-projection affect the compositional guarantee?
Strictly speaking, re-projection breaks the exact composition chain. The matrix at tick k is not M^k but rather (SK-project(M))^m where m is the number of ticks since the last projection. However, the re-projected matrix is doubly stochastic by construction (Sinkhorn-Knopp output), so the compositional guarantee holds from each re-projection point forward. The re-projection resets accumulated floating-point error to below 10^-8, which is below the representational precision of f32. In practice, the guarantee holds continuously because the error never accumulates to a level that could influence any decision.
Can compositional closure hold for learned trust transfer functions?
If the learned function produces a doubly stochastic matrix, then yes -- the closure holds regardless of how the matrix was generated. A neural network could learn the affinity matrix, and Sinkhorn-Knopp would project it onto the Birkhoff polytope. The compositional guarantee holds for the projected output. This is one possible bridge between learned representations and provable safety: learn the affinities, project the transfers, and the algebraic guarantees hold. The ccf-core crate is designed to support this pattern through its generic SensorVocabulary trait -- the vocabulary can be backed by a learned model, and the downstream mathematics does not care.
What is the computational cost of verifying compositional closure at runtime?
Checking that a matrix is doubly stochastic requires summing each row and each column and verifying each sum equals 1 within tolerance. For an n-by-n matrix, this is O(n^2) -- the same cost as reading the matrix. In CCF, with n = 64 contexts maximum, this is 4096 additions and 128 comparisons. At modern CPU speeds, under one microsecond. You can verify the safety guarantee every tick at negligible cost. This is unique among safety architectures: the guarantee is not only provable, it is cheaply auditable at runtime.
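The audit described above fits in a short function. This is a sketch, not the ccf-core implementation; the tolerance and the n = 64 example are illustrative choices.

```python
# Runtime audit of the safety guarantee: O(n^2) sums plus 2n comparisons,
# the same asymptotic cost as reading the matrix once.

def verify_doubly_stochastic(m, tol=1e-6):
    """Return True iff m is doubly stochastic within tol."""
    n = len(m)
    for i in range(n):
        if abs(sum(m[i]) - 1.0) > tol:  # row sum check
            return False
    for j in range(n):
        if abs(sum(m[i][j] for i in range(n)) - 1.0) > tol:  # column sum check
            return False
    return all(x >= -tol for row in m for x in row)  # non-negativity

# Uniform 64x64 mixing matrix: every entry 1/64, trivially doubly stochastic.
n = 64
uniform = [[1.0 / n] * n for _ in range(n)]
assert verify_doubly_stochastic(uniform)

# A single corrupted entry is caught immediately.
uniform[0][0] += 0.01
assert not verify_doubly_stochastic(uniform)
```

Because the check is this cheap, it can run on every tick without measurable overhead, which is what makes the guarantee continuously auditable rather than merely provable.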