Explainable But Private: How a Robot's Audit Trail Respects Privacy Mode
There is a contradiction at the heart of every auditable AI system. Auditors need to see everything. Users need to hide some things. The standard resolution is access control: store everything, restrict who can read it. This is a social solution to a technical problem, and it fails precisely when it matters most. The data exists. It can be subpoenaed. It can be breached. It can be leaked. The promise of privacy is a policy, enforced by organisational discipline, revocable at any time.
CCF resolves the contradiction differently. During privacy mode, the causation packet -- the 12-field record of exactly why the robot acted as it did -- records metadata only. It records THAT privacy was active. It records WHAT changed in the robot's behaviour. It does not record what happened during private time. Not because the data is access-controlled. Because the data does not exist. There is no sensor content to restrict access to. The data structure during privacy events has no content field.
The mechanism is described in Claims AN and AO of US Provisional 64/039,655, sections [E7-0009] through [E7-0011].
What Privacy Mode Actually Disconnects
When a user invokes privacy mode -- through a voice command, a physical button, or a scheduled window -- four things happen in sequence:
1. Hardware disconnection. A physical relay disconnects the microphone signal path. A servo closes the camera shutter. These are mechanical operations, visible to the naked eye. The shutter is either open or closed. A seven-year-old can verify it. This is the Kitchen Table Test: if a parent at the kitchen table cannot verify the safety property without reading documentation, the property is insufficiently robust.
2. LED state change. The LED switches to a distinct privacy-mode colour (default: amber). This colour is reserved for privacy mode and is never used for any other state. The visual signal is unambiguous: amber means the robot's sensors are physically disconnected.
3. Sensor processing halt. The software pipeline that reads sensor data, computes context keys, and updates coherence accumulators stops processing new input. The last pre-privacy sensor state is frozen. No new context keys are generated. No new coherence updates occur. The accumulator state at the moment of privacy entry is preserved unchanged.
4. Null-content trust event. A causation packet is emitted with a specific structure:
event: PRIVACY_ENTRY
timestamp: tick_22847
context_key: [last active key before privacy]
C_inst: [frozen]
C_ctx: [frozen]
C_eff: [frozen]
content: NULL (field absent by construction)
privacy: true
The content field is not set to null. It is absent from the data structure during privacy events. The Rust type system enforces this: the privacy-event variant of the causation packet enum does not include a content field. You cannot accidentally store content because there is nowhere to put it. This is structural privacy, not policy privacy.
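A minimal Rust sketch of what that enum shape could look like; the type and field names here are illustrative assumptions, not the actual ccf-core types:

// Sketch only: names and field layout are assumptions, not the
// ccf-core API. The point is the shape: the privacy variant is
// defined without a content field, so storing sensor data during
// privacy is a compile error, not a policy violation.
struct SensorComposite; // quantised sensor data (placeholder)

enum CausationPacket {
    // Normal operation: the full record (abridged to a few fields).
    Standard {
        timestamp: u64,
        context_key: String,
        c_inst: f32,
        c_ctx: f32,
        c_eff: f32,
        content: SensorComposite, // exists only in this variant
        prev_hash: [u8; 32],
    },
    // Privacy mode: metadata only. There is nowhere to put content.
    PrivacyEvent {
        timestamp: u64,
        context_key: String, // last active key before privacy entry
        frozen_c_inst: f32,
        frozen_c_ctx: f32,
        frozen_c_eff: f32,
        prev_hash: [u8; 32],
    },
}

The NULL shown in the packet above is display notation; in memory there is no field for it to occupy.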
For the null-content trust event mechanism and how it actually increases accumulated trust, see The Privacy Paradox.
What the Audit Trail Records During Privacy
The audit trail does not go dark during privacy mode. It records five categories of metadata:
Privacy entry and exit timestamps. When privacy mode started and ended. The duration is recorded to the tick.
Behavioural state at entry and exit. Which social phase the robot was in when privacy was invoked, and which phase it was in when privacy ended. If the exit phase differs from the entry phase, the transition is recorded.
Output envelope changes. During privacy mode, the robot's behavioural output is constrained to privacy-mode minimums. If the robot was in Phase II (ConfidentCompanion) with motor amplitude 0.65 before privacy entry, and privacy mode reduced all outputs to 0.10, the envelope change is recorded. The auditor can see: outputs dropped at privacy entry, outputs restored at privacy exit.
Trust increment from privacy invocation. The null-content trust event increments the interaction count for the active context key and applies a rarity-scaled trust increment:
trust_increment = base_rate * recovery_speed * (1.0 - current_value) * rarity_factor
rarity_factor = 1.0 - (privacy_count / total_count)
The increment value is recorded, and a code sketch of the computation follows this list. The auditor can see: the act of requesting privacy contributed this much to accumulated trust.
Cryptographic hash continuation. The causation packet chain does not break during privacy mode. Each privacy-mode packet includes the hash of the previous packet, maintaining the tamper-evident chain. An auditor reviewing the full chain sees a continuous, unbroken sequence. The privacy packets are distinguishable by their event type and their structural absence of content, but they occupy their correct position in the chronological chain.
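As promised above, a sketch of the increment computation. The function signature is an assumption for illustration, not ccf-core's actual API:

// Sketch of the rarity-scaled increment formula above; names are
// illustrative, not the ccf-core API.
fn privacy_trust_increment(
    base_rate: f32,      // baseline per-event learning rate
    recovery_speed: f32, // context-specific multiplier
    current_value: f32,  // accumulated trust, in [0.0, 1.0]
    privacy_count: u32,  // privacy events for this context key
    total_count: u32,    // all interactions for this key; the privacy
                         // event itself has just been counted, so >= 1
) -> f32 {
    // Heavy privacy use dilutes the bonus: as privacy events come to
    // dominate the interaction history, the factor approaches zero.
    let rarity_factor = 1.0 - (privacy_count as f32 / total_count as f32);
    // Saturating growth: the increment shrinks as trust nears 1.0.
    base_rate * recovery_speed * (1.0 - current_value) * rarity_factor
}

Purely illustrative numbers: base_rate 0.05, recovery_speed 1.0, current trust 0.4 and a rarity factor of 0.1 give 0.05 * 1.0 * 0.6 * 0.1 = 0.003, the increment that appears in the hospital scenario below.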
What the Audit Trail Does NOT Record During Privacy
No audio. No video. No sensor readings. No context key updates. No coherence accumulator changes (beyond the rarity-scaled increment). No compiled routine evaluations. No conflict resolutions. No donor mixing. No bridge evaluations.
The robot is still physically present. It may still be moving (at privacy-mode minimum amplitude). Its LED is still on (amber). But its sensory relationship with the environment is disconnected. There is nothing to record because there is nothing to process.
This is not data minimisation. Data minimisation reduces the volume of collected data. This is data absence. The collection mechanism is physically disconnected. The data does not exist in any form, minimised or otherwise.
The Hospital Compliance Scenario
St. Brendan's Rehabilitation Unit has deployed companion robots in patient rooms. The robots assist with encouragement during physical therapy, provide ambient companionship during rest periods, and alert staff if certain behavioural patterns suggest distress. The hospital's compliance officer needs to verify two things simultaneously:
Requirement 1: The robot's behaviour must be explainable. If a patient's family asks "why did the robot reduce its engagement at 3pm?", the hospital must be able to answer with specifics.
Requirement 2: Patient privacy must be protected. Intimate care, private conversations, and family visits must not be recorded or stored.
In every other system, these requirements conflict. Explainability requires data. Privacy restricts data. The hospital must choose a tradeoff.
With CCF, both requirements are met fully.
The patient's morning therapy session (no privacy mode): full causation packets. Every phase transition, every routine execution, every envelope change is recorded with 12 fields. The therapist can see exactly why the robot increased encouragement at minute 8 (tension dropped, C_inst rose, Phase II engaged) and why it pulled back at minute 14 (patient voice amplitude increased, tension spiked, reflexive pathway overrode therapy routine). For the full causation packet walkthrough, see Why Did the Robot Back Away?.
The patient's afternoon rest (privacy mode activated): the causation trail records:
event: PRIVACY_ENTRY
timestamp: tick_31204
context_key: "room_204:afternoon:quiet"
pre_privacy_phase: Phase_II
output_change: {motor: 0.55 -> 0.10, LED: 0.60 -> 0.15, audio: 0.50 -> 0.08}
trust_increment: +0.003
privacy: true
The family visits during this window. A private conversation occurs. The robot is present, amber LED visible, barely moving. At 4:45pm:
event: PRIVACY_EXIT
timestamp: tick_38976
duration: 7772 ticks (approximately 2 hours 10 minutes)
post_privacy_phase: Phase_II
output_change: {motor: 0.10 -> 0.52, LED: 0.15 -> 0.57, audio: 0.08 -> 0.48}
trust_increment: +0.003
privacy: false
The compliance officer reviews the trail. She sees: privacy was active for 2 hours 10 minutes during the afternoon. Behavioural outputs were reduced to minimums. One trust increment occurred at entry, one at exit. No content was stored during the window. The robot resumed normal operation after privacy ended, with accumulated trust slightly higher than before (the two increments from the privacy events).
She can also verify the chain. The hash of the PRIVACY_ENTRY packet links to the hash of the preceding packet (the last therapy event). The hash of the PRIVACY_EXIT packet links to the PRIVACY_ENTRY hash. No packets are missing. No packets were inserted. The sequence is tamper-evident.
If the family later asks "why did the robot seem less engaged during our visit?", the answer is specific: privacy mode was active. The robot reduced outputs to minimums as required by the privacy protocol. This is a complete and truthful explanation that reveals nothing about what happened during the private time.
Tamper-Evidence at the Packet Level
Each causation packet contains:
packet_hash = H(previous_hash || event_data || software_version || quantisation_table_hash || routine_registry_hash)
Where H is a cryptographic hash function (SHA-256 in the reference implementation). The software_version records the exact firmware version that produced the behaviour. The quantisation_table_hash records the active sensor quantisation configuration. The routine_registry_hash records which compiled routines were loaded.
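The chaining step, sketched in Rust with the sha2 crate. The Packet layout and function names are assumptions; only the choice of SHA-256 comes from the reference implementation:

// Sketch only: struct layout and names are assumptions, not the
// ccf-core API.
use sha2::{Digest, Sha256};

struct Packet {
    event_data: Vec<u8>,               // serialised packet fields
    software_version: String,          // exact firmware version
    quantisation_table_hash: [u8; 32], // active sensor quantisation
    routine_registry_hash: [u8; 32],   // loaded compiled routines
    packet_hash: [u8; 32],             // H(previous || the above)
}

fn packet_hash(prev: &[u8; 32], p: &Packet) -> [u8; 32] {
    let mut h = Sha256::new();
    h.update(prev);
    h.update(&p.event_data);
    h.update(p.software_version.as_bytes());
    h.update(p.quantisation_table_hash);
    h.update(p.routine_registry_hash);
    h.finalize().into()
}

// An auditor holding a trusted starting hash (e.g. yesterday's
// checkpoint) replays the chain: deleting or inserting any packet
// changes every hash from that point forward.
fn verify_chain(start: [u8; 32], packets: &[Packet]) -> bool {
    let mut prev = start;
    for p in packets {
        if packet_hash(&prev, p) != p.packet_hash {
            return false;
        }
        prev = p.packet_hash;
    }
    true
}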
This chain has three properties:
Chronological integrity. Packets are linked in order. Deleting a packet breaks the hash chain from that point forward. Inserting a packet requires recomputing all subsequent hashes. Both operations are detectable by any party holding a checkpoint hash (e.g., a hash recorded at the end of each day by the hospital's compliance system).
Version traceability. If the firmware was updated between two events, the software version field changes and the hash chain records the transition. An auditor can verify that behaviour X was produced by firmware version A, and behaviour Y was produced by firmware version B. If a firmware update changed a phase boundary or a decay rate, the audit trail shows the exact point of transition.
Configuration attestation. The quantisation table hash and routine registry hash attest to the specific configuration that was active. If a compiled routine was modified ("we updated the greeting to be more gentle"), the hash changes and the trail records when the old routine was replaced by the new one. Configuration drift is visible and auditable.
The Human Review Interface
A surprising behaviour has been flagged. Perhaps the robot reduced engagement abruptly, or entered Phase III (ProtectiveGuardian) in a context where the expected phase was Phase II. A human reviewer opens the incident in the review interface.
The interface shows the causation packet chain around the event. The reviewer sees:
- Which architectural artifact caused the behaviour (insufficient familiarity? incorrect sponsor bridge? stale compiled routine? over-broad context merge? unstable boundary?)
- The exact values at the moment of the event
- The transition from the prior state
- Whether privacy mode was active at any point during the review window
If the cause is a stale compiled routine (the routine was compiled under different environmental conditions and no longer matches the context), the reviewer modifies THAT SPECIFIC ROUTINE. They do not retrain the entire system. They do not reset the coherence field. They update one artifact, and the hash chain records the update.
If the cause is insufficient familiarity (the robot has not had enough interactions in this context), the reviewer can check the counterfactual: how many more interactions are needed to reach the desired phase? The shadow simulator provides the answer. For the counterfactual mechanism, see 'It Would Have Been Warmer If...'.
If the cause is a privacy-mode interaction that the reviewer cannot inspect (because the content does not exist), the reviewer sees the metadata: privacy was active, outputs were reduced, trust increments were applied. The reviewer knows that the behaviour was governed by the privacy protocol, and the specific cause within the privacy window is structurally unknowable. This is the correct outcome. The audit trail explains the behaviour without compromising the privacy.
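A sketch of the cause taxonomy the interface implies; the variant names and payloads are illustrative, not the ccf-core types:

// Sketch only: names and payloads are assumptions, not the
// ccf-core API.
enum IncidentCause {
    InsufficientFamiliarity { interactions_needed: u32 },
    IncorrectSponsorBridge,
    StaleCompiledRoutine { routine_id: u32 },
    OverBroadContextMerge,
    UnstableBoundary,
    // Content during the window is structurally unknowable; the
    // metadata still explains the behaviour (privacy protocol
    // active, outputs reduced to minimums).
    PrivacyWindow { entry_tick: u64, exit_tick: u64 },
}

Each variant points the reviewer at one artifact to inspect or modify; none of them requires retraining the system.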
The Regulatory Argument
Data protection regulations (GDPR, HIPAA, Australia's Privacy Act, Japan's APPI) share a common structure: personal data must be collected with consent, stored with purpose limitation, and deletable on request. The enforcement mechanism is always: the data exists, and the regulation constrains what you do with it.
CCF's privacy mode inverts this. During privacy, the data does not exist. There is no personal data to consent to, limit the purpose of, or delete. The metadata (timestamps, behavioural states, trust increments) is not personal data -- it describes the robot's state, not the person's actions.
Consider a GDPR subject access request: "Show me all data the robot collected about me." The response: "During non-privacy periods, the robot recorded quantised sensor composites (not personally identifiable), coherence values, and behavioural outputs. During privacy periods, the robot recorded timestamps and its own behavioural state changes. No audio, video, or sensor content was collected during privacy windows. The data structure during privacy events does not contain a content field."
This is a clean regulatory position. Not "we collected the data but we promise not to misuse it." Rather: "the data does not exist." The hardware disconnection is the evidence. The mechanical shutter is the proof. The absent content field is the structural guarantee.
Why Structural Privacy Matters More Than Policy Privacy
Every smart device on the market implements policy privacy. The microphone is software-muted. The camera is software-disabled. The recording is software-deleted. In each case, the enforcement mechanism is software, which means:
- A firmware update can change the behaviour without changing the visible indicator.
- A vulnerability can bypass the muting.
- A legal order can compel the manufacturer to change the policy.
- The user cannot verify the claim without technical expertise.
CCF's structural privacy is different:
- The mechanical shutter is visible. The relay is either connected or disconnected.
- A vulnerability cannot bypass a physical disconnection.
- A legal order cannot compel data that was never collected.
- A seven-year-old can verify the claim by looking at the shutter.
This distinction is not academic. It is the difference between "we promise your data is safe" and "your data does not exist." In healthcare, in eldercare, in special education -- in every deployment where trust between the system and its users is not optional -- the structural guarantee is the only guarantee that holds under adversarial conditions.
The full implementation is available in ccf-core on crates.io. For the null-content trust increment during privacy, see The Privacy Paradox. For the complete causation packet architecture, see Why Did the Robot Back Away?. For fleet-level privacy ratios and the 1,110:1 result, see The 1,110:1 Privacy Ratio.
FAQ
Can a court order compel disclosure of what happened during privacy mode?
A court order can compel disclosure of all data that exists. During privacy mode, the causation packet records metadata only: timestamps, behavioural state changes, trust increments. No sensor content exists. A court order for "all data collected by the robot during the privacy window" would receive the metadata packets -- which show THAT privacy was active and WHAT the robot's behavioural state was, but contain no audio, video, or environmental content. The structural absence of a content field is not a refusal to produce data. It is a factual assertion that the data was never collected.
What if the hardware privacy mechanism fails -- the relay sticks, the shutter jams?
The system monitors the hardware privacy state. If the software commands the relay to disconnect and the relay does not confirm disconnection (via a sense pin), the system enters a fail-safe mode: all sensor processing halts regardless of relay state, the LED flashes a distinct error pattern, and a hardware fault is logged. The fail-safe is not dependent on the relay working. It is a software backstop that activates when the hardware signal is ambiguous. The important property is that failure modes are visible and recorded, not silent.
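A sketch of that backstop, with hypothetical trait and type names standing in for the hardware abstraction layer:

// Sketch only: PrivacyRelay and PinState are hypothetical stand-ins,
// not the ccf-core API.
enum PinState { Open, Closed, Unknown }

trait PrivacyRelay {
    fn command_disconnect(&mut self);
    fn sense_pin(&self) -> PinState; // reads the confirmation pin
}

fn enter_privacy<R: PrivacyRelay>(
    relay: &mut R,
    halt_sensor_processing: impl FnOnce(),
    raise_hardware_fault: impl FnOnce(&str),
) {
    relay.command_disconnect();
    // Backstop first: the pipeline halts unconditionally, so a stuck
    // relay can never leave sensor processing running.
    halt_sensor_processing();
    match relay.sense_pin() {
        PinState::Open => {
            // Confirmed disconnection: LED goes privacy amber.
        }
        PinState::Closed | PinState::Unknown => {
            // Stuck relay or ambiguous signal: flash the error
            // pattern and log the fault. Visible and recorded,
            // never silent.
            raise_hardware_fault("privacy relay did not confirm disconnect");
        }
    }
}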
Does privacy mode affect the robot's long-term learning?
Privacy mode freezes accumulator updates. No new interactions are recorded (except the null-content trust event). If a user activates privacy mode 80% of the time, the robot accumulates trust much more slowly -- it is learning only during the remaining 20% of non-private operation. This is a feature, not a limitation. The user is choosing a slower trust-building process in exchange for privacy. The tradeoff is transparent and under the user's control.
Can the metadata packets be correlated to infer what happened during privacy?
The metadata records timing and behavioural state, not environmental content. A sophisticated analyst might infer that "privacy mode was activated 5 minutes after a visitor arrived and deactivated 10 minutes after they left," but this inference comes from external knowledge (visitor logs), not from the robot's data. The robot's metadata alone shows: privacy started at tick X, ended at tick Y. Without external correlation, the privacy window is opaque. This is the same limitation that applies to any timestamp: knowing WHEN something happened is not the same as knowing WHAT happened.
How does privacy mode interact with the sponsor bridge mechanism?
If a sponsor bridge is active when privacy mode is invoked, the bridge state is frozen. Bridge decay does not occur during privacy mode (because decay requires active tick processing, which is halted). When privacy mode ends, bridge decay resumes from the frozen state. This means privacy mode effectively pauses bridge expiration. A bridge that would have decayed to zero during a two-hour privacy window still has its pre-privacy mass when privacy ends. This is a consequence of the design, not a special case: all time-dependent processes halt during privacy because the tick counter for trust operations is frozen.
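One way to picture the frozen clock, with illustrative names and a simple linear decay standing in for whatever curve the real bridge uses:

// Sketch only: names and the linear decay are illustrative.
struct TrustClock {
    tick: u64,
    frozen: bool, // set on PRIVACY_ENTRY, cleared on PRIVACY_EXIT
}

impl TrustClock {
    // During privacy the clock simply does not advance, so every
    // process keyed to elapsed ticks -- bridge decay included --
    // pauses as a side effect rather than as a special case.
    fn advance(&mut self) {
        if !self.frozen {
            self.tick += 1;
        }
    }
}

// Bridge mass as a pure function of elapsed trust-clock ticks: a
// two-hour privacy window contributes zero elapsed ticks, so the
// bridge keeps its pre-privacy mass.
fn bridge_mass(initial_mass: f32, decay_per_tick: f32, elapsed_ticks: u64) -> f32 {
    (initial_mass - decay_per_tick * elapsed_ticks as f32).max(0.0)
}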
Patent pending. US Provisional 64/039,655.
-- Colm Byrne, Founder -- Flout Labs, Galway, Ireland