How a Robot Sees the World: The 6 Sensor Thresholds That Create Context
A robot has no opinions about its environment. It has sensor readings. A photoresistor returns 237 lux. An ultrasonic rangefinder returns 84 centimetres. A microphone returns 52 dBA. These are real numbers on continuous scales, and they change every tick.
But the robot's SITUATION does not change every tick. You are still in the kitchen. The lights are still on. Nobody new has entered the room. The meaningful state of the environment is discrete and stable. The sensor values fluctuate around it.
This is the quantization problem: how do you map continuous sensor readings to discrete context identifiers that are stable enough to accumulate trust against? If the context key changes every tick, no trust accumulates. If the context key never changes, the robot cannot distinguish situations.
CCF solves this with a six-dimensional quantization scheme described in patent section [0033a] of the supplement to US 63/988,438. Six sensor dimensions, each mapped to three or four discrete bands. The composite context key is the concatenation of all six bands. Here are the exact thresholds.
Dimension 1: Ambient Light
Three bands. Measured in lux from a photoresistor or ambient light sensor.
dark: lux < 50
dim: 50 <= lux <= 300
bright: lux > 300
Why these numbers? 50 lux is the threshold below which most people consider a room "dark" -- a hallway at night with a nightlight. 300 lux is the boundary above which environments feel well-lit -- a kitchen with overhead lights on. The band between 50 and 300 covers the range where rooms are occupied but not brightly lit: living rooms in the evening, bedrooms with a reading lamp.
For the robot, these three states map to qualitatively different interaction environments. In a dark room, people are likely sleeping or away. In a dim room, they may be relaxed, watching television, winding down. In a bright room, they are active. The robot's behavioural envelope should differ across all three.
Dimension 2: Sound Level
Three bands. Measured in dBA from an on-board microphone.
quiet: dBA < 40
moderate: 40 <= dBA <= 65
loud: dBA > 65
40 dBA is the ambient noise floor of a quiet residential room -- a refrigerator hum, distant traffic. Below this, the environment is silent. Between 40 and 65 dBA, you have normal conversation (60 dBA), a television at moderate volume, kitchen activity. Above 65 dBA, you have raised voices, a blender, a vacuum cleaner, or a group of people talking simultaneously.
For trust accumulation, the quiet/moderate/loud distinction maps to engagement levels. A quiet room has no active interaction. A moderate room has conversational interaction. A loud room has high-energy activity where the robot should be more cautious about initiating new behaviours.
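Both scalar dimensions reduce to the same two-threshold comparison. A minimal sketch of the two quantizers follows; the function names match the composite-key listing later in this article, but the exact signatures in ccf-core are not guaranteed:

```rust
/// Quantize ambient light (lux) into the three CCF brightness bands.
fn q_brightness(lux: f32) -> &'static str {
    if lux < 50.0 {
        "dark"
    } else if lux <= 300.0 {
        "dim"
    } else {
        "bright"
    }
}

/// Quantize sound level (dBA) into the three CCF sound bands.
fn q_sound(dba: f32) -> &'static str {
    if dba < 40.0 {
        "quiet"
    } else if dba <= 65.0 {
        "moderate"
    } else {
        "loud"
    }
}
```

A reading of 237 lux from the opening example lands in "dim"; 52 dBA lands in "moderate".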
Dimension 3: Presence (Ultrasonic)
Four bands. This is the most complex dimension because it uses both distance and its first derivative (rate of change).
approaching: distance < 150 cm AND d(distance)/dt < 0
retreating: d(distance)/dt > 0 AND prior_distance < 150 cm
static: distance < 150 cm AND |d(distance)/dt| < 2 cm/s
absent: distance >= 150 cm
150 centimetres is the proxemic boundary. Within 150 cm, a person is in the robot's social or personal space. Beyond 150 cm, they are not meaningfully present in the interaction.
The derivative thresholds matter. An approaching person at 120 cm with negative distance derivative is qualitatively different from a static person at 120 cm. The approaching case is an unresolved situation -- the person may be reaching for the robot, walking past, or closing in for interaction. The static case is resolved -- they are present and settled.
The retreating state requires that prior distance was below 150 cm. This distinguishes "person leaving the robot's space" from "person walking past at 200 cm". Only the former is meaningful for trust dynamics -- someone who was engaged is now disengaging.
The 2 cm/s threshold for static classification absorbs sensor noise and minor body sway. A person sitting 80 cm away with natural postural adjustments stays classified as "static" rather than flickering between approaching and retreating.
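Putting the four bands together, here is one ordering of the checks consistent with the thresholds above. The precedence choices -- the noise band beating the sign of the derivative, and prior distance deciding between retreating and absent at the 150 cm boundary -- are assumptions for this sketch, not part of the published spec:

```rust
/// Classify the ultrasonic presence dimension from current distance (cm),
/// its first derivative (cm/s), and the previous distance reading (cm).
/// The +/-2 cm/s noise band is checked before the derivative sign, so a
/// seated person with minor body sway stays "static".
fn q_presence(distance_cm: f32, deriv_cm_s: f32, prior_cm: f32) -> &'static str {
    if distance_cm >= 150.0 {
        // Beyond the proxemic boundary: only meaningful as "retreating"
        // if the person was inside the robot's space a moment ago.
        if deriv_cm_s > 2.0 && prior_cm < 150.0 {
            "retreating"
        } else {
            "absent"
        }
    } else if deriv_cm_s.abs() < 2.0 {
        "static" // noise band absorbs sensor jitter and postural sway
    } else if deriv_cm_s < 0.0 {
        "approaching"
    } else {
        "retreating"
    }
}
```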
Dimension 4: Motion (Accelerometer)
Three bands. Measured from a three-axis accelerometer on the robot chassis.
stationary: |accel - baseline| < 0.1g for 2+ seconds
being-moved: |accel - baseline| > 0.3g AND no motor correlation
self-moving: |accel - baseline| correlated with motor commands
0.1g is the noise floor for a MEMS accelerometer at rest. Below this deviation from the gravity baseline, nothing is happening. The 2-second persistence requirement prevents transient vibrations (a door closing, someone setting a cup on the table) from changing the context.
0.3g with no motor correlation means something is physically moving the robot that the robot did not initiate. Someone picked it up. Someone pushed it. The bookshelf it was sitting on collapsed. This is a critical distinction: self-initiated motion is expected and accounted for. Externally imposed motion is an environmental change that should modulate trust.
Motor correlation is computed by cross-correlating the accelerometer signal with the motor command history over a 500 ms window. If the correlation coefficient exceeds 0.7, the motion is classified as self-moving. Below 0.7 with magnitude above 0.3g, it is being-moved.
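The motion classification might look like the sketch below, using a plain Pearson coefficient as the correlation measure over the 500 ms window. Two simplifications to note: the 2-second persistence requirement for "stationary" is omitted, and uncorrelated readings between 0.1 g and 0.3 g are folded back into "stationary" -- the article does not define that gap, so that choice is an assumption:

```rust
/// Pearson correlation coefficient between two equal-length sample windows.
fn pearson(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len() as f32;
    let ma = a.iter().sum::<f32>() / n;
    let mb = b.iter().sum::<f32>() / n;
    let cov: f32 = a.iter().zip(b).map(|(x, y)| (x - ma) * (y - mb)).sum();
    let va: f32 = a.iter().map(|x| (x - ma).powi(2)).sum();
    let vb: f32 = b.iter().map(|y| (y - mb).powi(2)).sum();
    cov / (va.sqrt() * vb.sqrt())
}

/// Classify chassis motion from the accel deviation magnitude (g) and the
/// accelerometer / motor-command windows covering the last 500 ms.
fn q_motion(accel_dev_g: f32, accel_win: &[f32], motor_win: &[f32]) -> &'static str {
    if accel_dev_g < 0.1 {
        "stationary" // below the MEMS noise floor
    } else if pearson(accel_win, motor_win) > 0.7 {
        "self-moving" // motion explained by the robot's own motor commands
    } else if accel_dev_g > 0.3 {
        "being-moved" // significant motion the robot did not initiate
    } else {
        "stationary" // 0.1-0.3 g, uncorrelated: treated as noise here
    }
}
```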
Dimension 5: Orientation (Gyroscope)
Three bands. Measured from a three-axis gyroscope, referenced to the gravitational axis.
upright: gravitational axis deviation < 15 degrees
tilted: 15 <= deviation <= 60 degrees
being-handled: angular velocity > 30 degrees/second
15 degrees is the range of normal surface variation. A robot on a slightly uneven table, or one that has been placed at a minor angle, is still "upright" for behavioural purposes. Beyond 15 degrees, the robot is tilted -- perhaps placed on its side, wedged against something, or on a sloped surface.
The being-handled state overrides the static angle classification. If angular velocity exceeds 30 degrees per second, someone is actively rotating the robot -- picking it up, turning it, examining it. This is fundamentally different from being tilted and stationary. A tilted robot can operate normally with adjusted expectations. A being-handled robot should suppress motor output entirely.
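A sketch of the orientation classifier with the handling override. The bands above do not define behaviour for static tilt beyond 60 degrees; folding it into "tilted" here is an assumption:

```rust
/// Classify orientation from static tilt (degrees off the gravitational
/// axis) and angular velocity (degrees/second). Active rotation overrides
/// any static angle, per the handling rule.
fn q_orient(tilt_deg: f32, ang_vel_deg_s: f32) -> &'static str {
    if ang_vel_deg_s.abs() > 30.0 {
        "being-handled" // someone is actively rotating the robot
    } else if tilt_deg < 15.0 {
        "upright" // within normal surface variation
    } else {
        "tilted"
    }
}
```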
Dimension 6: Time Period
Four bands. Derived from the system clock.
morning: 06:00 <= t < 12:00
midday: 12:00 <= t < 17:00
evening: 17:00 <= t < 21:00
night: 21:00 <= t < 06:00
Time of day is not a sensor reading, but it is a context dimension. The same physical environment -- same light, same sound, same presence pattern -- has different behavioural expectations at 10:00 versus 22:00. Morning activity in a bright kitchen is routine. The same sensor profile at night might indicate something unusual.
The four-band split aligns with domestic routine patterns: morning preparation, daytime activity, evening winding down, overnight quiet. For non-domestic deployments (warehouse, hospital), these boundaries should be recalibrated to match the operational schedule.
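As a sketch, the time quantizer is a single match on the clock hour, with each boundary hour treated as the start of the later band:

```rust
/// Map the clock hour (0-23) to the four domestic time bands.
fn q_period(hour: u8) -> &'static str {
    match hour {
        6..=11 => "morning",
        12..=16 => "midday",
        17..=20 => "evening",
        _ => "night", // 21:00 through 05:59
    }
}
```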
The Composite Context Key
The context key is the concatenation of all six bands. Across the six dimensions, the total number of possible keys is 3 x 3 x 4 x 3 x 3 x 4 = 1,296.
```rust
let key = format!(
    "{}:{}:{}:{}:{}:{}",
    q_brightness(lux),       // "dark" | "dim" | "bright"
    q_sound(dba),            // "quiet" | "moderate" | "loud"
    q_presence(dist, deriv), // "approaching" | "retreating" | "static" | "absent"
    q_motion(accel, motors), // "stationary" | "being-moved" | "self-moving"
    q_orient(gyro),          // "upright" | "tilted" | "being-handled"
    q_period(hour),          // "morning" | "midday" | "evening" | "night"
);
```
This key is hashed to a 32-bit integer via context_hash_u32() in the SensorVocabulary trait, producing the ContextKey that indexes into the CoherenceField's accumulator map.
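For illustration, any deterministic 32-bit string hash would do the job -- FNV-1a is a common choice for short keys on microcontrollers. The actual context_hash_u32() implementation in ccf-core may use a different function:

```rust
/// Hash a context key string to a 32-bit integer with FNV-1a.
/// Illustrative only -- not necessarily what ccf-core ships.
fn context_hash_u32(key: &str) -> u32 {
    let mut h: u32 = 0x811c_9dc5; // FNV-1a offset basis
    for byte in key.bytes() {
        h ^= byte as u32;
        h = h.wrapping_mul(0x0100_0193); // FNV prime
    }
    h
}
```

The property that matters is determinism: the same key string must hash to the same ContextKey on every device, every boot, forever, or accumulated trust is orphaned.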
But 1,296 is the theoretical maximum. A domestic robot typically encounters 100 to 300 of these. Most combinations never occur in practice. There is no "dark:loud:approaching:being-moved:being-handled:night" in a normal kitchen. There is no "bright:quiet:static:self-moving:upright:morning" because if the robot is self-moving, the environment is rarely quiet.
The sparsity is a feature, not a bug. The robot's vocabulary cardinality -- how many distinct context keys it has encountered -- is itself a diagnostic metric. A kitchen robot with 400+ distinct contexts is either in an unusual environment or has a sensor calibration problem. This vocabulary cardinality becomes the first component of the fleet analytics fingerprint.
Why Quantization, Not Clustering
An alternative approach would be to run k-means or DBSCAN on the raw sensor vectors and let the algorithm discover natural clusters. This is a bad idea for three reasons.
Determinism. Clustering algorithms depend on initialisation, convergence criteria, and the data seen so far. Two robots with identical sensor hardware in the same room could discover different clusters. The context keys would be incomparable across devices. Fleet analytics becomes impossible.
Stability. Clustering boundaries shift as new data arrives. A context key that existed yesterday might be absorbed into a neighbouring cluster today. The trust accumulated against the old key is orphaned. Quantization boundaries are fixed -- a reading of 49 lux is "dark" today and "dark" tomorrow, regardless of what other readings the robot has seen.
Interpretability. "dim:moderate:static:stationary:upright:evening" tells a human operator exactly what the robot is experiencing. "cluster_17" tells them nothing. When debugging trust dynamics or auditing safety behaviour, interpretable context keys are worth more than any marginal improvement in cluster quality.
Calibration for Non-Domestic Environments
The thresholds above are DEFAULTS for indoor domestic deployment. Other environments need different numbers.
Agricultural / outdoor. Light thresholds shift dramatically. Outdoor daylight ranges from 1,000 lux (overcast) to 100,000 lux (direct sun). The dark/dim/bright boundaries might become 200/2000/20000 lux. Sound thresholds shift for wind noise (a steady 50 dBA outdoors is "quiet").
Underwater. The ultrasonic presence dimension uses sonar ranges instead of air-coupled ultrasonic. Propagation characteristics change: detection range extends to tens of metres but with different noise profiles. The distance threshold might increase from 150 cm to 500 cm.
Warehouse / industrial. Motion thresholds must account for conveyor vibration and forklift traffic. The 0.1g stationary threshold might increase to 0.3g, with the being-moved threshold increasing to 0.6g.
Hospital. Time periods map to shift changes rather than domestic routines: day shift (07:00-19:00), night shift (19:00-07:00), shift transitions (30 minutes around each boundary).
The SensorVocabulary trait in ccf-core on crates.io is generic over these parameters. You implement context_hash_u32() and cosine_similarity() for your sensor suite, with your thresholds. The CCF runtime does not know or care what hardware it is running on. The quantization is entirely in the vocabulary implementation.
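One hypothetical way to carry per-deployment thresholds, shown for the light dimension only. The real SensorVocabulary trait defines its own interface; the struct and constant below are purely illustrative:

```rust
/// Hypothetical per-deployment threshold set for the light dimension.
struct LightBands {
    dark_below: f32,   // lux below this boundary is "dark"
    bright_above: f32, // lux above this boundary is "bright"
}

impl LightBands {
    fn quantize(&self, lux: f32) -> &'static str {
        if lux < self.dark_below {
            "dark"
        } else if lux <= self.bright_above {
            "dim"
        } else {
            "bright"
        }
    }
}

// Indoor domestic defaults from this article; outdoor, warehouse, or
// hospital deployments substitute their own calibrated boundaries.
const DOMESTIC: LightBands = LightBands { dark_below: 50.0, bright_above: 300.0 };
```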
From Context Key to Trust
The context key is where trust begins. Once a sensor reading is quantized to a key, the CoherenceAccumulator begins its work: positive interactions increment the accumulator, decay applies between interactions, the floor prevents total trust loss, and the minimum gate ensures that instantaneous instability always dominates accumulated trust.
The quantization thresholds determine what the robot considers a "situation". The accumulator dynamics determine how fast the robot builds trust in each situation. And the Sinkhorn-Knopp projector determines how trust transfers between situations. All three layers depend on the context key being stable, interpretable, and deterministic.
The full claim structure covering composite context keys is in Claims 1 and 8 of the patent. The SensorVocabulary trait and ContextKey type are implemented in ccf-core on crates.io, available for evaluation under BSL 1.1.
See also Compositional Closure for why the trust transfer guarantees hold indefinitely once contexts are established.
— Colm Byrne, Founder — Flout Labs, Galway, Ireland
Patent pending. US Provisional 64/039,626.
FAQ
Why six dimensions and not more? What about temperature, humidity, or camera-based features?
Six dimensions were chosen because they cover the full proxemic interaction space with sensors available on a $50 robot (the mBot2). Temperature and humidity change too slowly to meaningfully distinguish interaction contexts -- they are environmental constants, not situational variables. Camera-based features (face detection, pose estimation) introduce privacy concerns and computational cost that are unnecessary for behavioural gating. The six-dimensional scheme runs on an ARM Cortex-M microcontroller. If your platform has additional sensors, add dimensions to the vocabulary -- the SensorVocabulary trait is designed for extension. But the combinatorial explosion means each new 3-band dimension multiplies the key space by 3. Seven dimensions gives 3,888 keys. Eight gives 11,664. The sparsity increases and the time to accumulate meaningful trust in any single context grows proportionally.
What happens at the boundary between two bands? Does the context key flicker?
Boundary flickering is the main failure mode of naive quantization. A light reading oscillating between 49 and 51 lux would flicker between "dark" and "dim" every tick. CCF handles this with hysteresis on each boundary: the transition from dark to dim requires exceeding 55 lux, while the transition from dim to dark requires dropping below 45 lux. The 10% hysteresis band absorbs sensor noise without requiring temporal filtering. This is the same Schmitt trigger pattern used in the social phase classifier -- a proven approach to stable state transitions in noisy environments.
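The dark/dim boundary with hysteresis can be sketched as a two-state Schmitt trigger. The 45/55 lux values come from the answer above; the per-boundary widths in ccf-core may differ:

```rust
/// Two-state Schmitt trigger for the dark/dim boundary. The threshold
/// a reading is compared against depends on the current band, so values
/// oscillating around 50 lux do not flicker the context key.
struct HysteresisBand {
    state_is_dim: bool,
}

impl HysteresisBand {
    fn update(&mut self, lux: f32) -> &'static str {
        if self.state_is_dim {
            if lux < 45.0 {
                self.state_is_dim = false; // dim -> dark only below 45 lux
            }
        } else if lux > 55.0 {
            self.state_is_dim = true; // dark -> dim only above 55 lux
        }
        if self.state_is_dim { "dim" } else { "dark" }
    }
}
```

A reading sequence of 51, 49, 51 lux starting from "dark" never crosses 55, so the band never changes -- exactly the flicker case the naive quantizer fails.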
How does the cosine similarity between context keys work if the keys are discrete?
The cosine_similarity() method on the SensorVocabulary trait computes similarity between context keys by treating each dimension as a position on an ordinal scale. "dark" = 0, "dim" = 1, "bright" = 2. The key is represented as a 6-element vector and the cosine similarity between two such vectors gives a continuous measure of context relatedness. "dim:quiet:static:stationary:upright:morning" is more similar to "bright:quiet:static:stationary:upright:morning" (one dimension differs by one band) than to "dark:loud:approaching:being-moved:being-handled:night" (all six dimensions differ). This similarity drives the affinity matrix input to the Sinkhorn-Knopp projector, determining how much trust transfers between contexts.
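Encoding each key as a 6-element ordinal vector, the computation is standard cosine similarity. This is a sketch of the idea; the exact encoding ccf-core uses may differ:

```rust
/// Cosine similarity between two context keys represented as ordinal
/// vectors (one band index per dimension, e.g. dark=0, dim=1, bright=2).
fn cosine(a: &[f32; 6], b: &[f32; 6]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```

With this encoding, "dim:quiet:static:stationary:upright:morning" (1,0,2,0,0,0) scores far higher against "bright:quiet:static:stationary:upright:morning" (2,0,2,0,0,0) than against a key that differs in every dimension.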
Can an adversary game the quantization by controlling the sensor environment?
An adversary who can control the light, sound, presence, motion, orientation, and time signals received by the robot can force any context key they want. This is by design -- the context key reflects the environment as the robot perceives it. If the adversary creates a kitchen-like environment, the robot will classify it as kitchen-like. The protection is in the trust dynamics, not the quantization. Creating a familiar context key does not grant trust -- trust must be ACCUMULATED over time through consistent positive interactions in that context. Our simulation (Sim 3: trust farming) shows that even with full environmental control, achieving meaningful privilege escalation requires 141 days of sustained, consistent manipulation. The quantization is honest. The safety comes from the accumulator dynamics and the minimum gate.
What is the computational cost of context key generation?
Negligible. Six comparisons (one per dimension), each against two or three thresholds, followed by a hash of the resulting string or tuple. Total: approximately 20 comparisons and one hash per tick. On an ARM Cortex-M4 at 168 MHz, this is under 1 microsecond. The quantization runs inside the sensor interrupt handler. The CCF runtime never sees raw sensor values -- only context keys.