Fleet Analytics Dashboard: Environment Taxonomy, Outlier Detection, and Quality Scoring
You manage 500 companion robots across 50 eldercare facilities. Each robot reports a daily fingerprint -- fewer than 200 bytes, twenty scalar values summarising its operational world: how complex the environment is, how familiar the robot is with it, how social the context is, and what the daily rhythm looks like.
From these 20 numbers per robot per day, the fleet analytics dashboard gives you four capabilities that would otherwise require camera feeds, microphone arrays, and a data engineering team: environment taxonomy discovery, outlier agent detection, cohort trajectory analysis, and environment quality scoring.
This post describes what each capability looks like in practice, how the underlying mathematics works, and why the entire system runs on the same fingerprint vectors used for individual robot monitoring.
The fleet analytics architecture is described in US Provisional 64/039,623, sections [0025]-[0026].
Capability 1: Environment Taxonomy Discovery
The first thing you want to know about a fleet is: what types of environments do my robots operate in?
You might think you already know. Rooms in an eldercare facility. Zones in a warehouse. Fields in an agricultural co-op. But the operational reality is richer than the administrative categories. Two "private rooms" in the same facility can have completely different fingerprint profiles if one has a window and the other does not, if one's resident receives frequent visitors and the other's does not, if one is near the nurses' station and the other is at the end of a quiet corridor.
Environment taxonomy discovery clusters fingerprint vectors and reveals the operational environment types that actually exist -- not the categories you assumed.
How it works. Collect the most recent fingerprint from each robot. Normalise the 20-dimensional vectors (each component has a different scale). Apply a clustering algorithm.
For known cluster counts, k-means works. For unknown counts, DBSCAN discovers clusters based on density without requiring a target count. For hierarchical taxonomy (e.g., "indoor" splits into "private room" and "common area," which split further into sub-types), agglomerative clustering with Ward linkage produces interpretable dendrograms.
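The normalise-then-cluster pipeline can be sketched with scikit-learn. This is a minimal illustration on synthetic stand-in data (three well-separated blobs from `make_blobs`, not real fingerprint vectors); it shows k-means for a known cluster count and Ward-linkage agglomerative clustering for the hierarchical case:

```python
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

# Hypothetical fleet: 60 robots whose 20-component fingerprints fall into
# three well-separated environment types (synthetic stand-in data).
fingerprints, true_type = make_blobs(
    n_samples=60, n_features=20, centers=3, cluster_std=0.5, random_state=0
)

# Normalise first: each fingerprint component has a different natural scale.
scaled = StandardScaler().fit_transform(fingerprints)

# Known cluster count -> k-means.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# Hierarchical taxonomy -> agglomerative clustering with Ward linkage.
ward_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(scaled)

print(adjusted_rand_score(true_type, km_labels))  # 1.0 on separated blobs
```

On real fleet data the cluster count is unknown, which is where DBSCAN (density-based, no target count) fits in; the scaling step stays the same.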
What it reveals. In our simulation with three environments (seed 20260426), the fingerprint vectors are dramatically distinct:
| | Forest | Mars Habitat | Bedroom |
|---|---|---|---|
| Vocab \|K\| | 148 | 76 | 295 |
| Phase I | 61.4% | 52.1% | 76.1% |
| Phase II | 18.0% | 21.1% | 0.1% |
| Density rho | 24.0% | 63.0% | 4.2% |
| Mean fam mu_f | 0.31 | 0.38 | 0.12 |
| Groups g | 20 | 14 | 9 |
Pairwise Jaccard distances: Forest-Mars 0.95, Forest-Bedroom 0.78, Mars-Bedroom 0.95. Any reasonable clustering algorithm separates these into distinct types with wide margins.
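The Jaccard distance behind those numbers is simple to compute from two context vocabularies. A small sketch, using toy vocabularies whose sizes echo the table above (the actual simulation contexts are not reproduced here):

```python
def jaccard_distance(vocab_a: set, vocab_b: set) -> float:
    """Jaccard distance between two context vocabularies: 1 - |A n B| / |A u B|."""
    if not vocab_a and not vocab_b:
        return 0.0
    return 1.0 - len(vocab_a & vocab_b) / len(vocab_a | vocab_b)

# Toy vocabularies (illustrative sizes only, not the simulation's actual contexts).
forest = {f"ctx_{i}" for i in range(148)}
bedroom = {f"ctx_{i}" for i in range(100, 395)}  # 295 contexts, partial overlap

print(round(jaccard_distance(forest, bedroom), 2))  # 0.88
```

Identical vocabularies give 0.0, fully disjoint ones give 1.0, so the 0.95 Forest-Mars figure indicates almost no shared contexts.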
In a real eldercare deployment with 500 robots, you might discover 5-7 distinct environment types:
- Quiet private rooms -- low vocabulary, high mean familiarity, low social density, morning-peaked rhythm.
- Active private rooms -- moderate vocabulary, moderate familiarity, higher social density (visitors), broader temporal distribution.
- Common areas -- high vocabulary, low familiarity (too many contexts to master), high social density, afternoon-peaked rhythm.
- Corridors and transition zones -- moderate vocabulary, very low familiarity (transient contexts), uniform temporal rhythm.
- Outdoor courtyards -- high vocabulary, low familiarity, very low social density, weather-dependent temporal rhythm.
Some of these you predicted. The split between quiet and active private rooms -- that came from the data. The robot told you something about the environment that the building floor plan did not.
Dashboard view. A scatter plot of the first two principal components of the fingerprint space, coloured by cluster assignment. Each dot is a robot. Clusters that overlap in the projection may still separate in higher dimensions or need finer-grained analysis. Clusters with clear separation represent operationally distinct environment types.
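The projection feeding that scatter plot is a one-liner. A minimal sketch with random placeholder fingerprints (real ones would come from the fleet server):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
fingerprints = rng.normal(size=(500, 20))  # one 20-d fingerprint per robot (synthetic)

# Project to the first two principal components for the dashboard scatter plot.
coords = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(fingerprints))

print(coords.shape)  # (500, 2)
```

`coords[:, 0]` and `coords[:, 1]` are the plot axes; the colour channel comes from the cluster labels computed earlier.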
Capability 2: Outlier Agent Detection
An outlier is a robot whose fingerprint diverges from its deployment cohort. The robot is supposed to be in a private room but its fingerprint looks like a common area. Or its fingerprint matched its cohort last week but has drifted over the past three days.
How it works. For each environment cluster, compute the centroid (mean fingerprint). For each robot in the cluster, compute the distance from its current fingerprint to the centroid. Robots whose distance exceeds a threshold (typically 2 standard deviations from the cluster mean distance) are flagged as outliers.
outlier_score(robot_i) = distance(fingerprint_i, centroid_cluster)
threshold = mean_distance + 2 * std_distance
flagged = outlier_score > threshold
The distance metric matters. Euclidean distance works for normalised vectors but treats all components equally. A weighted distance that emphasises the most discriminative components for a given cluster improves detection accuracy. In practice, the fleet server learns these weights from the cluster structure itself: components with low within-cluster variance are more diagnostic of outliers than components with high variance.
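The centroid distance, 2-sigma threshold, and inverse-variance weighting described above can be sketched in a few lines. This is an illustration of the scheme as described, not the fleet server's actual implementation; the synthetic data plants one deliberately relocated robot:

```python
import numpy as np

def flag_outliers(cluster: np.ndarray, n_std: float = 2.0):
    """cluster: (n_robots, n_components) normalised fingerprints of one cluster.
    Returns (scores, flags) using an inverse-variance-weighted centroid distance:
    low within-cluster variance -> more diagnostic -> higher weight."""
    centroid = cluster.mean(axis=0)
    weights = 1.0 / (cluster.var(axis=0) + 1e-9)
    weights /= weights.sum()
    scores = np.sqrt(((cluster - centroid) ** 2 * weights).sum(axis=1))
    threshold = scores.mean() + n_std * scores.std()
    return scores, scores > threshold

rng = np.random.default_rng(3)
cohort = rng.normal(0.0, 0.1, size=(30, 20))   # robots matching their cluster
relocated = np.full((1, 20), 5.0)              # one robot far from the centroid
scores, flags = flag_outliers(np.vstack([cohort, relocated]))
print(int(flags.sum()), bool(flags[-1]))  # 1 True
```

Only the relocated robot crosses the threshold; the ranked dashboard list is just `np.argsort(scores)[::-1]` over the flagged set.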
What it catches. Five categories of operationally relevant outliers:
- Wrong deployment. Robot assigned to Room 204 but physically placed in Room 207. The fingerprint profile does not match Room 204's cluster. Detectable within one reporting period.
- Sensor malfunction. A degraded proximity sensor produces abnormal vocabulary cardinality and presence patterns while other components remain normal. The partial deviation from the cluster centroid is distinctive -- full-profile deviation suggests environmental mismatch, partial deviation suggests hardware.
- Environmental change. A room that was quiet (Cluster 1) has a new, frequently visiting family member. The fingerprint drifts toward Cluster 2 (active private room). This is not a problem with the robot -- it is the robot reporting a genuine change in its operational world.
- Operational anomaly. A robot in a warehouse that normally shows uniform temporal rhythm suddenly shows morning-only activity. The robot's charge cycle may have changed, or a zone restriction may have been imposed.
- Relocation detection. The most dramatic outlier signal. A robot moved from one environment type to another produces a fingerprint that is far from its original cluster and potentially near a different cluster. The Jaccard distance between the old and new fingerprints exceeds the within-cluster variance by a large margin.
Dashboard view. A ranked list of outlier robots, sorted by outlier score (distance from centroid). Each entry shows the robot ID, assigned cluster, outlier score, and which fingerprint components contribute most to the deviation. An operator can triage from the top of the list downward.
Capability 3: Cohort Trajectory Analysis
Individual fingerprints are snapshots. Fingerprint time series are trajectories. Cohort trajectory analysis compares how robots in the same environment type develop over time.
How it works. Group robots by environment cluster. For each robot, plot the time series of each fingerprint component. Overlay the cohort. Compute the mean trajectory and the envelope (e.g., 10th-90th percentile bands).
For each cluster C:
For each component k in [|K|, p_I, ..., pi_ab]:
trajectory(robot_i, k) = [fingerprint_i(t).k for t in time_series]
mean_trajectory(C, k) = mean over robots in C
envelope(C, k) = [p10, p90] over robots in C
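The pseudocode above maps directly onto pandas. A minimal sketch with a synthetic cohort whose familiarity component follows the settling-in curve described below (the curve shape and noise level are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
days = np.arange(30)
# Hypothetical cohort: 40 robots' mean-familiarity component over 30 days,
# rising toward a plateau (the settling-in curve) with per-robot noise.
settle = 0.5 * (1 - np.exp(-days / 10))
cohort = settle + rng.normal(0, 0.02, size=(40, 30))

df = pd.DataFrame(cohort, columns=days)       # rows: robots, cols: days
mean_trajectory = df.mean(axis=0)             # cohort mean per day
envelope = df.quantile([0.10, 0.90], axis=0)  # p10 / p90 bands per day

print(envelope.shape)  # (2, 30)
```

The same three objects (per-robot rows, mean line, quantile bands) are exactly what the dashboard plot renders; robots falling outside the envelope are the ones to highlight.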
What it reveals. Deployment maturity patterns. In a fleet of eldercare companion robots, you expect mean familiarity to rise from ~0.05 (first day) to ~0.3 (after two weeks) to a plateau around ~0.5 (after a month). This is the robot settling in.
Robots that reach plateau faster are in more stable environments. Robots that plateau at lower familiarity are in more variable environments. Robots that never plateau -- familiarity stays flat or oscillates -- may have deployment problems.
Cohort trajectory analysis answers questions like:
- "How long does deployment take?" The median time for mean familiarity to reach 0.3 across a cohort. If Facility A averages 8 days and Facility B averages 18 days, Facility B's environments are harder to settle into. This might indicate layout issues, staffing variability, or sensor resolution mismatches.
- "Are all robots in this cluster developing similarly?" A tight envelope (narrow p10-p90 band) means homogeneous experience. A wide envelope means some robots are having very different experiences despite similar environment types. The divergent robots deserve investigation.
- "Is this cohort regressing?" A rising familiarity trajectory that suddenly reverses -- mean familiarity dropping across a cluster -- indicates environmental disruption. Renovations, staffing changes, seasonal shifts. The cohort trajectory catches it because the pattern appears across multiple robots simultaneously.
Dashboard view. Time series plots per cluster, per component. Mean line with shaded envelope. Individual robot trajectories overlaid in light opacity. Robots outside the envelope are highlighted. Operators can drill into any highlighted robot to see its individual trajectory and fingerprint details.
Capability 4: Environment Quality Scoring
This is the most commercially valuable capability: using robot fleet behaviour as a proxy for environment quality.
How it works. The phase distribution across robots in the same environment is a quality signal. Robots in stable, predictable environments accumulate Phase II (established familiarity) quickly. Robots in chaotic, poorly structured environments stay stuck in Phase I (unfamiliar).
quality_score(environment_e) = weighted_sum(
    phase_II_fraction * w_1,
    mean_familiarity * w_2,
    (1 - familiarity_variance) * w_3,
    temporal_rhythm_stability * w_4
)
The weights are configurable based on what "quality" means for the specific deployment. For eldercare, a high-quality environment might emphasise Phase II presence (the robot and resident have settled into a routine) and low familiarity variance (consistent experience). For warehouse operations, quality might emphasise high matrix density (efficient zone transitions) and uniform temporal rhythm (balanced shift utilisation).
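As a concrete sketch, here is the weighted sum with illustrative default weights. The phase fractions come from the simulation table; the variance and rhythm-stability values (and the weights themselves) are made-up placeholders, not calibrated figures:

```python
def quality_score(env: dict, weights=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Weighted quality score over four fingerprint-derived signals, each in [0, 1].
    The weights here are illustrative defaults, not calibrated values."""
    w1, w2, w3, w4 = weights
    return (w1 * env["phase_II_fraction"]
            + w2 * env["mean_familiarity"]
            + w3 * (1 - env["familiarity_variance"])
            + w4 * env["temporal_rhythm_stability"])

# Phase fractions from the simulation table; variance and rhythm values are
# placeholders for illustration.
mars = {"phase_II_fraction": 0.211, "mean_familiarity": 0.38,
        "familiarity_variance": 0.05, "temporal_rhythm_stability": 0.9}
bedroom = {"phase_II_fraction": 0.001, "mean_familiarity": 0.12,
           "familiarity_variance": 0.20, "temporal_rhythm_stability": 0.6}

print(quality_score(mars) > quality_score(bedroom))  # True
```

Swapping the weight tuple is how an eldercare deployment (Phase II, low variance) and a warehouse deployment (density, uniform rhythm) reuse the same scoring function.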
What it reveals. Phase distributions from the simulation data illustrate the concept:
| Environment | Phase I | Phase II | Interpretation |
|-------------|---------|----------|----------------|
| Mars habitat | 52.1% | 21.1% | Structured, controlled, good quality |
| Forest | 61.4% | 18.0% | Moderate, natural variation |
| Bedroom | 76.1% | 0.1% | Complex, variable, challenging |
A room in an eldercare facility where the companion robot reaches 25% Phase II within two weeks scores higher than a room where the robot stays at 5% Phase II after a month. The robot is telling you that the first room provides a more stable, structured environment.
Scale this across 50 facilities with 500 robots:
- Facilities where robots consistently reach Phase II faster have better environments (from the robot's operational perspective).
- Individual rooms that score consistently low may have environmental factors that interfere with routine establishment.
- Cross-facility comparison reveals which facilities create the best conditions for robot-assisted care, which could correlate with resident outcomes.
The key insight: the robot is not measuring environment quality directly. It is reporting its own operational experience. But a fleet of robots across many environments, all using the same behavioural architecture, produces a comparative quality measure that is remarkably informative. The robot that settles in quickly and deeply is in a good environment. The robot that struggles to accumulate familiarity is in a challenging one.
Dashboard view. A heatmap: facilities on one axis, rooms on the other. Colour-coded by quality score. Facilities can be ranked. Rooms within a facility can be ranked. Trends over time can be visualised by sliding the time window. Operators can click through to the underlying fingerprint data for any cell.
All From the Same 20 Numbers
The critical architectural point: all four capabilities -- taxonomy, outliers, trajectories, and quality scoring -- operate on the same fingerprint vectors used for individual robot monitoring. No additional data collection. No additional sensor processing. No additional bandwidth.
A fleet of 500 robots transmitting daily fingerprints produces fewer than 100 KB of data per day. A laptop running Python with scikit-learn can perform the clustering, outlier detection, trajectory analysis, and quality scoring in under a second. This is not a big data problem. It is a small data problem with big analytical value.
The fingerprint is described in the 8-component fingerprint post. The underlying CCF mathematics are covered in the Sinkhorn-Knopp post and the compositional closure proof. The store-and-forward mechanism that makes this work with intermittent connectivity is described in the store-and-forward post.
Implementation
The ccf-core crate on crates.io computes the fingerprint on-device. Fleet analytics runs server-side on the collected fingerprint vectors. The crate's serde feature enables serialisation to JSON, MessagePack, or binary formats for transmission and storage.
For fleet operators evaluating the architecture, the path is:
- Deploy CCF-enabled robots (or retrofit existing robots with the ccf-core crate)
- Configure daily fingerprint reporting via store-and-forward
- Collect fingerprints at a fleet server
- Apply standard clustering, outlier detection, and time series analysis to the fingerprint vectors
- Build dashboards on the results
Step 4 uses off-the-shelf tools: scikit-learn for clustering, pandas for time series, Plotly or Grafana for dashboards. The fingerprint format is simple enough that any analytics stack can consume it.
For the full architecture, see how it works. For patent details, see the patent page.
FAQ
Q: What fleet size is required for meaningful taxonomy discovery?
Taxonomy discovery benefits from diversity. With 10 robots in identical environments, clustering produces one cluster -- not very informative. With 50 robots across 3-5 environment types, clustering reliably discovers the types. The practical minimum for useful taxonomy work is around 30-50 robots with at least 3 distinct environment categories. Below that, manual classification is simpler.
Q: How frequently should outlier detection run?
With daily fingerprints, daily outlier detection is natural. The outlier score should use a rolling window -- the last 7 or 14 fingerprints rather than just the most recent one -- to avoid false alarms from transient anomalies. A robot that shows one outlier fingerprint after a building fire alarm is not a true outlier. A robot that shows three consecutive outlier fingerprints has a real issue.
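The consecutive-fingerprint rule in that answer can be sketched as a small helper (an illustration of the rule as stated, not a prescribed implementation):

```python
def persistent_outlier(daily_scores: list, threshold: float, k: int = 3) -> bool:
    """Flag a robot only when its last k daily outlier scores all exceed the
    threshold, filtering one-off transients such as a fire-alarm day."""
    recent = daily_scores[-k:]
    return len(recent) == k and all(s > threshold for s in recent)

print(persistent_outlier([0.1, 0.1, 0.9, 0.1], threshold=0.5))  # False (transient)
print(persistent_outlier([0.1, 0.9, 0.9, 0.9], threshold=0.5))  # True (persistent)
```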
Q: Can environment quality scores be gamed?
The quality score is derived from the robot's operational state, which is determined by the environment. To "game" the score, you would need to modify the environment itself -- which is precisely what the score is designed to incentivise. Making the environment more stable and predictable genuinely improves the quality score and genuinely improves the robot's operational experience. There is no shortcut that improves the score without improving the environment.
Q: How does this compare to traditional fleet monitoring systems?
Traditional fleet monitoring collects raw telemetry: battery levels, motor temperatures, error logs, location traces. The CCF fingerprint is complementary, not competing. It captures the robot's behavioural relationship with its environment -- something raw telemetry cannot measure. A robot with healthy battery and motors can still have a poor environment quality score. The two monitoring approaches together give a complete picture: hardware health from telemetry, operational health from fingerprints.
Q: What about robots that operate in genuinely novel environments?
A robot deployed in an environment type never seen by the fleet before will appear as an outlier against all existing clusters. This is correct behaviour -- the fleet server should flag it as "unclassified" rather than forcing it into the nearest cluster. As more robots deploy in the novel environment, a new cluster emerges naturally. The taxonomy grows with the fleet's experience.
Patent pending. US Provisional 64/039,623.
-- Colm Byrne, Founder -- Flout Labs, Galway, Ireland