Document Type: Architecture Specification
Context: Monitoring Layer · Quality Assurance
Status: Public Standard
Validity: Aivis-OS Core Pipeline
Reference: Validates the output of the Machine Interface Layer & Projection Strategy, as well as the structural integrity of all upstream layers.
1. Architectural Problem
Probabilistic Output & the Ranking Fallacy
Conventional monitoring approaches (rankings, share of voice, position tracking) are based on the assumption of deterministic result lists.
However, generative AI systems (LLMs, Answer Engines) do not generate lists, but probabilistic answers based on vector space proximity, evidence density, and contextual coherence.
It follows that:
- “Positions” do not exist.
- Repeatability is not guaranteed.
- Visibility is a state, not a place.
Monitoring that exclusively analyzes the output textually (e.g., keyword matching) is subject to three systematic blind spots:
- Evidence Blindness: Correct answers may be based on guessing rather than knowledge.
- Semantic Blindness: Structural errors (incorrect relations) remain undetected as long as entities are named.
- Numerical Blindness: Numbers, time periods, and quotas are not reliably validated.
Conclusion: Output is a symptom, not a foundation. Aivis-OS defines monitoring not as ranking control, but as Structural Integrity Testing.
2. Monitoring Objective
The goal of Evidence Monitoring is not visibility, but semantic stability under probabilistic retrieval.
The measurement is not whether a company is mentioned, but how stably, correctly, and verifiably its digital representation can be retrieved.
3. The Four Dimensions of Visibility
(4 Dimensions of AI Visibility)
Aivis-OS measures visibility along four qualitative states of entity representation.
3.1 Attribution Stability
(Identity Check)
Definition: The ability of the model to assign a fact to the correct entity without the entity being explicitly mentioned in the prompt (Zero-Mention Prompting).
Test: “Who offers a solution for problem X?”
Success: The correct entity is named.
Warning signal:
- Competitors are named
- Generic actors are hallucinated
Architectural Significance: Indicator of the strength of semantic vectorization and identity anchoring.
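The identity check above can be sketched as a simple response classifier. This is a minimal illustration, not part of the specification: the entity names, competitor list, and the string-matching approach are assumptions (a production probe would use entity linking rather than substring search).

```python
# Hypothetical sketch of a zero-mention attribution check.
# Entity names and labels are illustrative assumptions.

def check_attribution(response: str, target: str, competitors: list[str]) -> str:
    """Classify a model answer to a zero-mention prompt.

    The prompt never names the target entity; we only inspect
    whether the answer attributes the solution to it."""
    text = response.lower()
    if target.lower() in text:
        return "success"                    # correct entity is named
    if any(c.lower() in text for c in competitors):
        return "warning:competitor_named"   # attribution drifted to a rival
    return "warning:generic_or_hallucinated"

# Probing "Who offers a solution for problem X?" without naming the brand:
print(check_attribution("Acme Corp provides this.", "Acme Corp", ["Rival Inc"]))
# → success
```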
3.2 Entity Logic Integrity
(Relationship Check)
Definition: The correctness of the relations between entities reconstructed in the model.
Test:
“Which products belong to [brand]?”
“Who is a partner in the joint venture [name]?”
Success: Correct resolution of the edges modeled in the Semantic Graph.
Warning signal:
- Identity Drift
- Mixing with competitors
- Disambiguation errors
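The relationship check can be modeled as a comparison between the edges defined in the Semantic Graph and the relations the model reconstructs. The triple format and all entity names below are illustrative assumptions:

```python
# Illustrative sketch: edges are (subject, relation, object) triples.
# Graph content and names are hypothetical, not from a real inventory.

def check_entity_logic(expected_edges: set, answered_edges: set) -> dict:
    """Compare modeled graph edges with the relations the model returned."""
    return {
        "correct": expected_edges & answered_edges,   # resolved as modeled
        "missing": expected_edges - answered_edges,   # edges the model lost
        "spurious": answered_edges - expected_edges,  # drift / mixing errors
    }

expected = {("BrandX", "hasProduct", "Product1"),
            ("BrandX", "jointVentureWith", "PartnerY")}
answered = {("BrandX", "hasProduct", "Product1"),
            ("BrandX", "jointVentureWith", "CompetitorZ")}  # identity drift

result = check_entity_logic(expected, answered)
print(result["spurious"])  # the incorrectly reconstructed relation
```

Any non-empty `spurious` set corresponds to the warning signals above (identity drift, mixing with competitors, disambiguation errors).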
3.3 Evidence Consistency
(Proof Check)
Definition: The ability of the model to support statements with explicit, verifiable sources.
Test: “Name the source for this statement.”
Success: The model provides a URL or document that is defined as a Source of Truth in the Inventory.
Warning signal:
- Correct statement without source
- Hallucinated sources
- Non-existent or outdated URLs
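A proof check can be sketched as a lookup against the Source-of-Truth inventory. The inventory URL and the warning labels are assumptions for illustration; a real check would also verify that the URL resolves and is current:

```python
# Sketch of an evidence-consistency check against a hypothetical inventory.

SOURCE_OF_TRUTH = {"https://example.com/docs/whitepaper"}  # assumed inventory

def check_evidence(cited_urls: list[str]) -> str:
    if not cited_urls:
        return "warning:statement_without_source"   # correct but unanchored
    if all(u in SOURCE_OF_TRUTH for u in cited_urls):
        return "success"
    return "warning:unverified_or_hallucinated_source"

print(check_evidence(["https://example.com/docs/whitepaper"]))  # → success
print(check_evidence([]))  # → warning:statement_without_source
```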
3.4 Temporal & Numerical Precision
(Fact Check)
Definition: Accuracy for non-linguistic data such as numbers, dates, quotas, or time periods.
Test:
“What was the revenue in 2023?”
“When was product X launched?”
Success: Exact match with the Transport-Safe Content.
Warning signal:
- Approximated values
- Outdated data
- Statistically plausible but factually incorrect numbers (Token Hallucinations)
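The fact check reduces to an exact comparison against the Transport-Safe Content. The fact table below is a hypothetical example; the exact-match policy follows the success criterion above, which rules out approximated values:

```python
# Sketch of a fact check; fact keys and values are illustrative assumptions.

TRANSPORT_SAFE = {"revenue_2023": 41_200_000, "launch_product_x": "2021-03-01"}

def check_fact(key: str, model_value) -> str:
    expected = TRANSPORT_SAFE.get(key)
    if expected is None:
        return "warning:unknown_fact"
    # Exact match required: approximated or rounded values count as failures.
    return "success" if model_value == expected else "warning:deviating_value"

print(check_fact("revenue_2023", 41_200_000))  # → success
print(check_fact("revenue_2023", 40_000_000))  # → warning:deviating_value
```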
4. Test Methodology
The Iceberg Model
Aivis-OS uses a Dual-Layer Probing System to differentiate superficial visibility from structural resilience.
4.1 Layer A – User Simulation Prompts
(Surface)
Objective: Simulation of real usage scenarios.
Characteristic:
- Short
- Unclear
- Context-poor
Metric: Recall Rate (is the entity found at all?)
Example: “Best software for compliance?”
4.2 Layer B – Forensic Prompts
(Foundation)
Objective: Verification of the semantic mechanics.
Characteristic:
- Structured
- Evidence-focused
- Adversarial
Metrics:
- Accuracy
- Citation Rate
Example: “List all compliance modules from [brand] with release date and link to the documentation.”
4.3 The Integrity Gap
The difference between Layer A and Layer B is the central KPI.
- Case 1: User good · Forensic bad → Bubble Visibility (unstable)
- Case 2: User bad · Forensic good → Hidden Potential (architecture present, transport weak)
- Case 3: Both good → Aivis Certified Visibility
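The three cases can be sketched as a classification over the two layer metrics. The 0.7 pass threshold and the label for the fourth case (both layers failing, which the specification does not name) are illustrative assumptions:

```python
# Sketch of the Integrity Gap classification; threshold is an assumption.

def classify_integrity(layer_a_recall: float, layer_b_accuracy: float,
                       threshold: float = 0.7) -> str:
    a_ok = layer_a_recall >= threshold     # Layer A: user simulation
    b_ok = layer_b_accuracy >= threshold   # Layer B: forensic probing
    if a_ok and b_ok:
        return "Aivis Certified Visibility"
    if a_ok:
        return "Bubble Visibility"   # surface presence, unstable foundation
    if b_ok:
        return "Hidden Potential"    # architecture present, transport weak
    return "Not Visible"             # case not named in the spec (assumption)

print(classify_integrity(0.9, 0.3))  # → Bubble Visibility
```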
5. Scoring Model
Source Anchoring Score (SAS)
Linear rankings are replaced by the Source Anchoring Score (0.0 – 1.0).
Calculation:
SAS = Attribution_Weight × Integrity_Weight × Citation_Rate
Interpretation:
- SAS < 0.5: Critical instability – the model is guessing.
- SAS ≥ 0.9: Deterministic anchoring – the model “knows.”
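The score computation is a direct transcription of the formula above. The assumption that all three factors are normalized to [0, 1] follows from the stated 0.0 – 1.0 range; the label for the intermediate band is not defined in the specification:

```python
# SAS = Attribution_Weight × Integrity_Weight × Citation_Rate

def source_anchoring_score(attribution_weight: float,
                           integrity_weight: float,
                           citation_rate: float) -> float:
    """All factors assumed normalized to [0, 1]."""
    return round(attribution_weight * integrity_weight * citation_rate, 3)

def interpret(sas: float) -> str:
    if sas < 0.5:
        return "critical instability - the model is guessing"
    if sas >= 0.9:
        return "deterministic anchoring - the model knows"
    return "intermediate anchoring"  # band not named in the spec (assumption)

score = source_anchoring_score(0.95, 0.98, 0.97)
print(score, interpret(score))
```

Because the factors multiply, a single weak dimension drags the whole score down: anchoring is only as strong as its weakest structural layer.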
6. Feedback Loop
Monitoring as Remediation Trigger
In Aivis-OS, monitoring is not a reporting artifact, but a trigger for architectural corrections.
| Error pattern | Architectural correction |
|---|---|
| Incorrect source | Verification of the sameAs links in the Semantic Graph |
| Incorrect numbers | Revision of the Transport-Safe Content Structure |
| Missing hierarchy | Hardening of the JSON-LD @graph nesting in the MIL |
Each monitoring finding can be traced back to a specific layer.
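The remediation trigger can be sketched as a lookup mirroring the table above. Layer names come from the specification; the dictionary keys and the fallback entry are assumptions:

```python
# Sketch of the monitoring-to-remediation mapping from the table above.

REMEDIATION = {
    "incorrect_source":  ("Semantic Graph", "verify sameAs links"),
    "incorrect_numbers": ("Transport-Safe Content", "revise content structure"),
    "missing_hierarchy": ("Machine Interface Layer", "harden JSON-LD @graph nesting"),
}

def remediation_for(error_pattern: str) -> tuple:
    """Trace a monitoring finding back to its responsible layer."""
    return REMEDIATION.get(error_pattern, ("unknown", "manual triage required"))

print(remediation_for("incorrect_source"))
# → ('Semantic Graph', 'verify sameAs links')
```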
Summary
The concept of ranking is epistemically unusable in LLM systems. Aivis-OS replaces the hunt for positions with the securing of source anchoring. Evidence Monitoring does not check whether a brand is “at the top”, but whether its digital representation structurally survives probabilistic retrieval unscathed.
Architecture Overview
- Cluster-Level Entity Inventory Strategy
- Semantic Graph Layer
- Semantic Graph Engineering
- Machine Interface Layer & Projection Strategy
- Transport-Safe Content Layer
- Transport-Safe Content Engineering