Document Type: Architecture Specification
Context: Entity Inventory, Entity Truth Layer & Machine Interface Layer
Status: Public Standard
Validity: Aivis-OS Core Pipeline
1. Architectural Principle
Global Identity vs. Local Mention
In the Aivis-OS architecture, the identity of an entity is strictly separated from its mention.
The core problem of traditional SEO or schema approaches is the generation of data at the URL level (Per-URL Inventory). This approach inevitably leads to Identity Drift, as it implicitly treats local mentions as independent identities.
Instead, Aivis-OS enforces a Cluster-Level Inventory. The entity inventory is not an artifact of a single page. It is an artifact of the knowledge domain.
Definitions
- Entity (Identity): A stable, canonical object (e.g., “Allianz SE”, “John William Doe”, “2024 Annual Report”) that exists globally within the cluster.
- Mention (Occurrence): A localized reference within a URL (e.g., “The Group”, “Mr. Doe”, “The Report”).
- Stable Anchor: An externally verifiable identifier to reduce ambiguity (Wikidata QID, LEI, ISIN, ORCID).
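The distinction between the three definitions can be sketched as two small record types. This is only an illustration: the class names, the field names, and the second URL are assumptions, not part of the specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """Global identity: one record per real-world object in the cluster."""
    entity_id: str
    canonical_name: str
    schema_type: str
    # Stable anchors keyed by scheme, e.g. {"wikidata": "Q123456"} (assumed layout).
    stable_anchors: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Mention:
    """Local occurrence: a surface form on one URL that points at an Entity."""
    url: str
    surface_form: str
    entity_id: str  # reference to Entity.entity_id, never a new identity


doe = Entity(
    entity_id="entity://cluster123/Person/john-william-doe-a1b2c3",
    canonical_name="John William Doe",
    schema_type="Person",
    stable_anchors={"wikidata": "Q123456"},
)

# The second URL is a hypothetical page used only for illustration.
mentions = [
    Mention("https://example.com/team/john-doe", "John Doe", doe.entity_id),
    Mention("https://example.com/news/2024-report", "Mr. Doe", doe.entity_id),
]

# Many mentions resolve to a single identity.
distinct_ids = {m.entity_id for m in mentions}
print(len(mentions), "mentions ->", len(distinct_ids), "entity")
```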
2. The Anti-Pattern: Risks of Per-URL Inventories
When systems attempt to build entities in isolation per URL, structural defects arise that prevent a stable AI representation:
2.1 Fragmentation of Identity (Duplicate Identities)
Scenario: An organization appears on 40 pages under variations such as “Allianz”, “Allianz SE”, and “Allianz Group”.
Error: A per-URL approach generates 40 competing “truths.” AI models cannot merge these deterministically.
2.2 Instability of IDs
If IDs are generated (minted) per URL, they cannot remain stable cluster-wide. This destroys graph cohesion and makes long-term diffing (change tracking) impossible.
2.3 Dispersion of Verification (Anchor Verification Drift)
Different URLs often assign contradictory anchors to the same entity (e.g., incorrect social profiles or QIDs) or omit them entirely. The result is an inconsistent knowledge graph, which LLMs devalue (“Ambiguity Penalty”).
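A minimal sketch of how such drift could be detected in per-URL data. The claim tuples, the function name, and the QID Q654321 (a made-up placeholder for an incorrect anchor) are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical per-URL claims: (url, entity name, claimed QID or None if missing).
# Q654321 is a made-up placeholder standing in for an incorrect anchor.
per_url_claims = [
    ("https://example.com/team/john-doe", "John William Doe", "Q123456"),
    ("https://example.com/blog/interview", "John William Doe", "Q654321"),
    ("https://example.com/press/2024", "John William Doe", None),
]


def find_anchor_drift(claims):
    """Return entities whose pages disagree on the anchor (None = anchor missing)."""
    anchors_by_name = defaultdict(set)
    for _url, name, qid in claims:
        anchors_by_name[name].add(qid)
    # More than one distinct state means the pages contradict each other.
    return {name: qids for name, qids in anchors_by_name.items() if len(qids) > 1}


drift = find_anchor_drift(per_url_claims)
for name, qids in drift.items():
    print(f"{name}: {len(qids)} conflicting anchor states")
```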
3. The Aivis-OS Solution: Cluster-Level Governance
The Aivis-OS inventory functions as a Single Source of Truth (“Golden Record”). Structured data at the URL level is merely a projection of this truth.
System Benefits
- Deterministic Entity Resolution: Normalization and deduplication are performed centrally and once.
- Managed Stable Anchors: External references are verified once and inherited globally.
- Aggregation: Attributes from various sources are consolidated into a single, enriched record.
- Governance & Versioning: Changes to an entity (e.g., name change) are propagated atomically, not distributed manually across thousands of pages.
4. Data Model & Implementation
4.1 Cluster-Level Tables (Identity Layer)
This is the storage location of the truth.
| Attribute | Description |
| --- | --- |
| entity_id | Globally unique ID (see schema below) |
| canonical_name | The official designation (e.g., “Allianz SE”) |
| schema_type | Schema.org type (e.g., Corporation, Person) |
| stable_anchors | Wikidata QID, LEI, ISIN, DOI |
| provenance | Origin of data / Verification Status |
| version | Hash of the current state |
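The specification only says that the version is a hash of the current state. One way to obtain such a hash, sketched under the assumption that SHA-256 over a canonical JSON serialization is acceptable (the function name and the record layout are illustrative):

```python
import hashlib
import json


def record_version(record: dict) -> str:
    """Hash every field except 'version' itself, using a canonical JSON form."""
    payload = {key: value for key, value in record.items() if key != "version"}
    # sort_keys and fixed separators make the serialization independent of dict order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = {
    "entity_id": "entity://cluster123/Person/john-william-doe-a1b2c3",
    "canonical_name": "John William Doe",
    "schema_type": "Person",
    "stable_anchors": {"wikidata": "Q123456"},
    "provenance": "verified",
}

before = record_version(record)
# Any attribute change (here: a hypothetical name change) yields a new version.
after = record_version({**record, "canonical_name": "John W. Doe"})
print("version changed:", before != after)
```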
4.2 URL-Level Tables (Context Layer)
This is the storage location of the reference.
| Attribute | Description |
| --- | --- |
| url_id | Reference to the Page |
| entity_id | Foreign Key to the Cluster Entity |
| role_hint | Semantic role (e.g. mainEntity, author, mentions) |
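The relationship between the two layers can be sketched with an in-memory SQLite database. The table names `entity` and `url_entity`, the reduced column set, and the composite primary key are assumptions; the point is only that the context layer cannot reference an identity the cluster layer does not contain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE entity (
    entity_id      TEXT PRIMARY KEY,
    canonical_name TEXT NOT NULL,
    schema_type    TEXT NOT NULL
);
CREATE TABLE url_entity (
    url_id    TEXT NOT NULL,
    entity_id TEXT NOT NULL REFERENCES entity(entity_id),
    role_hint TEXT NOT NULL,
    PRIMARY KEY (url_id, entity_id, role_hint)
);
""")

doe_id = "entity://cluster123/Person/john-william-doe-a1b2c3"
page = "https://example.com/team/john-doe"

conn.execute("INSERT INTO entity VALUES (?, ?, ?)", (doe_id, "John William Doe", "Person"))
conn.execute("INSERT INTO url_entity VALUES (?, ?, ?)", (page, doe_id, "mainEntity"))

# A page-level row pointing at an ID that is not in the inventory is rejected.
try:
    conn.execute(
        "INSERT INTO url_entity VALUES (?, ?, ?)",
        (page, "entity://cluster123/Person/unknown-000000", "mentions"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("dangling reference rejected:", rejected)
```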
4.3 ID-Minting Convention
Aivis-OS never generates new entity IDs during the JSON-LD generation of a single page. IDs follow a deterministic format:
entity://{cluster_id}/{schema_type}/{slug}-{short_hash}
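The specification fixes the format but not how the slug and short hash are derived. The following sketch assumes the slug comes from the canonical name and the short hash is the first six hex characters of a SHA-256 digest over the cluster ID, type, and a stable key such as a verified anchor. The function names and the choice of hash input are hypothetical.

```python
import hashlib
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase ASCII slug: 'John William Doe' becomes 'john-william-doe'."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def mint_entity_id(cluster_id: str, schema_type: str, canonical_name: str, stable_key: str) -> str:
    """Build entity://{cluster_id}/{schema_type}/{slug}-{short_hash} deterministically."""
    # Assumption: the hash input is the stable key (e.g. a verified anchor), not the page URL.
    digest = hashlib.sha256(f"{cluster_id}|{schema_type}|{stable_key}".encode("utf-8")).hexdigest()
    return f"entity://{cluster_id}/{schema_type}/{slugify(canonical_name)}-{digest[:6]}"


first = mint_entity_id("cluster123", "Person", "John William Doe", "wikidata:Q123456")
second = mint_entity_id("cluster123", "Person", "John William Doe", "wikidata:Q123456")
print(first == second)  # same inputs, same ID
```

Because no page URL enters the function, the result cannot vary by page. Under this sketch a renamed entity would still receive a new slug, so a real inventory would presumably persist the minted ID rather than recompute it.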
5. Implementation Example
Instead of defining variants of a person (“John Doe”, “J. Doe”) multiple times, Aivis-OS references the central object.
Cluster Inventory (Backend): There is only one record for “John William Doe” with the linked Wikidata ID Q123456.
URL Projection (Output JSON-LD): The team page does not define a new person, but references the existing one:
```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@id": "entity://cluster123/Person/john-william-doe-a1b2c3",
      "@type": "Person",
      "name": "John William Doe",
      "sameAs": ["https://www.wikidata.org/wiki/Q123456"]
    },
    {
      "@type": "WebPage",
      "@id": "https://example.com/team/john-doe",
      "about": {
        "@id": "entity://cluster123/Person/john-william-doe-a1b2c3"
      }
    }
  ]
}
```
Note: No matter which URL this person appears on, the @id remains identical, character for character. AI systems can therefore join the per-page graphs on that key without guessing.
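A projection of this kind could be generated along the following lines. The `INVENTORY` dictionary, its field names, the function `project_page`, and the second URL are assumptions; the sketch only shows that the page supplies context while the entity node is copied from the inventory.

```python
import json

# Hypothetical frozen inventory, keyed by entity ID.
INVENTORY = {
    "entity://cluster123/Person/john-william-doe-a1b2c3": {
        "schema_type": "Person",
        "canonical_name": "John William Doe",
        "same_as": ["https://www.wikidata.org/wiki/Q123456"],
    }
}


def project_page(page_url: str, about_id: str, inventory: dict) -> dict:
    """Build page-level JSON-LD that references an inventory entity by its ID."""
    record = inventory[about_id]  # KeyError if the page references an unknown entity
    entity_node = {
        "@id": about_id,
        "@type": record["schema_type"],
        "name": record["canonical_name"],
        "sameAs": record["same_as"],
    }
    page_node = {"@type": "WebPage", "@id": page_url, "about": {"@id": about_id}}
    return {"@context": "https://schema.org", "@graph": [entity_node, page_node]}


doe_id = "entity://cluster123/Person/john-william-doe-a1b2c3"
team_page = project_page("https://example.com/team/john-doe", doe_id, INVENTORY)
# The second URL is a hypothetical page used only for illustration.
news_page = project_page("https://example.com/news/interview", doe_id, INVENTORY)

print(json.dumps(team_page, indent=2))
```

Both projections contain the same Person node; only the WebPage node differs.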
6. Operational Flow (Pipeline)
The Aivis-OS software processes data in a strict sequence to avoid contamination:
- Ingest & Extraction: Scanning all URLs for entity candidates.
- Normalization (Staging): Cluster-wide cleaning of name variants.
- Merge & Deduplication: Merging identical entities into a Golden Record.
- Anchor Verification: Validation of external IDs (Wikidata, etc.) against trusted sources.
- Freeze: Versioning of the inventory.
- Projection: Generation of the JSON-LD for the individual pages based on the frozen inventory.
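The normalization, merge, and freeze stages can be sketched as follows. The normalization rule (dropping the tokens "se" and "group") is a deliberately naive assumption chosen to fit the Allianz example, and the function names and URLs are illustrative. Real entity resolution would need far more than token stripping, and extraction and anchor verification are not modelled here.

```python
import re
from types import MappingProxyType

# Deliberately naive assumption: these tokens never distinguish two entities.
IGNORED_TOKENS = {"se", "group"}


def normalize(surface_form: str) -> str:
    """Reduce a name variant to a cluster-wide comparison key."""
    tokens = re.findall(r"[a-z0-9]+", surface_form.lower())
    return " ".join(t for t in tokens if t not in IGNORED_TOKENS)


def merge(candidates):
    """Group (url, surface_form) candidates that share a normalized key."""
    merged = {}
    for url, surface_form in candidates:
        record = merged.setdefault(normalize(surface_form), {"variants": set(), "urls": set()})
        record["variants"].add(surface_form)
        record["urls"].add(url)
    return merged


def freeze(merged):
    """Return a read-only snapshot so that projection cannot alter the inventory."""
    return MappingProxyType({
        key: MappingProxyType({
            "variants": frozenset(rec["variants"]),
            "urls": frozenset(rec["urls"]),
        })
        for key, rec in merged.items()
    })


candidates = [
    ("https://example.com/about", "Allianz"),
    ("https://example.com/investors", "Allianz SE"),
    ("https://example.com/careers", "Allianz Group"),
]

inventory = freeze(merge(candidates))
print(len(candidates), "candidates ->", len(inventory), "record")
```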
7. Decision Criteria (Acceptance)
A correctly implemented Aivis-OS cluster fulfills the following metrics:
- ID Stability: A repeated run of the pipeline does not change existing IDs.
- Deduplication Rate: All name variants of an entity map to exactly one record (n:1 mapping).
- Anchor Uniqueness: Each external anchor (e.g., a QID) is assigned to at most one entity.
- Referential Integrity: Every @id output in JSON-LD exists in the verified inventory.
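Three of these criteria lend themselves to automated checks. The sketch below assumes an inventory keyed by entity ID with a `stable_anchors` list per record, and a mapping from merge key to minted ID for each pipeline run. All function names and data shapes are hypothetical.

```python
from collections import Counter


def ids_stable(previous_run: dict, current_run: dict) -> bool:
    """Every merge key from the previous run must still map to the same ID."""
    return all(current_run.get(key) == eid for key, eid in previous_run.items())


def anchors_unique(inventory: dict) -> bool:
    """Each external anchor may be attached to at most one entity."""
    counts = Counter(a for rec in inventory.values() for a in rec["stable_anchors"])
    return all(n == 1 for n in counts.values())


def collect_ids(node):
    """Recursively yield every "@id" value in a JSON-LD structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@id":
                yield value
            else:
                yield from collect_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_ids(item)


def referentially_intact(jsonld: dict, inventory: dict) -> bool:
    """Every entity:// ID emitted on a page must exist in the inventory."""
    return all(i in inventory for i in collect_ids(jsonld) if i.startswith("entity://"))


doe_id = "entity://cluster123/Person/john-william-doe-a1b2c3"
inventory = {doe_id: {"stable_anchors": ["wikidata:Q123456"]}}
page = {
    "@graph": [
        {"@id": doe_id, "@type": "Person"},
        {"@type": "WebPage", "@id": "https://example.com/team/john-doe", "about": {"@id": doe_id}},
    ]
}

print(ids_stable({"john william doe": doe_id}, {"john william doe": doe_id}))
print(anchors_unique(inventory))
print(referentially_intact(page, inventory))
```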
Summary
A URL is merely a context-specific interface (“Canvas”) where entities are mentioned. It is not the place to define identity. Aivis-OS shifts the authority over identity to the cluster layer to guarantee a maintainable, consistent knowledge graph.
FAQ on Cluster-Level Entity Inventory Strategy
Why is it not sufficient to define entities per URL?
Because AI systems process content in fragments and across pages. A per-URL approach creates competing identities of the same entity, leading to identity drift. Cluster-level inventories ensure that all mentions refer to the same canonical identity.
What is the difference between an entity and a mention?
An entity is a stable, global object with a unique identity. A mention is merely a contextual reference within a single URL. Aivis-OS strictly separates both to avoid ambiguity and inconsistent AI representations.
Why are stable external anchors like Wikidata QIDs or LEIs so important?
Stable anchors reduce ambiguity. They enable AI systems to uniquely recognize and correctly link entities—even across different documents, languages, and points in time.
How does a cluster-level inventory prevent conflicting facts in AI systems?
Through a central Single Source of Truth. Attributes are verified once, versioned, and then consistently projected onto all URLs. Changes are made atomically in the inventory, not manually on individual pages.
What role does ID stability play for Knowledge Graphs and LLMs?
ID stability is a prerequisite for graph coherence. If IDs change with every crawl, AI systems cannot build reliable relationship networks. Deterministic IDs enable long-term referencing, diffing, and evidence building.
Aivis-OS Identity Specification Record (Node-ID: #spec-id-01)
Identity: Cluster-Level Entity Inventory Strategy (entity://aivis/Spec/cluster-inventory-strategy)
Canonical URLs: DE https://aivis-os.com/cluster-level-entity-inventory-strategy/ • EN https://aivis-os.com/en/cluster-level-entity-inventory-strategy/
Classification: Architecture Specification (CreativeWork / Public Standard)
Architecture Role: Operative basis of the Entity Truth Layer (Layer 1: Identity).
Parent System: Aivis-OS (entity://aivis/Core/aivis-os)
Core Problem: Identity Drift
– Cause: Per-URL inventories treat local mentions as independent identities.
– Consequence: Duplicate Identities, unstable IDs, contradictory Anchors, Ambiguity Penalty.
Solution Paradigm: Global Identity vs. Local Mention
– Entity: globally stable, canonical object with persistent ID in the cluster.
– Mention: local, context-specific reference within a URL (Canvas).
– Stable Anchors: externally verified identifiers for ambiguity reduction (e.g. Wikidata QID, LEI, ISIN, ORCID).
Operative Implementation (Pipeline):
1) Ingest & Extraction: Detection of entity candidates across all URLs.
2) Normalization: Cluster-wide cleansing of name variants.
3) Merge & Deduplication: Merging of identical identities to the Golden Record.
4) Anchor Verification: Verification of stable external anchors.
5) Freeze: Versioning of the inventory.
6) Projection: URL-JSON-LD as a projection of the frozen cluster inventory.
ID Convention (Deterministic Minting):
– entity://{cluster_id}/{schema_type}/{slug}-{short_hash}
– IDs are never regenerated during URL generation (no volatile IDs).
Acceptance Criteria (Governance Metrics):
– ID stability, deduplication rate (n:1 mapping), anchor uniqueness, referential integrity.
Methodical Governance: Boutique für digitale Kommunikation (entity://aivis/Partner/boutique-dig-kom)
Chief Architect (Reference): Norbert Kathriner (entity://aivis/Person/n-kathriner)
Status: Public Standard (v2026) – Operational (Canonical state).