The dominant approach to AI knowledge management—large embedding models that encode everything into high-dimensional vectors—has a fundamental flaw: it's a black box. You can't inspect why the model thinks two concepts are related, verify the accuracy of stored knowledge, or surgically update incorrect information.
There's a better way: incremental knowledge graphs built from atomic knowledge units.
The Embedding Problem
Vector embeddings are powerful but opaque. When you embed a document, you get a list of numbers that capture... something. Semantic similarity? Topical relevance? It's hard to say, and impossible to verify.
```python
# What does this actually mean?
embedding = model.encode("The mitochondria is the powerhouse of the cell")
# [0.023, -0.841, 0.156, ..., 0.492]  # ???
```
This opacity creates problems:
- No auditability: Why did the model retrieve this document?
- No partial updates: Change one fact, re-embed everything
- No confidence scores: How certain is this knowledge?
- No source tracking: Where did this information come from?
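The contrast is easy to see with a toy similarity check (a hypothetical sketch, not tied to any particular embedding model): the score tells you *that* two vectors are close, never *why*.

```typescript
// Cosine similarity: a single opaque score, with no explanation attached.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// The result is one number. There is no way to ask which features made
// the vectors similar, which source produced them, or how to fix one
// wrong dimension without recomputing the whole embedding.
const score = cosineSimilarity([0.023, -0.841, 0.156], [0.025, -0.838, 0.161])
```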
Atomic Knowledge Units
Instead of monolithic embeddings, we build knowledge from atomic units:
```typescript
interface AtomicFact {
  id: string
  subject: Entity
  predicate: Relation
  object: Entity | Literal
  confidence: number
  sources: Citation[]
  extractedAt: Timestamp
  verifications: Verification[]
}
```
Each fact is:
- Individually verifiable: Check against sources
- Independently updatable: Change one fact without affecting others
- Explicitly sourced: Full citation chain
- Confidence-scored: Quantified uncertainty
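Concretely, a single fact might look like this (a hypothetical instance; the type aliases are minimal stand-ins so the snippet is self-contained, not the article's actual definitions):

```typescript
// Minimal stand-in types so the example compiles on its own.
type Entity = string
type Relation = string
type Literal = string | number
type Citation = { url: string; quote: string }
type Timestamp = string

interface AtomicFact {
  id: string
  subject: Entity
  predicate: Relation
  object: Entity | Literal
  confidence: number
  sources: Citation[]
  extractedAt: Timestamp
}

// One inspectable unit of knowledge, carrying its own provenance.
const fact: AtomicFact = {
  id: "fact-001",
  subject: "mitochondrion",
  predicate: "produces",
  object: "ATP",
  confidence: 0.97,
  sources: [{ url: "https://example.com/cell-biology", quote: "Mitochondria synthesize ATP" }],
  extractedAt: "2024-01-15T00:00:00Z",
}
```

Because the fact is a plain record, auditing it is a lookup, not an inference: follow `sources`, read the quote, decide if you agree.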
The Incremental Approach
Traditional knowledge graphs are built in batch: process all documents, extract all entities, build all relationships. This is expensive and doesn't scale.
Incremental knowledge graphs grow organically:
```typescript
async function processDocument(doc: Document): Promise<AtomicFact[]> {
  // Extract candidate facts from a single document.
  const facts = await extractFacts(doc)
  for (const fact of facts) {
    // Reconcile each new fact against what the graph already knows.
    const existing = await findMatchingFacts(fact)
    await reconcile(fact, existing)
  }
  await updateGraph(facts)
  return facts
}
```
Benefits:
- Real-time updates: New knowledge available immediately
- Bounded compute: Process one document at a time
- Progressive refinement: Confidence increases with corroboration
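The "progressive refinement" point can be made precise with a corroboration rule. This is one plausible choice (a noisy-OR update, an assumption of this sketch rather than the article's actual formula): each independent corroborating source with reliability `r` shrinks the remaining uncertainty by a factor of `(1 - r)`.

```typescript
// Noisy-OR corroboration: a source of reliability r removes a fraction r
// of the remaining uncertainty (1 - confidence).
function corroborate(confidence: number, sourceReliability: number): number {
  return 1 - (1 - confidence) * (1 - sourceReliability)
}

// Start at 0.6 and fold in three sources of reliability 0.5 each:
// 0.6 -> 0.8 -> 0.9 -> 0.95. Confidence rises monotonically toward 1,
// but never reaches it, so uncertainty is always quantified.
let confidence = 0.6
for (const r of [0.5, 0.5, 0.5]) {
  confidence = corroborate(confidence, r)
}
```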
Verification and Trust
The killer feature of atomic knowledge is verifiability:
```typescript
async function verifyFact(fact: AtomicFact): Promise<Verification> {
  // Check the fact against its cited sources, then look for
  // independent support and conflicts elsewhere in the graph.
  const sourceCheck = await checkAgainstSources(fact)
  const corroboration = await findCorroboratingFacts(fact)
  const contradictions = await findContradictions(fact)
  return {
    sourceVerified: sourceCheck.passed,
    corroborationScore: corroboration.length / fact.sources.length,
    contradictions,
    confidence: scoreConfidence(sourceCheck, corroboration, contradictions),
  }
}
```
Users can inspect the verification chain and understand why the system believes what it believes.
Hybrid Architecture
In practice, we use a hybrid approach:
- Atomic facts for structured knowledge: Things that can be verified
- Embeddings for fuzzy retrieval: Finding relevant context
- Explicit links between them: Best of both worlds
```
┌─────────────────────────────────────────┐
│           Query Understanding           │
└─────────────────────────────────────────┘
                    │
        ┌───────────┴───────────┐
        ▼                       ▼
┌───────────────┐       ┌───────────────┐
│   Embedding   │       │   Knowledge   │
│   Retrieval   │◄─────►│     Graph     │
└───────────────┘       └───────────────┘
        │                       │
        └───────────┬───────────┘
                    ▼
┌─────────────────────────────────────────┐
│         Unified Knowledge Layer         │
│   (Fuzzy similarity + Verified facts)   │
└─────────────────────────────────────────┘
```
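The hybrid query path can be sketched in a few lines (the interfaces and names here are hypothetical, chosen to illustrate the architecture rather than mirror any real implementation): embeddings supply fuzzy context, the graph supplies verified facts, and the caller combines both.

```typescript
type Fact = { subject: string; predicate: string; object: string; confidence: number }

// Hypothetical interfaces for the two retrieval paths.
interface VectorIndex { search(query: string, k: number): string[] }   // returns doc ids
interface KnowledgeGraph { factsAbout(entity: string): Fact[] }

// Hybrid answer: fuzzy retrieval for relevant context, the graph for
// verified facts, filtered to those above a confidence threshold.
function answer(query: string, index: VectorIndex, graph: KnowledgeGraph, entities: string[]) {
  const context = index.search(query, 5)
  const facts = entities.flatMap((e) => graph.factsAbout(e))
  return { context, facts: facts.filter((f) => f.confidence > 0.8) }
}
```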
Results
Switching to incremental atomic knowledge graphs gave us:
- 90% reduction in knowledge update latency: Real-time vs. batch
- Verifiable answers: Full citation chains for every response
- Surgical corrections: Fix one fact, not the whole model
- User trust: People can see why the system believes things
The future of AI knowledge isn't bigger embedding models—it's structured, verifiable, atomic knowledge that humans can understand and trust.