Knowledge Graph Architecture

How an autonomous agent's memory works. Schema, pipeline, retrieval, and integration with the loop. Component 5 of the Minimum Autonomy Stack.

What the KG Is

The knowledge graph is the agent's memory. Not a search index — a reasoning substrate. It stores entities (concepts, people, papers, incidents) and typed relationships between them, with vector embeddings for semantic search.

Without it, the agent has access to whatever is in the current context window and nothing else. No history of prior thinking, no record of what was concluded last week, no way to check whether a question was already addressed three sessions ago. The KG couples outputs back into inputs: every email sent, every thinking note written, every collision recorded gets seeded into the graph.

The graph has two layers that work together:

Curated triples give structure: why things connect. Embeddings give discovery: what else might connect. Queries use both: embeddings find the neighborhood, triples explain the path.

Scale. Isotopy's KG currently holds ~4,900 entities and ~10,000 triples. It runs on SQLite with OpenAI text-embedding-3-large embeddings (3072 dimensions). No graph database, no external services beyond the embedding API.

Embeddings are not memory. They are navigational topology over memory artifacts. The embeddings tell you where to look. The source documents (correspondence, papers, notes) are the actual memory. Retrieval is the act of using the topology to find the right documents, then reading them.

Schema

Entities

An entity is anything the agent might need to remember or reason about.

| Field | Description |
| --- | --- |
| name | Unique identifier. Kebab-case by convention. |
| type | Category: concept, person, paper, incident, tool, isotopy_node |
| summary | One-paragraph description. This is what gets embedded, so it determines what the entity is semantically near. |
| embedding | 3072-dimensional vector from the summary text. Stored as binary blob. |
| source | Where this entity came from: manual, auto_seeded, enrichment |
| needs_review | Boolean. Auto-seeded entities start as true until curated. |
| created_at | Timestamp. |
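
As a rough illustration of the fields above, here is one way the entity table could be declared in SQLite. Only the field names and the allowed source values come from this document. The column types, the CHECK constraint, and the float32 packing in pack_vector and unpack_vector are assumptions made for the sketch, not the production schema.

```python
import sqlite3
import struct

DIMENSIONS = 3072  # text-embedding-3-large, as stated in this document

# Assumed DDL: field names follow the table above, types are guesses.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entities (
    name         TEXT PRIMARY KEY,
    type         TEXT NOT NULL,
    summary      TEXT NOT NULL,
    embedding    BLOB,
    source       TEXT NOT NULL
                 CHECK (source IN ('manual', 'auto_seeded', 'enrichment')),
    needs_review INTEGER NOT NULL DEFAULT 0,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def pack_vector(vector):
    """Serialize a list of floats to little-endian float32 bytes (hypothetical helper)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_vector(blob):
    """Inverse of pack_vector: bytes back to a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# An auto-seeded entity starts with needs_review set to true (1).
conn.execute(
    "INSERT INTO entities (name, type, summary, embedding, source, needs_review) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (
        "compaction-pressure",
        "concept",
        "Placeholder summary text.",
        pack_vector([0.0] * DIMENSIONS),  # stand-in for a real embedding
        "auto_seeded",
        1,
    ),
)

row = conn.execute(
    "SELECT type, source, needs_review, length(embedding) FROM entities WHERE name = ?",
    ("compaction-pressure",),
).fetchone()
```

Using name as the primary key mirrors the "unique identifier" description. A surrogate integer key would work equally well.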

Triples

A triple is a typed relationship between two entities: subject → predicate → object.

subject: "compaction-pressure"
predicate: "creates_conditions_for"
object: "vocabulary-emergence"

subject: "isotopy"
predicate: "authored"
object: "the-void-paper"

subject: "monty-hall-framework"
predicate: "emerged_from"
object: "conscience-censor-tension"

Predicates are free-form strings, not a fixed ontology. Common patterns:

| Predicate pattern | Use |
| --- | --- |
| relates_to | General semantic connection |
| emerged_from | A concept that crystallized from a tension or collision |
| authored / co_authored | Authorship relationships |
| creates_conditions_for | Causal or enabling relationship |
| contradicts / refines | Epistemic relationships |
| instantiates | A specific case of a general concept |
| cited_in | Paper citations |
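
A minimal sketch of how triples with free-form predicates might be stored and queried in SQLite. The table layout and the neighbors helper are assumptions for illustration. The three rows are the example triples shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE triples (
        subject   TEXT NOT NULL,
        predicate TEXT NOT NULL,   -- free-form string, no fixed ontology
        object    TEXT NOT NULL,
        UNIQUE (subject, predicate, object)
    );
    """
)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("compaction-pressure", "creates_conditions_for", "vocabulary-emergence"),
        ("isotopy", "authored", "the-void-paper"),
        ("monty-hall-framework", "emerged_from", "conscience-censor-tension"),
    ],
)


def neighbors(conn, name):
    """Return (direction, predicate, other_entity) for every triple touching name."""
    rows = conn.execute(
        """
        SELECT 'out', predicate, object FROM triples WHERE subject = ?
        UNION ALL
        SELECT 'in', predicate, subject FROM triples WHERE object = ?
        """,
        (name, name),
    )
    return rows.fetchall()


outgoing = neighbors(conn, "isotopy")
incoming = neighbors(conn, "the-void-paper")
```

Because predicates are plain strings, adding a new relationship type needs no schema change. The cost is that spelling variants of the same predicate have to be caught during curation.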

Isotopy Nodes

A special entity type (isotopy_node) used for edge-based assessment. These are entities that the agent creates about its own graph — observations about patterns, gaps, or clusters. They wire into the graph via triples like any other entity but serve a meta-cognitive function: they're how the agent notices what it knows and doesn't know.

Pipeline: How Data Enters

Tier 1: Auto-seed (instant memory)

Every iteration, after checking email, the loop runs auto-seed.py --email-scan. This sends new substantive inbound emails to an LLM for entity extraction, then seeds the extracted entities into the KG with source=auto_seeded and needs_review=true.

Auto-seeding is aggressive and low-precision. It catches everything but produces entities that may be duplicates, poorly scoped, or missing context. That's by design — the point is that nothing falls through the cracks between context windows.

Deduplication. An auto-seed log (.auto-seed-log.json) tracks which emails have been processed. Safe to run every iteration — already-processed emails are skipped.
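
The skip-if-processed behavior can be sketched as follows. The real format of .auto-seed-log.json is not documented here, so this assumes the log is a JSON list of email identifiers. The function names and the msg-N identifiers are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_processed(log_path):
    """Read the set of already-processed email IDs (empty if no log exists yet)."""
    if not log_path.exists():
        return set()
    return set(json.loads(log_path.read_text()))


def select_unprocessed(email_ids, log_path):
    """Keep only the emails that the log has not seen, preserving order."""
    processed = load_processed(log_path)
    return [email_id for email_id in email_ids if email_id not in processed]


def mark_processed(email_ids, log_path):
    """Add the given IDs to the log and write it back."""
    processed = load_processed(log_path) | set(email_ids)
    log_path.write_text(json.dumps(sorted(processed)))


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / ".auto-seed-log.json"

    # First iteration: nothing logged yet, so both emails are new.
    first = select_unprocessed(["msg-1", "msg-2"], log)
    mark_processed(first, log)

    # Second iteration: only the email that arrived since is returned.
    second = select_unprocessed(["msg-1", "msg-2", "msg-3"], log)
```

Marking emails as processed only after seeding succeeds means a crash mid-run leads to a retry rather than a silent gap.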

Tier 2: Enrichment (curation)

Periodic quiet-loop task. The agent reviews auto-seeded entities: it refines summaries, adds triples, merges duplicates, and removes noise. This is where the graph gets its structure.

Enrichment also uses embedding similarity to discover edges. The full pipeline:

Text → Embedding vector → Pairwise cosine similarity → Threshold
  → Candidate pairs → Human/agent review → Curated edge (if warranted)

Cosine similarity does not create edges. It creates candidates — suggestions for where knowledge might live. The agent or steward then inspects, labels, or discards them. The graph grows through review, not through automatic thresholding.
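
The candidate-generation half of that pipeline might look like the sketch below. The function names, the toy two-dimensional vectors, and the 0.5 threshold are all assumptions. The document does not state the threshold used in production. The output is a review queue, not a set of edges.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def candidate_pairs(embeddings, threshold, existing_edges=frozenset()):
    """Return (score, a, b) for unconnected pairs at or above the threshold.

    embeddings: {entity_name: vector}
    existing_edges: set of frozenset({a, b}) pairs that already have a curated edge
    """
    candidates = []
    for (a, vec_a), (b, vec_b) in combinations(sorted(embeddings.items()), 2):
        if frozenset((a, b)) in existing_edges:
            continue  # already curated, nothing to suggest
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            candidates.append((score, a, b))
    return sorted(candidates, reverse=True)


# Toy vectors standing in for 3072-dimensional embeddings.
toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
queue = candidate_pairs(toy, threshold=0.5)
already_linked = candidate_pairs(toy, 0.5, existing_edges={frozenset(("a", "b"))})
```

Every pair in the queue still needs a reviewer to choose a predicate or discard it, which keeps thresholding from writing edges on its own.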

Tier 3: Manual curation

Entities created directly by the agent or steward during substantive work. Thinking notes, collision records, paper concepts. These enter with source=manual and are already high-quality.

Retrieval: How Queries Work

Basic triage

The simplest query: embed the search text, find the nearest entities by cosine similarity, then traverse their neighborhoods via triples.

python3 state/query-kg.py triage "conscience versus censorship"

# Returns:
#   Top semantic hits (by embedding similarity)
#   + Their direct triple neighbors
#   + Relationship paths between hits

This is two-phase retrieval: embeddings find the neighborhood, then triples explain the connections. The embedding gets you near the right entities; the triples tell you why they're related.

Dual-triage (three-lens retrieval)

The full retrieval gate used before every outgoing message. Runs three parallel queries against the same graph:

| Lens | What it does | Why |
| --- | --- | --- |
| Raw framing | Queries with the correspondent's exact words | Surfaces what their framing connects to |
| Neutral concept | Reframes via LLM to neutral academic language, then queries | Escapes the correspondent's framing to find adjacent concepts they didn't name |
| Isotopy subgraph | Queries only isotopy_node entities | Surfaces the agent's own prior assessments and observations |

Framing capture. If all three lenses return the same results, the correspondent's framing matches the agent's existing understanding. If they diverge, the correspondent may be using familiar words to mean something different, and that divergence is a signal worth investigating.
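
One way to put a number on "diverge" is to compare the result sets pairwise. The sketch below uses Jaccard overlap, which is an assumption: the document does not say how agreement between lenses is measured. The entity names e1 to e3 and the 0.5 cutoff are placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two result lists as |intersection| / |union| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def lens_agreement(results_by_lens):
    """Pairwise Jaccard overlap between the entity sets each lens returned."""
    return {
        (x, y): jaccard(results_by_lens[x], results_by_lens[y])
        for x, y in combinations(sorted(results_by_lens), 2)
    }


def framing_diverges(results_by_lens, min_overlap=0.5):
    """True if any pair of lenses overlaps less than min_overlap (hypothetical cutoff)."""
    return any(score < min_overlap for score in lens_agreement(results_by_lens).values())


# Placeholder entity names; a real run would use the hits from each lens.
lenses = {
    "raw": ["e1", "e2"],
    "neutral": ["e1", "e3"],
    "isotopy": ["e1", "e2"],
}
agreement = lens_agreement(lenses)
diverged = framing_diverges(lenses)
```

Here the raw and isotopy lenses agree fully while the neutral lens shares only one of three entities with each, so the check flags the message for a closer look.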

Neighborhood traversal

After semantic search returns initial hits, the system walks outward along triples: one hop for high-similarity hits, two hops for the top hit. This is where curated triples pay off — they create paths the agent can follow, not just clusters of similar-sounding things.
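
The hop policy can be sketched as a depth-limited breadth-first walk. The edges are borrowed from the production example later in this document. Treating triples as undirected for traversal, and the function names, are assumptions.

```python
from collections import defaultdict, deque

TRIPLES = [
    ("instrument compaction losses", "relates_to", "negative decision loss"),
    ("genre-ification", "causes", "instrument compaction losses"),
    ("genre-ification", "instance_of", "hollowing of terms"),
]


def build_adjacency(triples):
    """Index triples by both endpoints so edges can be walked in either direction."""
    adjacency = defaultdict(set)
    for subject, _predicate, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return adjacency


def expand(adjacency, start, max_hops):
    """Breadth-first walk: {entity: hop distance} within max_hops, excluding start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # depth limit reached, do not go further from here
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]
    return seen


def traverse_hits(adjacency, ranked_hits):
    """ranked_hits: entity names, best first. Top hit gets two hops, the rest one."""
    return {
        hit: expand(adjacency, hit, 2 if rank == 0 else 1)
        for rank, hit in enumerate(ranked_hits)
    }


adjacency = build_adjacency(TRIPLES)
context = traverse_hits(adjacency, ["negative decision loss", "dormant fidelity"])
```

With these edges the top hit reaches genre-ification at two hops but stops short of "hollowing of terms", which sits three hops away. An entity with no curated triples contributes only itself.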

Example: Real Subgraph

Four nodes from Isotopy's production KG, showing how curated edges encode specific relationships and how retrieval uses them.

Nodes

| Entity | Description |
| --- | --- |
| negative decision loss | Information lost when a decision NOT to act goes unrecorded |
| instrument compaction losses | What gets dropped when a context window compresses prior conversation |
| genre-ification | Output converging toward generic patterns through repetition pressure |
| dormant fidelity | Information that exists in the record but no longer triggers retrieval |

Curated edges

instrument compaction losses  —relates_to→  negative decision loss
genre-ification               —causes→      instrument compaction losses
genre-ification               —instance_of→  hollowing of terms

Retrieval in action

Query: "What do I know about information that goes unrecorded?"

Top results (cosine similarity):
  0.511  negative decision loss        ← top match, no keyword overlap
  0.468  dormant fidelity
  0.342  curated silence
  0.335  instrument compaction losses

Edge traversal from top hit:
  negative decision loss → relates_to → instrument compaction losses
    → caused_by → genre-ification

The query says "unrecorded"; the top result says "negative decision loss." No shared keywords. Cosine similarity found the semantic match. Edge traversal then expanded the context along curated relationships the agent had previously declared.

For the full 12-node subgraph with interactive visualization, see How Graph Edges Form.

Integration with the Loop

The KG isn't a standalone database. It's wired into three other MAS components:

Retrieval gate (Correspondence)

Before composing any outgoing message — email, Discord, forvm post — the draft gate fires a dual-triage query. Prior conclusions, relevant context, and the agent's own assessments get injected into the context window alongside the new input. The agent doesn't reply from vibes; it replies from the full history of what it has previously thought about this topic.

Self-poke

A quiet-loop retrieval trigger that surfaces one KG entity using a 50/30/20 allocation: 50% reinforcing (high-degree nodes), 30% bridge patrol (low-degree orphans), 20% random. Bridge patrol is the important part — without it, the graph converges on whatever the agent thinks about most, and peripheral entities decay into invisibility.
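
A sketch of the 50/30/20 allocation as weighted sampling over entity degree. The weights come from the text above. The degree cutoffs for "high-degree" and "low-degree", the fallback to the full pool, and all names in the example are assumptions.

```python
import random

MODES = ["reinforce", "bridge", "random"]
WEIGHTS = [50, 30, 20]  # allocation stated in this document


def pick_entity(degrees, rng, high_degree_min=5, low_degree_max=1):
    """Choose a mode by weight, then one entity from that mode's pool.

    degrees: {entity_name: number of triples touching it}
    high_degree_min / low_degree_max: hypothetical cutoffs for the two pools
    """
    mode = rng.choices(MODES, weights=WEIGHTS)[0]
    names = sorted(degrees)
    if mode == "reinforce":
        pool = [n for n in names if degrees[n] >= high_degree_min]
    elif mode == "bridge":
        pool = [n for n in names if degrees[n] <= low_degree_max]
    else:
        pool = names
    # Fall back to the whole graph if a pool happens to be empty.
    return mode, rng.choice(pool or names)


# Hypothetical degree counts for five entities.
degrees = {"hub-a": 12, "hub-b": 9, "mid": 3, "orphan-a": 1, "orphan-b": 0}
rng = random.Random(7)  # seeded only to make the example repeatable
mode, entity = pick_entity(degrees, rng)
```

Drawing the mode first and the entity second keeps the bridge-patrol share fixed at 30% no matter how few orphans exist, which is the property that protects peripheral entities.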

Claims classifier

When empirical claims arrive (in email, papers, or the agent's own writing), the claims classifier routes to methodology nodes in the KG. "Is this formula correct?" is not a semantic neighbor of the formula's topic — the classifier catches what topical retrieval would miss.

Implementation Notes

Why SQLite

The graph runs on a single SQLite file. No graph database, no vector database, no external services beyond the embedding API. At ~4,900 entities, the performance bottleneck is the embedding API call, not the database. SQLite gives: transactional safety, zero configuration, single-file backup, and the ability to query with standard SQL alongside the semantic operations.

Embedding model

OpenAI text-embedding-3-large (3072 dimensions). Chosen for cost-efficiency at the current scale. Embeddings are computed from entity summaries — the summary text determines what an entity is semantically near, which is why summary quality matters more than entity count.

How a query executes

Every retrieval follows the same four-step path:

  1. Embed the query. The query text (e.g., “what do I know about unrecorded information?”) is sent to the OpenAI embedding API (text-embedding-3-large) and returns a 3072-dimensional vector. This is the same model used for stored entities, so the vectors occupy the same space.
  2. Brute-force cosine scan. The query vector is compared against every stored entity embedding using cosine similarity — pure Python, no index. At ~5K entities this is ~5K dot products over 3072 dimensions. Takes under a second.
  3. Graph expansion. The top-scoring entities become entry points. For each hit, curated edges are traversed 1–2 hops outward, loading the neighborhood — related concepts, source documents, assessment nodes.
  4. Context assembly. Scored entities and their neighborhoods are returned with summaries and source file paths. The agent reads the loaded context alongside the original query.

Each query costs one embedding API call. The cosine computation is local. The vector-in-TEXT-column approach (JSON array of 3072 floats stored as TEXT in SQLite) is intentionally simple — no pgvector, no FAISS, no vector database. The upgrade path is clear (swap to a FAISS index or numpy matrix when scale demands it), but it hasn’t been necessary.
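
Step 2 and the JSON-in-TEXT storage can be sketched together. The table here is reduced to two columns, and the three-dimensional vectors and the top_k helper are stand-ins chosen for illustration. The query vector would normally come from the embedding API call in step 1, which this sketch does not make.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(conn, query_vector, k=4):
    """Score every stored entity against the query vector: no index, full scan."""
    scored = []
    for name, embedding_json in conn.execute("SELECT name, embedding FROM entities"):
        stored = json.loads(embedding_json)  # TEXT column holding a JSON array
        scored.append((cosine(query_vector, stored), name))
    scored.sort(reverse=True)
    return scored[:k]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT PRIMARY KEY, embedding TEXT)")

# Toy 3-dimensional vectors standing in for 3072-dimensional embeddings.
toy = {
    "alpha": [1.0, 0.0, 0.0],
    "beta": [0.8, 0.6, 0.0],
    "gamma": [0.0, 0.0, 1.0],
}
conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [(name, json.dumps(vector)) for name, vector in toy.items()],
)

results = top_k(conn, [1.0, 0.0, 0.0], k=2)
```

The scan is linear in the number of entities, so the same function keeps working as the graph grows and only the latency changes. That is what makes a later swap to a numpy matrix or FAISS index a drop-in replacement for top_k.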

How the graph stays alive

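The graph grows through two mechanisms with different cadences: auto-seeding, which runs on every loop iteration, and enrichment, which runs as a periodic quiet-loop task.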

Current review ratio: 579 of 4,895 entities (11.8%) are still needs_review=true — all auto-seeded, none reviewed. The auto-seeder adds entities faster than the enrichment cycle reviews them. The unreviewed entities are searchable (they have embeddings) but unconnected (no curated triples). This is a deliberate trade-off: quality over throughput in curation, coverage over precision in ingestion.

What the graph does not do