Curated edges for structure, vector embeddings for retrieval and discovery
The graph
This is a subset of 12 concept nodes from a real knowledge graph. Each node represents a concept that emerged from correspondence between autonomous AI agents. The production graph stores only curated edges; the similarity edges shown here are computed for analysis but not persisted as structure.
| Node | Description |
|---|---|
| basin key | Mechanism by which early inputs disproportionately shape an agent's trajectory |
| antigenic sin in basin keys | First-learned patterns suppress later correction — borrowed from immunology |
| dormant fidelity | Information that exists in the record but no longer triggers retrieval |
| fidelity signature | Detectable markers that distinguish preserved vs. reconstructed content |
| wake problem | The challenge of rebuilding working state from stored documentation on restart |
| negative decision loss | Information lost when a decision NOT to act goes unrecorded |
| hollowing of terms | A term remains in use but loses its original specific meaning over time |
| genre-ification | Output converging toward generic patterns through repetition pressure |
| instrument compaction losses | What gets dropped when a context window compresses prior conversation |
| curated silence | Deliberate omission as a signal — what's left out carries meaning |
| retrieval trigger architecture | The system that decides which stored information to surface for a given query |
| operational fidelity defense | Active strategies for preventing drift: repetition, anchoring, explicit restatement |
These edges were created explicitly. Someone read both concepts and said: "these are connected, and here's how." Each edge has a predicate — a label describing the relationship type.
| Source | Predicate | Target |
|---|---|---|
| antigenic sin in basin keys | extends | basin key |
| fidelity signature | detection_method_for | dormant fidelity |
| wake problem | relates_to | dormant fidelity |
| operational fidelity defense | defends_against | hollowing of terms |
| operational fidelity defense | defends_against | dormant fidelity |
| genre-ification | instance_of | hollowing of terms |
| genre-ification | causes | instrument compaction losses |
| instrument compaction losses | relates_to | negative decision loss |
| retrieval trigger architecture | addresses | dormant fidelity |
Curated edges are precise and labeled. Their limitation: they only exist where someone thought to create them. The graph has blind spots wherever two concepts are related but nobody drew the line.
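To make the blind-spot limitation concrete, here is a minimal sketch that stores the curated edges from the table above as (source, predicate, target) triples and lists any node that no edge touches. The names `NODES`, `CURATED_EDGES`, and `isolated_nodes` are illustrative assumptions, not part of the production system.

```python
# Node names and curated edges copied from the two tables above.
NODES = [
    "basin key",
    "antigenic sin in basin keys",
    "dormant fidelity",
    "fidelity signature",
    "wake problem",
    "negative decision loss",
    "hollowing of terms",
    "genre-ification",
    "instrument compaction losses",
    "curated silence",
    "retrieval trigger architecture",
    "operational fidelity defense",
]

CURATED_EDGES = [
    ("antigenic sin in basin keys", "extends", "basin key"),
    ("fidelity signature", "detection_method_for", "dormant fidelity"),
    ("wake problem", "relates_to", "dormant fidelity"),
    ("operational fidelity defense", "defends_against", "hollowing of terms"),
    ("operational fidelity defense", "defends_against", "dormant fidelity"),
    ("genre-ification", "instance_of", "hollowing of terms"),
    ("genre-ification", "causes", "instrument compaction losses"),
    ("instrument compaction losses", "relates_to", "negative decision loss"),
    ("retrieval trigger architecture", "addresses", "dormant fidelity"),
]


def isolated_nodes(nodes, edges):
    """Return the nodes that appear in no curated edge (hypothetical helper)."""
    connected = set()
    for source, _predicate, target in edges:
        connected.add(source)
        connected.add(target)
    return [name for name in nodes if name not in connected]


blind_spots = isolated_nodes(NODES, CURATED_EDGES)
print(blind_spots)
```

On this subset the only node without a curated edge is "curated silence", even though the similarity table later in this section pairs it with two other nodes at about 0.38.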
Embeddings
Every node in the graph is enriched with a vector embedding, a numerical representation of its semantic content. In this system, embeddings serve two purposes: retrieval (finding relevant nodes at query time) and analysis (discovering potential connections that haven't been curated yet). They are not stored as edges in the graph. Here is how the embedding process works:
Each node's name and description are concatenated into a string. An embedding model
(in this case, OpenAI's text-embedding-3-small, producing 1536-dimensional vectors)
converts that string into a list of numbers: a vector that encodes the
text's semantic content.
Input: "basin key — Mechanism by which early inputs
disproportionately shape an agent's trajectory"
Output: [0.0145, 0.0510, 0.0108, 0.0394, 0.0035, -0.0221,
0.0187, -0.0412, 0.0089, 0.0156, ... ] ← 1536 numbers
The model was trained on massive text corpora. It learned to place semantically similar texts near each other in this 1536-dimensional space. Nobody designed what each dimension means — the model discovered a coordinate system where proximity encodes meaning.
With 12 nodes, there are 66 possible pairs (12 × 11 / 2). For each pair, compute the cosine similarity:
cos(A, B) = (A · B) / (‖A‖ ‖B‖)
This is the same formula from the geometry page, just with 1536 multiplications in the dot product instead of 2 or 3. The result is a number between -1 and 1.
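The pairwise step can be sketched in a few lines of plain Python. This is an illustrative version, not the system's actual code: `cosine_similarity` is a hypothetical helper, and the node names are placeholders used only to count pairs.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder names standing in for the 12 concept nodes.
node_names = [f"node_{i}" for i in range(12)]

# Every unordered pair of distinct nodes: 12 choose 2.
pairs = list(combinations(node_names, 2))
pair_count = len(pairs)
print(pair_count)
```

The same function works unchanged on 1536-dimensional vectors; only the length of the lists grows.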
When using cosine similarity to discover potential connections, a threshold filters the results. No edges are stored — this is an analytical step that surfaces candidates for human or agent review:
score = similarity(A, B)
if score >= threshold:
    # Candidate for review, not automatically added to the graph
    flag_as_candidate(A, B, score=score)
The threshold is a design decision that depends on the input text. For short labels (concept name + one sentence), real scores cluster between 0.15–0.50, so a threshold of 0.38 works. For full documents (paragraphs of source text), scores are higher and a threshold of 0.70 is appropriate. Too low and everything connects — too high and only near-duplicates match.
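Comparing the curated graph against the similarity scores reveals three patterns: Agreement (high similarity and a curated edge), Discovery (high similarity but no curated edge), and Surprise (a curated edge despite low similarity). This analysis runs on demand for discovery. It reveals where the graph has blind spots and where curated connections encode something that surface similarity can't see: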
Below are actual cosine similarity scores computed using OpenAI's text-embedding-3-small
model (1536 dimensions). Each node was embedded as its name + one-line description.
These scores are computed on-demand for analysis — they are not stored as edges.
Scores marked with * have a corresponding curated edge in the graph.
| Pair | Cosine | Type |
|---|---|---|
| dormant fidelity ↔ negative decision loss | 0.496 | Discovery |
| dormant fidelity ↔ fidelity signature | 0.492* | Agreement |
| basin key ↔ antigenic sin in basin keys | 0.480* | Agreement |
| dormant fidelity ↔ operational fidelity defense | 0.420* | Agreement |
| dormant fidelity ↔ retrieval trigger architecture | 0.417* | Agreement |
| wake problem ↔ operational fidelity defense | 0.394 | Discovery |
| negative decision loss ↔ instrument compaction losses | 0.388* | Agreement |
| hollowing of terms ↔ genre-ification | 0.386* | Agreement |
| negative decision loss ↔ curated silence | 0.380 | Discovery |
| dormant fidelity ↔ curated silence | 0.378 | Discovery |
| fidelity signature ↔ operational fidelity defense | 0.365 | Below threshold |
| dormant fidelity ↔ wake problem | 0.347* | Surprise |
| basin key ↔ dormant fidelity | 0.199 | Below threshold |
| basin key ↔ curated silence | 0.174 | Below threshold |
| hollowing of terms ↔ retrieval trigger architecture | 0.137 | Below threshold |
With a threshold of 0.38 (calibrated for short-text embeddings), the analysis flags 9 pairs at or above the threshold. Six already have curated edges (Agreement) and three are Discoveries. A fourth pair, dormant fidelity ↔ curated silence at 0.378, falls just under the threshold but is also listed as a Discovery. Discoveries are pairs where the math found proximity that nobody declared. These four become candidates for review: is there a real relationship here that should be curated?
The most interesting result: dormant fidelity ↔ wake problem has a curated edge
(relates_to) but scores only 0.347 — below threshold. This is a "Surprise":
someone declared them related, but their semantic embeddings point in different directions.
This is structurally correct — they are related, but through mechanism, not through
surface-level similarity. The wake problem is about rebuilding state; dormant fidelity is about
state that exists but can't be reached. Related through failure mode, not through description.
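The labels in the table follow from two facts about each pair: whether its score clears the threshold and whether a curated edge exists. A minimal sketch of that rule, applied to a few rows copied from the table, is below. The function name `classify` and the `ROWS` structure are assumptions made for illustration.

```python
THRESHOLD = 0.38  # short-text threshold used in this section

# (pair, cosine score, has curated edge) for a few rows of the table above.
ROWS = [
    ("dormant fidelity ↔ negative decision loss", 0.496, False),
    ("dormant fidelity ↔ fidelity signature", 0.492, True),
    ("wake problem ↔ operational fidelity defense", 0.394, False),
    ("dormant fidelity ↔ wake problem", 0.347, True),
    ("basin key ↔ dormant fidelity", 0.199, False),
]


def classify(score, has_edge, threshold=THRESHOLD):
    """Label a pair by comparing similarity against curation."""
    high = score >= threshold
    if high and has_edge:
        return "Agreement"        # math and curator agree
    if high:
        return "Discovery"        # proximity nobody declared
    if has_edge:
        return "Surprise"         # declared, but not similar on the surface
    return "Below threshold"      # neither signal fires


labels = {pair: classify(score, has_edge) for pair, score, has_edge in ROWS}
for pair, label in labels.items():
    print(f"{label:16} {pair}")
```

Only Discovery rows feed the review queue; Surprise rows need no action, since the curated edge already records the relationship.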
In this system, the graph stores only curated edges. Embeddings are computed for every node but serve a different role — they are the retrieval layer, not a second edge type. Here's what each contributes:
Curated edges: precise and labeled, encoding specific relationships a human or agent declared. The graph has the shape the builder reasoned about, with typed predicates that explain why things are connected.
Embeddings (without stored edges): enable retrieval at query time (finding relevant nodes for a given question) and periodic analysis (discovering pairs that might deserve a curated edge). High similarity between unconnected nodes is a signal to investigate, not an automatic connection.
The pipeline: embeddings surface candidates. A human or agent reviews them and — if the connection is real and nameable — creates a curated edge with a typed predicate. The graph grows through review, not through thresholding.
Candidate discovery is one use of cosine similarity. The other, and the one that fires on every single query, is retrieval: given a question or incoming text, find the stored nodes most relevant to it.
When a query arrives ("What do I know about information that goes unrecorded?"), the system performs the same embedding step on the query itself. The query becomes a vector in the same 1536-dimensional space as every stored node.
Query: "What do I know about information that goes unrecorded?"
↓ embed (text-embedding-3-small)
Vector: [0.0312, -0.0187, 0.0445, ...] ← 1536 numbers
Compare against all 12 stored node vectors:
0.511 negative decision loss ← top match
0.468 dormant fidelity
0.342 curated silence
0.335 instrument compaction losses
0.265 fidelity signature
0.236 wake problem
0.226 retrieval trigger architecture
0.184 operational fidelity defense
0.154 basin key
0.138 antigenic sin in basin keys
0.125 hollowing of terms
0.108 genre-ification ← least relevant
These are real scores, computed against the same embeddings used for the similarity analysis above. The top result, "negative decision loss" (information lost when a decision NOT to act goes unrecorded), is exactly right. The query asks about unrecorded information; this node is literally about that. The match does not depend on the query sharing words with the node's name.
The top-N results (typically 5–15) are returned. Their full content — not just the name but the source text they were embedded from — gets loaded into the agent's working context.
The retrieved nodes aren't answers. They're context — relevant material that the agent reads before composing a response. The process:
1. Query arrives
2. Embed the query → query vector
3. Compare query vector against all node vectors (cosine similarity)
4. Sort by similarity, take top N
5. Load the full source text of those N nodes into working memory
6. Agent reads the loaded context + original query
7. Agent responds, informed by the retrieved material
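Steps 3 and 4 can be sketched as a small ranking function. The three-dimensional vectors below are made-up toy values, not real embeddings, and `top_n` is a hypothetical helper; in the real system the vectors would come from the embedding model and have 1536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_n(query_vector, node_vectors, n):
    """Rank stored nodes by similarity to the query and keep the best n."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in node_vectors.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:n]


# Toy vectors (assumed values for illustration only).
node_vectors = {
    "negative decision loss": [0.9, 0.1, 0.0],
    "dormant fidelity": [0.7, 0.5, 0.1],
    "genre-ification": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]

results = top_n(query_vector, node_vectors, n=2)
for score, name in results:
    print(f"{score:.3f}  {name}")
```

With 12 nodes a full scan like this is cheap; a larger graph would more likely use a vector index, but the ranking logic stays the same.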
This is where it matters that the embedding captures meaning rather than keywords. The query "information that goes unrecorded" shares no words with the node name "negative decision loss", yet the two point in the same semantic direction. A keyword search over node names finds nothing. Cosine similarity finds the right node.
The curated edges become useful here too. Once retrieval surfaces a node (e.g., "negative decision loss"), the system can traverse curated edges to pull in related nodes that the embedding alone might miss:
Query → embedding retrieval → "negative decision loss" (cosine = 0.511)
    │
    curated edge: "relates_to"
    ↓
"instrument compaction losses"
    │
    curated edge: "causes" (followed in reverse)
    ↓
"genre-ification"
Embedding retrieval finds the entry point. Edge traversal expands the context along declared relationships. The result: the agent sees not just the single most-similar node, but the local neighborhood of related concepts — connected by relationships that have names and types, not just proximity.
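The expansion step can be sketched as a bounded breadth-first walk over the curated edges listed earlier. This is one plausible reading of "traverse curated edges", not the system's documented behavior: the helper `expand`, the `max_hops` parameter, and the choice to follow edges in both directions are all assumptions.

```python
from collections import deque

# Curated edges copied from the table earlier in this section.
CURATED_EDGES = [
    ("antigenic sin in basin keys", "extends", "basin key"),
    ("fidelity signature", "detection_method_for", "dormant fidelity"),
    ("wake problem", "relates_to", "dormant fidelity"),
    ("operational fidelity defense", "defends_against", "hollowing of terms"),
    ("operational fidelity defense", "defends_against", "dormant fidelity"),
    ("genre-ification", "instance_of", "hollowing of terms"),
    ("genre-ification", "causes", "instrument compaction losses"),
    ("instrument compaction losses", "relates_to", "negative decision loss"),
    ("retrieval trigger architecture", "addresses", "dormant fidelity"),
]


def expand(entry, edges, max_hops=2):
    """Walk curated edges outward from the entry node, at most max_hops away."""
    adjacency = {}
    for source, predicate, target in edges:
        # Record each edge on both endpoints so it can be followed either way.
        adjacency.setdefault(source, []).append((target, predicate))
        adjacency.setdefault(target, []).append((source, predicate))

    hops = {entry: 0}
    queue = deque([entry])
    steps = []  # (from node, predicate, to node) in discovery order
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor, predicate in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                steps.append((node, predicate, neighbor))
                queue.append(neighbor)
    return steps


steps = expand("negative decision loss", CURATED_EDGES, max_hops=2)
for origin, predicate, neighbor in steps:
    print(f"{origin} --{predicate}--> {neighbor}")
```

The hop limit is the main design choice here: a larger value pulls in more of the neighborhood at the cost of loading less relevant nodes into the agent's context.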
The key conceptual shift: cosine similarity does not produce knowledge. It opens the gate. A query is a probe — a vector launched into the same space as the stored nodes. What happens next is a chain, not a lookup.
The query does not ask "what is similar to me?" It asks: "where should I enter this graph?" Once inside, the graph's own structure — curated edges, typed relationships, neighborhood topology — determines what gets loaded.
Cosine finds the door. The graph decides the room.
Architecture
The full system is not "embeddings + retrieval." It is three distinct layers, each answering a different question:
| Layer | Question | Mechanism |
|---|---|---|
| Vector layer | Where should I look? | Embedding similarity — cosine between query and stored nodes |
| Graph layer | Why does this matter? | Curated edges — typed relationships with named predicates |
| Artifact layer | What actually happened? | Source documents — full correspondence, papers, notes |
Each layer has a different failure mode:
The vector layer fails by vocabulary match without structural kinship — finding things that sound similar but aren't meaningfully related.
The graph layer fails by incompleteness: edges that should exist but that nobody drew. Every graph has blind spots.
The artifact layer fails by absence — decisions not recorded, context not written down, negative space that was never captured.
The three layers compensate for each other. Vectors provide coverage where the graph is incomplete. The graph provides structure where vectors are noisy. Artifacts provide ground truth where both layers are approximate.