Read the source code of your memory.
Architecture
Three storage layers, the search pipeline, and the knowledge integration engine.
Orion is a three-layer knowledge system served through a single FastAPI process that exposes both REST and MCP interfaces.
System overview
Clients (Claude Code, Cursor, CLI, Web)
│
├── MCP /mcp (16 tools)
└── REST /api/v1/* (40+ endpoints)
│
┌─────▼───────────────────────────────┐
│ orion-api (single FastAPI process)  │
│                                     │
│ Service Layer                       │
│ search · stardust · graph · sun     │
│ orientation · calibration · audit   │
│ contradiction · synthesis · import  │
│ planet_assignment · agent_identity  │
│                                     │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │  Redis  │ │ChromaDB │ │Postgres │ │
│ │ (cache) │ │(vectors)│ │(struct) │ │
│ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
Storage layers
Redis — hot cache
Working memory. Recently written/accessed stardust, active sessions, computed results.
- Per-region TTLs (empathetic: 1h → strategic: 7d)
- Session tracking with 5-minute idle timeout
- Dashboard and strength score caching (15 min / 1 hr)
- Synthesis result caching (30 min)
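A minimal sketch of how the per-region TTLs above might be applied when writing stardust to the hot cache. The key scheme and helper names are hypothetical; only the two TTL endpoints documented here (empathetic: 1h, strategic: 7d) are filled in.

```python
from datetime import timedelta

# Only the documented endpoints of the TTL range; the in-between
# regions are elided here.
REGION_TTLS = {
    "empathetic": timedelta(hours=1),
    "strategic": timedelta(days=7),
}

def cache_key(galaxy_id: str, region: str, stardust_id: str) -> str:
    # Hypothetical key layout; the real scheme lives in the source.
    return f"stardust:{galaxy_id}:{region}:{stardust_id}"

def ttl_seconds(region: str, default: timedelta = timedelta(hours=1)) -> int:
    return int(REGION_TTLS.get(region, default).total_seconds())

# With a real redis-py client this would be:
#   r.setex(cache_key(gid, region, sid), ttl_seconds(region), payload)
```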
ChromaDB — semantic vectors
Embedding-based similarity search. Every stardust record is embedded and stored in a collection partitioned by galaxy_id × region.
- Collections: orion_{galaxy_id}_{region} (7 per galaxy)
- Distance metric: cosine (HNSW index)
- Embedding providers: Ollama (nomic-embed-text) or Google (text-embedding-004)
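The naming scheme above can be sketched as a small helper; the chromadb usage in the comment is an assumption about how the cosine-distance HNSW index is configured, not a quote from the source.

```python
def collection_name(galaxy_id: str, region: str) -> str:
    # Documented scheme: orion_{galaxy_id}_{region}, one collection
    # per galaxy × region (7 regions per galaxy).
    return f"orion_{galaxy_id}_{region}"

# Assumed chromadb usage (client setup elided); cosine distance is
# selected per collection via HNSW metadata:
#
#   collection = client.get_or_create_collection(
#       name=collection_name(galaxy_id, region),
#       metadata={"hnsw:space": "cosine"},
#   )
```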
PostgreSQL — structural spine
All relational data: Galaxy hierarchy, knowledge graph, agent identities, audit logs, user accounts.
- 16 Alembic migrations
- Async via asyncpg (PostgreSQL) or aiosqlite (SQLite for local dev)
- JSONB columns with GIN indexes for tags and metadata
- Connection pooling: 10 base + 20 overflow
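A sketch of how the driver choice and pool sizing above might come together; the URL helper and the SQLite path are hypothetical, and the SQLAlchemy call in the comment is an assumed setup, not the project's actual code.

```python
def database_url(dsn: str, local_dev: bool = False) -> str:
    # asyncpg driver against PostgreSQL in deployment,
    # aiosqlite against a local SQLite file for development.
    if local_dev:
        return "sqlite+aiosqlite:///./orion.db"  # hypothetical path
    return f"postgresql+asyncpg://{dsn}"

# Assumed SQLAlchemy async setup with the documented pool sizing:
#
#   engine = create_async_engine(
#       database_url("user:pw@host/orion"),
#       pool_size=10,     # 10 base connections
#       max_overflow=20,  # +20 overflow
#   )
```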
Search pipeline
Search uses Reciprocal Rank Fusion (RRF) to combine three signal sources:
Query
├─→ Redis cache scan (keyword substring match)
├─→ ChromaDB semantic search (per planet × per region)
│ ├─→ Semantic ranking (cosine similarity)
│ ├─→ Recency ranking (similarity × 1/(1 + days_old))
│ └─→ Confidence ranking (similarity × stored_confidence)
└─→ RRF fusion: score(d) = Σ 1/(k + rank + 1), k=60
└─→ Deduplicate → Enrich from Postgres → Return
RRF operates on rank positions, not raw scores — so it fuses rankings from completely different scoring systems without normalization. See the RRF blog post for benchmarks and tuning details.
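Under the formula above (0-based rank, k=60), the fusion step can be sketched as:

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document ids via Reciprocal Rank Fusion.

    score(d) = Σ 1/(k + rank + 1) over every ranking containing d,
    with rank 0-based. Only rank positions matter, so rankings from
    heterogeneous scorers fuse without score normalization.
    """
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    # Deduplication falls out naturally: each id accumulates one score.
    return sorted(scores, key=scores.__getitem__, reverse=True)
```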
Knowledge integration engine
Every brain.think call triggers an eight-step pipeline, preceded by a routing step 0:
brain.think("FastAPI replaced Flask for the API layer")
│
├─→ 0. Route to Planet (4 strategies + inbox fallback)
├─→ 1. Write stardust record to Postgres
├─→ 2. Embed content → upsert to ChromaDB
├─→ 3. Extract entities (regex-based, zero LLM)
├─→ 4. Extract typed relationships (USES, REPLACES, DEPENDS_ON, ...)
├─→ 5. Process supersession chains (archive old, create SUPERSEDES edges)
├─→ 6. Update entity backlinks
├─→ 7. Update agent expertise profile
└─→ 8. Log integration event to Nebula
All steps are zero-LLM by default — entity and relationship extraction use regex patterns. The optional LLM pass adds ~130ms but pushes entity coverage from ~60% to ~90%.
Total pipeline latency: ~200ms on a laptop (with LLM), ~70ms without.
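Steps 3–4 use regex patterns rather than an LLM. The patterns below are illustrative stand-ins (the real pattern set lives in the source), applied to the example input above:

```python
import re

# Hypothetical pattern set for zero-LLM relationship extraction;
# relationship types are from the documented vocabulary.
RELATION_PATTERNS = [
    (re.compile(r"(\w[\w.-]*) replaced (\w[\w.-]*)", re.I), "REPLACES"),
    (re.compile(r"(\w[\w.-]*) uses (\w[\w.-]*)", re.I), "USES"),
    (re.compile(r"(\w[\w.-]*) depends on (\w[\w.-]*)", re.I), "DEPENDS_ON"),
]

def extract_relationships(text: str) -> list[tuple[str, str, str]]:
    """Return (subject, RELATION, object) triples found in the text."""
    triples = []
    for pattern, rel_type in RELATION_PATTERNS:
        for m in pattern.finditer(text):
            triples.append((m.group(1), rel_type, m.group(2)))
    return triples
```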
Data model
Galaxy
├── Sun (7 sections)
├── Agent Identities
├── Knowledge Graph (entities + relationships)
├── Users (auth, roles, planet assignments)
└── Planets
    └── Biomes (SEED → ACTIVE → MATURE → DORMANT → ARCHIVED)
        ├── Stardust (content, region, gravity, confidence)
        └── Entities (name, type, tier 1–3)
Key tables: galaxies, planets, biomes, stardust, entities, entity_relationships, entity_backlinks, agent_identities, routing_log, graph_path_cache. See the source code for full schemas.
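A hypothetical in-memory sketch of the containment hierarchy and biome lifecycle above — the field types are assumptions, and the authoritative schemas live in the Alembic migrations:

```python
from dataclasses import dataclass, field
from enum import Enum

class BiomeState(Enum):
    # Documented lifecycle: SEED → ACTIVE → MATURE → DORMANT → ARCHIVED
    SEED = "seed"
    ACTIVE = "active"
    MATURE = "mature"
    DORMANT = "dormant"
    ARCHIVED = "archived"

@dataclass
class Stardust:
    content: str
    region: str
    gravity: float      # numeric types here are assumptions
    confidence: float

@dataclass
class Biome:
    state: BiomeState = BiomeState.SEED
    stardust: list[Stardust] = field(default_factory=list)
```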