
Architecture

Three storage layers, the search pipeline, and the knowledge integration engine.

Orion is a three-layer knowledge system served through a single FastAPI process that exposes both REST and MCP interfaces.

System overview

Clients (Claude Code, Cursor, CLI, Web)
  │
  ├── MCP /mcp (16 tools)
  └── REST /api/v1/* (40+ endpoints)
        │
  ┌─────▼───────────────────────────────┐
  │  orion-api (single FastAPI process) │
  │                                     │
  │  Service Layer                      │
  │  search · stardust · graph · sun    │
  │  orientation · calibration · audit  │
  │  contradiction · synthesis · import │
  │  planet_assignment · agent_identity │
  │                                     │
  │  ┌─────────┐ ┌────────┐ ┌─────────┐ │
  │  │  Redis  │ │ChromaDB│ │Postgres │ │
  │  │ (cache) │ │(vector)│ │(struct) │ │
  │  └─────────┘ └────────┘ └─────────┘ │
  └─────────────────────────────────────┘

Storage layers

Redis — hot cache

Working memory. Recently written/accessed stardust, active sessions, computed results.

  • Per-region TTLs (empathetic: 1h → strategic: 7d)
  • Session tracking with 5-minute idle timeout
  • Dashboard and strength score caching (15 min / 1 hr)
  • Synthesis result caching (30 min)
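The per-region TTL scheme can be sketched as a simple lookup. Only the empathetic (1 h) and strategic (7 d) values come from this page; the other region names, the default fallback, and the function name are illustrative assumptions:

```python
from datetime import timedelta

# Per-region TTLs for stardust cached in Redis. The "empathetic" and
# "strategic" values are documented; DEFAULT_TTL is an assumed fallback.
REGION_TTLS = {
    "empathetic": timedelta(hours=1),
    "strategic": timedelta(days=7),
}
DEFAULT_TTL = timedelta(hours=6)  # placeholder, not from the docs

def cache_ttl_seconds(region: str) -> int:
    """Return the EXPIRE value (in seconds) to set on a cached record."""
    return int(REGION_TTLS.get(region, DEFAULT_TTL).total_seconds())
```

The returned value would be passed as the `ex` argument to a `redis-py` `set()` call when writing the cache entry.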

ChromaDB — semantic vectors

Embedding-based similarity search. Every stardust record is embedded and stored in a collection partitioned by galaxy_id × region.

  • Collections: orion_{galaxy_id}_{region} (7 per galaxy)
  • Distance metric: Cosine (HNSW index)
  • Embedding providers: Ollama (nomic-embed-text) or Google (text-embedding-004)
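The collection naming scheme follows directly from the pattern above. A minimal sketch (the function name is an assumption; only the `orion_{galaxy_id}_{region}` pattern is documented):

```python
def collection_name(galaxy_id: str, region: str) -> str:
    """Build the ChromaDB collection name for one galaxy × region
    partition, matching the documented orion_{galaxy_id}_{region} pattern."""
    return f"orion_{galaxy_id}_{region}"

# With the chromadb client, cosine distance over the HNSW index is
# typically requested at creation time, e.g.:
#   client.get_or_create_collection(
#       collection_name("acme", "strategic"),
#       metadata={"hnsw:space": "cosine"},
#   )
```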

PostgreSQL — structural spine

All relational data: Galaxy hierarchy, knowledge graph, agent identities, audit logs, user accounts.

  • 16 Alembic migrations
  • Async via asyncpg (PostgreSQL) or aiosqlite (SQLite for local dev)
  • JSONB columns with GIN indexes for tags and metadata
  • Connection pooling: 10 base + 20 overflow
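The asyncpg/aiosqlite split amounts to rewriting the database URL onto its async driver before creating the engine. A hedged sketch (the helper name is an assumption; the driver names and pool sizes are from the list above):

```python
def async_dsn(url: str) -> str:
    """Map a plain database URL onto its async driver:
    asyncpg for PostgreSQL, aiosqlite for local-dev SQLite."""
    if url.startswith("postgresql://"):
        return url.replace("postgresql://", "postgresql+asyncpg://", 1)
    if url.startswith("sqlite://"):
        return url.replace("sqlite://", "sqlite+aiosqlite://", 1)
    return url

# The resulting DSN would feed SQLAlchemy's create_async_engine with the
# documented pool settings: pool_size=10, max_overflow=20.
```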

Search pipeline

Search uses Reciprocal Rank Fusion (RRF) to combine three signal sources:

Query
  ├─→ Redis cache scan (keyword substring match)
  ├─→ ChromaDB semantic search (per planet × per region)
  │     ├─→ Semantic ranking (cosine similarity)
  │     ├─→ Recency ranking (similarity × 1/(1 + days_old))
  │     └─→ Confidence ranking (similarity × stored_confidence)
  └─→ RRF fusion: score(d) = Σ 1/(k + rank + 1), k=60
        └─→ Deduplicate → Enrich from Postgres → Return

RRF operates on rank positions, not raw scores — so it fuses rankings from completely different scoring systems without normalization. See the RRF blog post for benchmarks and tuning details.
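The fusion step can be sketched in a few lines. This implements exactly the documented formula, score(d) = Σ 1/(k + rank + 1) with 0-based ranks and k = 60; the function name and input shape are assumptions:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with Reciprocal Rank Fusion.

    Each list is ordered best-first; a document's fused score is the
    sum of 1 / (k + rank + 1) over every list it appears in.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Higher fused score = better; ties keep insertion order.
    return sorted(scores, key=scores.get, reverse=True)
```

Because only rank positions enter the formula, the semantic, recency, and confidence rankings can use incomparable raw scores and still fuse cleanly.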

Knowledge integration engine

Every brain.think call triggers a nine-step pipeline (steps 0–8):

brain.think("FastAPI replaced Flask for the API layer")
  │
  ├─→ 0. Route to Planet (4 strategies + inbox fallback)
  ├─→ 1. Write stardust record to Postgres
  ├─→ 2. Embed content → upsert to ChromaDB
  ├─→ 3. Extract entities (regex-based, zero LLM)
  ├─→ 4. Extract typed relationships (USES, REPLACES, DEPENDS_ON, ...)
  ├─→ 5. Process supersession chains (archive old, create SUPERSEDES edges)
  ├─→ 6. Update entity backlinks
  ├─→ 7. Update agent expertise profile
  └─→ 8. Log integration event to Nebula

All steps are zero-LLM by default — entity and relationship extraction use regex patterns. The optional LLM pass adds ~130ms but pushes entity coverage from ~60% to ~90%.

Total pipeline latency: ~200ms on a laptop (with LLM), ~70ms without.
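The zero-LLM extraction in steps 3–4 can be sketched as verb patterns mapped to the documented edge types. These three patterns are illustrative stand-ins, not Orion's actual rule set:

```python
import re

# Illustrative regex rules for typed-relationship extraction; the edge
# types (USES, REPLACES, DEPENDS_ON) are documented, the patterns are not.
REL_PATTERNS = [
    (re.compile(r"(\w[\w.-]*) replaced (\w[\w.-]*)", re.I), "REPLACES"),
    (re.compile(r"(\w[\w.-]*) uses (\w[\w.-]*)", re.I), "USES"),
    (re.compile(r"(\w[\w.-]*) depends on (\w[\w.-]*)", re.I), "DEPENDS_ON"),
]

def extract_relationships(text: str) -> list[tuple[str, str, str]]:
    """Return (source, RELATION, target) triples found in the text."""
    triples = []
    for pattern, rel_type in REL_PATTERNS:
        for src, dst in pattern.findall(text):
            triples.append((src, rel_type, dst))
    return triples
```

Running this on the example input `"FastAPI replaced Flask for the API layer"` yields a single `("FastAPI", "REPLACES", "Flask")` triple, which step 5 would then turn into a SUPERSEDES chain.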

Data model

Galaxy
├── Sun (7 sections)
├── Agent Identities
├── Knowledge Graph (entities + relationships)
├── Users (auth, roles, planet assignments)
└── Planets
      └── Biomes (SEED → ACTIVE → MATURE → DORMANT → ARCHIVED)
            ├── Stardust (content, region, gravity, confidence)
            └── Entities (name, type, tier 1–3)

Key tables: galaxies, planets, biomes, stardust, entities, entity_relationships, entity_backlinks, agent_identities, routing_log, graph_path_cache. See the source code for full schemas.
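The biome lifecycle above can be sketched as an ordered state machine. This assumes strictly forward, one-step transitions, which the docs do not state explicitly (e.g. whether a DORMANT biome can be revived is not specified here):

```python
# Documented lifecycle order; forward-only stepping is an assumption.
BIOME_LIFECYCLE = ["SEED", "ACTIVE", "MATURE", "DORMANT", "ARCHIVED"]

def can_advance(current: str, nxt: str) -> bool:
    """True when nxt is the immediate next lifecycle stage."""
    i = BIOME_LIFECYCLE.index(current)
    j = BIOME_LIFECYCLE.index(nxt)
    return j == i + 1
```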