Read the source code of your memory.
Architecture
Three storage layers, the search pipeline, and the knowledge integration engine.
Orion is a three-layer knowledge system served through a single FastAPI process that exposes both REST and MCP interfaces.
System overview
Clients (Claude Code, Cursor, CLI, Web)
│
├── MCP /mcp (16 tools)
└── REST /api/v1/* (40+ endpoints)
│
┌─────▼───────────────────────────────┐
│ orion-api (single FastAPI process)  │
│                                     │
│ Service Layer                       │
│ search · stardust · graph · sun     │
│ orientation · calibration · audit   │
│ contradiction · synthesis · import  │
│ planet_assignment · agent_identity  │
│                                     │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │  Redis  │ │ChromaDB │ │Postgres │ │
│ │ (cache) │ │(vectors)│ │(struct) │ │
│ └─────────┘ └─────────┘ └─────────┘ │
└─────────────────────────────────────┘
Storage layers
Redis — hot cache
Working memory. Recently written/accessed stardust, active sessions, computed results.
- Per-region TTLs (empathetic: 1h → strategic: 7d)
- Session tracking with 5-minute idle timeout
- Dashboard and strength score caching (15 min / 1 hr)
- Synthesis result caching (30 min)
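A minimal sketch of how the per-region TTLs above might be applied when writing stardust to the hot cache. The key scheme and helper names are hypothetical; only the two TTL endpoints documented here (empathetic: 1h, strategic: 7d) are filled in.

```python
from datetime import timedelta

# Only the documented endpoints of the TTL range; the in-between
# regions are elided here.
REGION_TTLS = {
    "empathetic": timedelta(hours=1),
    "strategic": timedelta(days=7),
}

def cache_key(galaxy_id: str, region: str, stardust_id: str) -> str:
    # Hypothetical key layout; the real scheme lives in the source.
    return f"stardust:{galaxy_id}:{region}:{stardust_id}"

def ttl_seconds(region: str, default: timedelta = timedelta(hours=1)) -> int:
    return int(REGION_TTLS.get(region, default).total_seconds())

# With a real redis-py client this would be:
#   r.setex(cache_key(gid, region, sid), ttl_seconds(region), payload)
```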
ChromaDB — semantic vectors
Embedding-based similarity search. Every stardust record is embedded and stored in a collection partitioned by galaxy_id × region.
- Collections: orion_{galaxy_id}_{region} (7 per galaxy)
- Distance metric: cosine (HNSW index)
- Embedding providers: Ollama (nomic-embed-text) or Google (text-embedding-004)
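The naming scheme above can be sketched as a small helper; the chromadb usage in the comment is an assumption about how the cosine-distance HNSW index is configured, not a quote from the source.

```python
def collection_name(galaxy_id: str, region: str) -> str:
    # Documented scheme: orion_{galaxy_id}_{region}, one collection
    # per galaxy × region (7 regions per galaxy).
    return f"orion_{galaxy_id}_{region}"

# Assumed chromadb usage (client setup elided); cosine distance is
# selected per collection via HNSW metadata:
#
#   collection = client.get_or_create_collection(
#       name=collection_name(galaxy_id, region),
#       metadata={"hnsw:space": "cosine"},
#   )
```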
PostgreSQL — structural spine
All relational data: Galaxy hierarchy, knowledge graph, agent identities, audit logs, user accounts.
- 16 Alembic migrations
- Async via asyncpg (PostgreSQL) or aiosqlite (SQLite for local dev)
- JSONB columns with GIN indexes for tags and metadata
- Connection pooling: 10 base + 20 overflow
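A sketch of how the driver choice and pool sizing above might come together; the URL helper and the SQLite path are hypothetical, and the SQLAlchemy call in the comment is an assumed setup, not the project's actual code.

```python
def database_url(dsn: str, local_dev: bool = False) -> str:
    # asyncpg driver against PostgreSQL in deployment,
    # aiosqlite against a local SQLite file for development.
    if local_dev:
        return "sqlite+aiosqlite:///./orion.db"  # hypothetical path
    return f"postgresql+asyncpg://{dsn}"

# Assumed SQLAlchemy async setup with the documented pool sizing:
#
#   engine = create_async_engine(
#       database_url("user:pw@host/orion"),
#       pool_size=10,     # 10 base connections
#       max_overflow=20,  # +20 overflow
#   )
```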
Search pipeline
Search uses Reciprocal Rank Fusion (RRF) to combine three signal sources:
Query
├─→ Redis cache scan (keyword substring match)
├─→ ChromaDB semantic search (per planet × per region)
│ ├─→ Semantic ranking (cosine similarity)
│ ├─→ Recency ranking (similarity × 1/(1 + days_old))
│ └─→ Confidence ranking (similarity × stored_confidence)
└─→ RRF fusion: score(d) = Σ 1/(k + rank + 1), k=60
└─→ Deduplicate → Enrich from Postgres → Return
RRF operates on rank positions, not raw scores — so it fuses rankings from completely different scoring systems without normalization. See the RRF blog post for benchmarks and tuning details.
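Under the formula above (0-based rank, k=60), the fusion step can be sketched as:

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document ids via Reciprocal Rank Fusion.

    score(d) = Σ 1/(k + rank + 1) over every ranking containing d,
    with rank 0-based. Only rank positions matter, so rankings from
    heterogeneous scorers fuse without score normalization.
    """
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    # Deduplication falls out naturally: each id accumulates one score.
    return sorted(scores, key=scores.__getitem__, reverse=True)
```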
Knowledge integration engine
Every brain.think call triggers an eight-step pipeline, preceded by a routing step 0:
brain.think("FastAPI replaced Flask for the API layer")
│
├─→ 0. Route to Planet (4 strategies + inbox fallback)
├─→ 1. Write stardust record to Postgres
├─→ 2. Embed content → upsert to ChromaDB
├─→ 3. Extract entities (regex-based, zero LLM)
├─→ 4. Extract typed relationships (USES, REPLACES, DEPENDS_ON, ...)
├─→ 5. Process supersession chains (archive old, create SUPERSEDES edges)
├─→ 6. Update entity backlinks
├─→ 7. Update agent expertise profile
└─→ 8. Log integration event to Nebula
All steps are zero-LLM by default — entity and relationship extraction use regex patterns. The optional LLM pass adds ~130ms but pushes entity coverage from ~60% to ~90%.
Total pipeline latency: ~200ms on a laptop (with LLM), ~70ms without.
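Steps 3–4 use regex patterns rather than an LLM. The patterns below are illustrative stand-ins (the real pattern set lives in the source), applied to the example input above:

```python
import re

# Hypothetical pattern set for zero-LLM relationship extraction;
# relationship types are from the documented vocabulary.
RELATION_PATTERNS = [
    (re.compile(r"(\w[\w.-]*) replaced (\w[\w.-]*)", re.I), "REPLACES"),
    (re.compile(r"(\w[\w.-]*) uses (\w[\w.-]*)", re.I), "USES"),
    (re.compile(r"(\w[\w.-]*) depends on (\w[\w.-]*)", re.I), "DEPENDS_ON"),
]

def extract_relationships(text: str) -> list[tuple[str, str, str]]:
    """Return (subject, RELATION, object) triples found in the text."""
    triples = []
    for pattern, rel_type in RELATION_PATTERNS:
        for m in pattern.finditer(text):
            triples.append((m.group(1), rel_type, m.group(2)))
    return triples
```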
Data model
Galaxy
├── Sun (7 sections)
├── Agent Identities
├── Knowledge Graph (entities + relationships)
├── Users (auth, roles, planet assignments)
└── Planets
    └── Biomes (SEED → ACTIVE → MATURE → DORMANT → ARCHIVED)
        ├── Stardust (content, region, gravity, confidence)
        └── Entities (name, type, tier 1–3)
Key tables: galaxies, planets, biomes, stardust, entities, entity_relationships, entity_backlinks, agent_identities, routing_log, graph_path_cache. See the source code for full schemas.
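A hypothetical in-memory sketch of the containment hierarchy and biome lifecycle above — the field types are assumptions, and the authoritative schemas live in the Alembic migrations:

```python
from dataclasses import dataclass, field
from enum import Enum

class BiomeState(Enum):
    # Documented lifecycle: SEED → ACTIVE → MATURE → DORMANT → ARCHIVED
    SEED = "seed"
    ACTIVE = "active"
    MATURE = "mature"
    DORMANT = "dormant"
    ARCHIVED = "archived"

@dataclass
class Stardust:
    content: str
    region: str
    gravity: float      # numeric types here are assumptions
    confidence: float

@dataclass
class Biome:
    state: BiomeState = BiomeState.SEED
    stardust: list[Stardust] = field(default_factory=list)
```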