OpenZero persists knowledge across sessions in a vector database and compresses old context without losing critical decisions. Self-hostable. Open source. MIT licensed.
Most AI assistants silently truncate your history. OpenZero takes a fundamentally different approach with two systems that work together.
OpenZero extracts durable facts after every assistant response. Six typed schemas — Workflow, Bug Fix, Architecture, Preference, Config, Fact — stored as 4096-dim vectors in Qdrant.
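A stored memory might look like the sketch below. The field names and point shape are illustrative assumptions, not OpenZero's actual schema; only the six type names and the 4096-dim vector come from the description above.

```typescript
// Illustrative shapes only; OpenZero's real schema may differ.
type MemoryType =
  | "workflow" | "bug_fix" | "architecture"
  | "preference" | "config" | "fact";

interface MemoryPoint {
  id: string;
  vector: number[];          // 4096-dim embedding
  payload: {
    type: MemoryType;        // one of the six typed schemas
    text: string;            // the extracted durable fact
    userIdHash: string;      // partition key, never the raw user ID
  };
}

// Example point, as it might be upserted into a Qdrant collection.
const point: MemoryPoint = {
  id: "mem-001",
  vector: new Array(4096).fill(0), // placeholder embedding
  payload: {
    type: "bug_fix",
    text: "Race condition in session flush fixed by locking the queue.",
    userIdHash: "ab12…", // hashed user ID (see privacy notes below)
  },
};
```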
A three-tier engine keeps recent messages at full fidelity while progressively summarizing older history, achieving 60-80% compression without losing critical context.
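The tiering can be sketched as a function of message age. The tier names and thresholds here are illustrative assumptions; the source only states that there are three tiers, with recent messages kept verbatim.

```typescript
// Minimal sketch of three-tier context layout by message age.
// Thresholds (10 / 40) are made up for illustration.
type Tier = "full" | "summary" | "archive";

function tierFor(index: number, total: number): Tier {
  const age = total - 1 - index;   // 0 = most recent message
  if (age < 10) return "full";     // recent: full fidelity
  if (age < 40) return "summary";  // older: progressively summarized
  return "archive";                // oldest: heavily compressed
}

const messages = Array.from({ length: 50 }, (_, i) => i);
const tiers = messages.map((i) => tierFor(i, messages.length));
```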
Memory retrieval runs before each user message. Extraction and compression run after each assistant response. Neither blocks the live chat path.
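The ordering above can be sketched as a message handler: retrieval is awaited before the reply, while extraction and compression are fired afterwards without being awaited, so they never sit on the live chat path. All function names are hypothetical stand-ins.

```typescript
// Sketch of the hook ordering; every function here is illustrative.
const log: string[] = [];

async function retrieveMemories(): Promise<void> { log.push("retrieve"); }
async function generateReply(): Promise<string> { log.push("reply"); return "ok"; }
async function extractAndCompress(): Promise<void> { log.push("extract"); }

async function handleMessage(): Promise<string> {
  await retrieveMemories();            // before the user message is answered
  const reply = await generateReply(); // live chat path
  void extractAndCompress();           // after the reply; fire-and-forget
  return reply;
}
```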
Before every message, cosine-similarity search finds relevant memories from your entire project history and injects them as structured context.
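Cosine similarity is the standard formula below; this is textbook code, not OpenZero's implementation.

```typescript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (||a|| * ||b||)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Identical vectors score 1, orthogonal vectors score 0; retrieval ranks memories by this score.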
All memories are partitioned by hashed user ID. Every Qdrant query is filtered by this ID. No cross-user access. Metadata is scrubbed of PII during extraction.
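A sketch of the partitioning, assuming SHA-256 for the hash and Qdrant's standard `must`/`match` filter JSON; the payload key name and hash choice are assumptions, not confirmed details.

```typescript
import { createHash } from "node:crypto";

// Hash the user ID once; the raw ID never reaches the vector store.
function hashUserId(userId: string): string {
  return createHash("sha256").update(userId).digest("hex");
}

// Attach the hash as a mandatory filter on every Qdrant query,
// so results can only come from this user's partition.
function userFilter(userId: string) {
  return {
    must: [{ key: "userIdHash", match: { value: hashUserId(userId) } }],
  };
}
```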
Three-level fallback on extraction and retrieval. Exponential-backoff retries. Neither subsystem can crash or block the primary chat loop.
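A minimal sketch of the pattern: retries with exponential backoff inside each level, and an ordered fallback chain that degrades quietly instead of throwing into the chat loop. Delays, attempt counts, and function names are illustrative, not OpenZero's actual values.

```typescript
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Retry fn with exponential backoff: base, 2x base, 4x base, ...
async function withRetries<T>(
  fn: () => Promise<T>, attempts = 3, baseMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) await sleep(baseMs * 2 ** i);
    }
  }
  throw lastErr;
}

// Walk the fallback levels in order; if all fail, return a default
// so the caller's chat path is never blocked by a dead subsystem.
async function withFallbacks<T>(
  levels: Array<() => Promise<T>>, fallbackValue: T,
): Promise<T> {
  for (const level of levels) {
    try {
      return await withRetries(level, 2, 10);
    } catch {
      // swallow and try the next level
    }
  }
  return fallbackValue;
}
```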
Every extracted memory is classified into a strict type so retrieval understands the nature of the knowledge, not just its text similarity.
Workflow: Repeatable commands and operational processes. Tracks trigger conditions and dependencies.
Bug Fix: Solved problems and post-mortems. Symptom, root cause, solution, and prevention guidance.
Architecture: Design decisions with rationale, alternatives considered, and known tradeoffs.
Preference: Coding styles, naming conventions, and tool choices. Concrete examples included.
Config: Environment settings and infrastructure variables. Name, value, location, and purpose.
Fact: General project knowledge that doesn't fit other types. Free-form with semantic keywords.
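The six schemas above can be sketched as a discriminated union. The field names follow the descriptions but are assumptions for illustration, not the project's actual type definitions.

```typescript
// Illustrative union of the six memory schemas; field names assumed.
type Memory =
  | { type: "workflow"; steps: string[]; trigger: string; dependencies: string[] }
  | { type: "bug_fix"; symptom: string; rootCause: string; solution: string; prevention: string }
  | { type: "architecture"; decision: string; rationale: string; alternatives: string[]; tradeoffs: string[] }
  | { type: "preference"; rule: string; example: string }
  | { type: "config"; name: string; value: string; location: string; purpose: string }
  | { type: "fact"; text: string; keywords: string[] };

// Narrowing on `type` gives retrieval the fields specific to that
// kind of knowledge, not just its text.
const m: Memory = {
  type: "bug_fix",
  symptom: "Intermittent 500s under load",
  rootCause: "Connection pool exhaustion",
  solution: "Raise pool size and add a queue timeout",
  prevention: "Alert on pool saturation above 80%",
};
```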
Install OpenZero in under a minute. Requires Bun, Docker for Qdrant, and an LLM provider key.
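An illustrative setup under those requirements. The Qdrant Docker invocation is the image's documented default; the remaining commands and the environment variable name are assumptions, so check the project docs for the exact steps.

```shell
# 1. Start Qdrant (official Docker image, default port 6333)
docker run -d -p 6333:6333 qdrant/qdrant

# 2. Install dependencies with Bun
bun install

# 3. Provide your LLM provider key (variable name is illustrative)
export LLM_API_KEY=...

# 4. Run (assumes a standard package.json start script)
bun start
```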