Give your AI a memory that never leaves your machine.
Offline-first semantic memory engine: single binary, zero config, 30ms recall.
```sh
# Install (macOS, Linux, Windows)
curl -sSL codecora.dev/install | sh

# Store a memory with metadata
uteke remember "Deploy v2.1 to staging" --tags deploy,staging \
  --entity staging-server --category infrastructure

# Hybrid search (vector + FTS5, ranked by RRF)
uteke recall "when do we deploy?"

# Stats
uteke stats
```

That's it. No API keys. No Docker. No Python. First run downloads the embedding model (~188MB) and you're good to go.
Install options · Pre-built binaries · Docker · Full docs
Listens on localhost only by default. See Docker docs for auth setup.
```sh
# One-liner (model pre-baked in image)
docker run -d --name uteke -p 127.0.0.1:8767:8767 -v uteke-data:/data \
  ghcr.io/codecoradev/uteke:latest

# Or docker compose
docker compose up -d
```

Docker docs · Compose file
AI agents forget everything between sessions. Uteke gives them persistent, searchable memory, entirely offline, in one binary.
| | Uteke | Mem0 | Letta | Zep |
|---|---|---|---|---|
| Setup | Single binary | pip + Docker + Qdrant | pip + Docker + Postgres | pip + Docker + Neo4j |
| API keys needed | None | OpenAI/LLM key | LLM key | LLM key |
| Offline | Fully | No (cloud embedding) | No (needs LLM server) | No (needs LLM + vector DB) |
| Semantic search | Local ONNX + FTS5 hybrid | Cloud embedding | GraphRAG | |
| Full-text search | FTS5 built-in | | | |
| Recall speed | ~30ms (library) | Network round-trip | Network round-trip | Network round-trip |
| Privacy | Data never leaves machine | | | |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Apache 2.0 |
- Hybrid Search: Vector similarity + FTS5 full-text search, merged by Reciprocal Rank Fusion (RRF)
- Rooms: Group memories by context (meetings, projects) with author attribution
- Time-travel queries: Recall memories as they existed at any point in time
- Pluggable embeddings: Swap ONNX/OpenAI/Ollama backends via config
- Metadata Enrichment: Tag, entity, category, and key:value metadata on every memory
- Relationship graph: Link memories with typed edges (supersedes, contradicts, references)
- Smart decay: Composite importance scoring, pin critical memories
- Recall cache: LRU cache eliminates redundant embedding for repeated queries
- Benchmarks: Built-in `uteke bench` for perf testing + LongMemEval retrieval harness
- Multi-Agent Namespaces: Fully isolated memory per agent, zero overhead
- Server Mode: Persistent daemon with ~42ms warm recall (75x faster than CLI)
- Tiered Memory: Hot/Warm/Cold tracking with auto-cleanup of stale memories
- Fully Offline: Local ONNX embeddings (768d), no telemetry, no cloud, no API calls
- Embed Fallback: Gracefully falls back to cloud API if local embedder fails; MockEmbedder for testing
- Batch Import: Import entire directories (`--batch-dir`) with auto-strategy routing (document vs. memory extraction)
- Single Binary: Zero dependencies. No Docker, no database server, no Python, no API keys
- Import/Export: JSONL-based backup and restore
- Memory Types: Typed categories (fact, procedure, decision, etc.) with auto-inference
- Backlinks: Bidirectional memory edges; references are automatically reciprocal
- Timeline Events: Chronological audit log per memory (created, updated, superseded)
- Salience + Recency: Dual-axis recall boost by memory type and age
- Dream Cycle: One-command maintenance pipeline (lint → backlinks → dedup → orphans)
- Orphan Detection: Find disconnected, low-importance memories for cleanup
- Citations: Source attribution on every memory (URL, file, user, import)
- MCP Server: JSON-RPC over stdio + Streamable HTTP transport
- Document Engine: Wiki/knowledge base with `uteke doc create/get/list` and auto-chunking
- Cosine Auto-Linking: Automatically creates `similar_to` edges between related memories
- Graph API: `GET /graph` endpoint returns nodes + edges JSON for visualization
- View-Only API Keys: Read-only tokens for safe GET-only access to the server
- Markdown Chunker: Splits documents by headings, respects code blocks and token limits
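
To illustrate the idea behind Cosine Auto-Linking in the list above, here is a minimal sketch of deciding whether two memories are similar enough to link. It is not Uteke's implementation: the function names and the threshold parameter are hypothetical, and the real linking rule may differ.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None when the vectors differ in length, are empty,
/// or either one has zero magnitude (similarity is undefined).
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical linking rule: propose a `similar_to` edge when the
/// similarity reaches `threshold` (an assumed, caller-chosen value).
fn should_link(a: &[f32], b: &[f32], threshold: f32) -> bool {
    cosine_similarity(a, b).map_or(false, |s| s >= threshold)
}
```

With a threshold of, say, 0.9, identical vectors would be linked and orthogonal ones would not.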
MCP Server: configure with Claude Code, Cursor, Hermes
See MCP docs for Claude Desktop, Hermes, and HTTP transport.
Full documentation · CLI reference · Configuration
Hybrid search pipeline:
- HNSW (usearch): vector similarity, finds by meaning
- FTS5 (SQLite): full-text keyword search, finds by exact terms
- Reciprocal Rank Fusion (k=60): merges both ranked lists, combining the strengths of each
- Local ONNX (EmbeddingGemma Q4, 768d): embeddings computed on-device, no API calls
Everything runs in-process. No network. No cloud. No server required (unless you want server mode).
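
The fusion step can be sketched as follows. This is a minimal illustration of standard Reciprocal Rank Fusion, where each result scores the sum of 1 / (k + rank) over the lists it appears in. It is not Uteke's actual code: the function name `rrf_merge`, the string IDs, and the input shape are hypothetical, and only k = 60 comes from the pipeline description above.

```rust
use std::collections::HashMap;

/// Merge several best-first ranked lists of memory IDs with
/// Reciprocal Rank Fusion. `k` dampens the weight of top ranks.
/// Returns (id, fused score) pairs, highest score first.
fn rrf_merge(ranked_lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in ranked_lists {
        for (idx, id) in list.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut merged: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort by descending score; break ties by ID so output is deterministic.
    merged.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    merged
}
```

For example, passing a vector ranking `["m2", "m1", "m3"]` and a keyword ranking `["m1", "m4"]` with `k = 60.0` puts `m1` first, because it is the only ID that appears in both lists.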
```sh
cargo build --workspace        # Build
cargo test --workspace         # Test (327 unit tests)
cargo clippy -- -D warnings    # Lint
cargo fmt                      # Format
```

See CONTRIBUTING.md for the full contribution guide.
Apache License 2.0: use it, fork it, ship it.


