KnowledgeMCP turns any documentation source (websites, PDFs, Confluence, Notion, S3, GitHub) into a standards-compliant Model Context Protocol (MCP) endpoint. Claude, GitHub Copilot, Cursor, and any other MCP-compatible agent can search and read those docs instantly, with no LLM calls at query time (we use a tiny local embedding model + hybrid BM25/kNN search in OpenSearch).
- MCP-native: three tools (`docs_search`, `code_sample_search`, `docs_fetch`) any agent can plug into
- Zero-cost query path: local embeddings + OpenSearch hybrid search. No OpenAI/Bedrock fees per query.
- `docker compose up` works: runs fully local, no AWS account, no credit card
- Production-ready AWS path when you want it: Lambda + DynamoDB + SQS + S3 + managed OpenSearch via the bundled SAM template
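
To make the hybrid BM25/kNN idea concrete, here is a minimal sketch of a request body using OpenSearch's `hybrid` query type. The field names `text` and `embedding`, the function name, and the toy vector are assumptions for illustration, not KnowledgeMCP's actual index schema. A real deployment would also need a search pipeline with a normalization processor so the two score sets can be combined.

```python
import json


def build_hybrid_query(query_text, query_vector, k=10):
    """Build an OpenSearch hybrid query body (BM25 match + kNN).

    Assumed field names: "text" holds the chunk body, "embedding" its vector.
    """
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg: scored with BM25.
                    {"match": {"text": {"query": query_text}}},
                    # Semantic leg: approximate nearest neighbours on the vector field.
                    {"knn": {"embedding": {"vector": list(query_vector), "k": k}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    # Toy 3-dimensional vector; a real embedding model produces many more dimensions.
    body = build_hybrid_query("rotate api keys", [0.1, 0.2, 0.3], k=5)
    print(json.dumps(body, indent=2))
```

Because both legs run inside OpenSearch and the query vector comes from a local model, nothing in this path calls a hosted LLM.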
git clone https://github.com/hashwnath/KMCP.git
cd KMCP
make up # docker compose up -d --build

Then:
- Dashboard: http://localhost:3000 (sign up → add a source → search)
- Admin REST API: http://localhost:8081
- MCP endpoint: http://localhost:8000/mcp/{your-tenant-slug}
First-time start downloads the fastembed model (~30 MB) and OpenSearch (~700 MB image).
Point any MCP client at your tenant URL:
{
"mcpServers": {
"MyDocs": {
"url": "http://localhost:8000/mcp/your-tenant-slug",
"type": "http"
}
}
}

The agent gets three tools:
| Tool | Purpose | Returns |
|---|---|---|
| `docs_search` | semantic + keyword search | up to 10 chunks with title, URL, ~500-token excerpt |
| `code_sample_search` | code-specific search with optional language filter | up to 20 snippets with language + context |
| `docs_fetch` | full page content | clean markdown |
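
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. As a rough sketch, the payload for a `docs_search` call might be built as below. The argument name `query` is an assumption (the real input schema is whatever the server reports via `tools/list`), `build_tool_call` is a hypothetical helper, and the MCP `initialize` handshake that a real client performs first is omitted.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to calls.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Return a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "query" is an assumed argument name, used here only for illustration.
    request = build_tool_call("docs_search", {"query": "how do I rotate API keys?"})
    print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing for you; the sketch only shows what travels over `POST /mcp/{tenant_slug}`.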
┌────────────────────────────────────────────────────────────┐
│  AI Agents (Claude, Cursor, Copilot, Continue, ...)        │
└──────────────────────────┬─────────────────────────────────┘
                           │ POST /mcp/{tenant_slug}
┌──────────────────────────▼─────────────────────────────────┐
│  MCP Server (FastMCP): docs_search / code_search / fetch   │
└──────────────────────────┬─────────────────────────────────┘
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
   ┌───────────────┐ ┌───────────┐ ┌───────────────┐
   │  OpenSearch   │ │  SQLite   │ │  Filesystem   │
   │ (BM25 + kNN)  │ │  tenants  │ │     blobs     │
   │  ~768 token   │ │  sources  │ │    uploads    │
   │    chunks     │ │   jobs    │ │               │
   └───────────────┘ └───────────┘ └───────────────┘
                           ▲
┌──────────────────────────┴─────────────────────────────────┐
│  Admin API (Starlette)  +  Background Worker               │
│  signup/login (JWT)        crawl → markdown → chunk →      │
│  sources CRUD              embed → OpenSearch              │
│  analytics                                                 │
└────────────────────────────────────────────────────────────┘
(In AWS mode, swap SQLite → DynamoDB, Filesystem → S3, the worker queue → SQS,
and run each service as its own Lambda. The application code is unchanged
because every AWS call routes through `src/common/backends/`.)
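
The backend swap described above can be pictured as a small interface with one implementation per environment, selected by the `BACKEND` variable. The sketch below is illustrative only: `BlobStore`, `LocalBlobStore`, and `get_blob_store` are hypothetical names, not the contents of `src/common/backends/`, and the AWS branch is deliberately left unimplemented.

```python
import os
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Interface the application codes against (hypothetical)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class LocalBlobStore:
    """Filesystem-backed store, standing in for local mode."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def get_blob_store(root: Path) -> BlobStore:
    """Pick an implementation from the BACKEND environment variable."""
    backend = os.environ.get("BACKEND", "local")
    if backend == "local":
        return LocalBlobStore(root)
    # An S3-backed class exposing the same put/get methods would go here.
    raise NotImplementedError(f"backend {backend!r} is not part of this sketch")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = get_blob_store(Path(tmp))
        store.put("tenant-a/page.md", b"# Hello")
        print(store.get("tenant-a/page.md"))
```

Callers only ever see `put` and `get`, which is why swapping the storage layer would not require changes in the code that uses it.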
| Type | What it ingests |
|---|---|
| `website_url` | Full sitemap crawl → markdown |
| `paste_text` | Inline text |
| `file_upload` | PDF, DOCX, PPTX, MD, HTML, TXT |
| `cloud_storage` | S3, Azure Blob, GCS |
| `wiki_kb` | Confluence, Notion, SharePoint, GitBook |
| `git_repo` | Public or private GitHub/GitLab repos (token optional) |
The defaults work for local docker-compose. To customise, copy `.env.example` to `.env` and edit it. The most useful knobs:
| Var | Default | Notes |
|---|---|---|
| `BACKEND` | `local` | `local` (default) or `aws` |
| `EMBEDDING_PROVIDER` | `local` | `local` (fastembed) / `bedrock` / `openai` |
| `LOCAL_EMBEDDING_MODEL` | `BAAI/bge-small-en-v1.5` | Any fastembed-supported model |
| `OPENSEARCH_ENDPOINT` | `http://opensearch:9200` | In compose; override for hosted OpenSearch |
| `MAX_DOCS_PER_TENANT` | `500` | Per-tenant quota |
| `RATE_LIMIT_PER_SECOND` | `10` | MCP endpoint rate limit (per tenant) |
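
As an illustration of how these variables might be consumed, the following sketch reads them from the environment and falls back to the defaults in the table. The `Settings` dataclass and `load_settings` function are hypothetical names, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    backend: str
    embedding_provider: str
    local_embedding_model: str
    opensearch_endpoint: str
    max_docs_per_tenant: int
    rate_limit_per_second: int


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), using the table's defaults."""
    env = os.environ if env is None else env
    backend = env.get("BACKEND", "local")
    if backend not in {"local", "aws"}:
        raise ValueError(f"BACKEND must be 'local' or 'aws', got {backend!r}")
    return Settings(
        backend=backend,
        embedding_provider=env.get("EMBEDDING_PROVIDER", "local"),
        local_embedding_model=env.get("LOCAL_EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
        opensearch_endpoint=env.get("OPENSEARCH_ENDPOINT", "http://opensearch:9200"),
        # Numeric values arrive as strings and are converted here.
        max_docs_per_tenant=int(env.get("MAX_DOCS_PER_TENANT", "500")),
        rate_limit_per_second=int(env.get("RATE_LIMIT_PER_SECOND", "10")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.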
See docs/AWS_DEPLOYMENT.md for the SAM template (Lambda + DynamoDB + SQS + S3 + OpenSearch + SES), cost estimate, and operational runbook.
PRs welcome. See CONTRIBUTING.md for the codebase tour and local dev setup.
make test # full pytest suite (BACKEND=local)
make test-aws # AWS-mocked suite
make up # docker compose up -d --build

The AGPL-3.0 license means that if you run a modified version as a hosted/SaaS service, you must publish your modifications under the same license. If that's a problem for your use case, please open an issue so we can discuss commercial licensing.
| | KnowledgeMCP | Typical RAG tools |
|---|---|---|
| Query cost | $0 (local embeddings + OpenSearch) | $0.01-0.10/query (LLM reranking) |
| Agent integration | Native MCP, plug and play | REST API + custom glue code |
| Self-hosted | `docker compose up`, no cloud account | Usually needs cloud APIs |
| Multi-tenant | Per-tenant isolation built-in | Single-tenant, bolt-on later |
| Latency | ~100 ms (no LLM in path) | 1-5 s (LLM reranking) |
- GitHub Discussions: questions, ideas, show-and-tell
- Issues: bug reports, feature requests
- FastMCP: the MCP server framework
- fastembed: ONNX-runtime embedding library
- OpenSearch: hybrid BM25 + kNN search
- Microsoft Learn MCP server team: for documenting hard-earned lessons that shaped the tenant-context-via-middleware design