---
title: Doc Mcp
emoji: 📃
colorFrom: yellow
colorTo: pink
sdk: gradio
sdk_version: 5.33.0
python_version: 3.13
app_file: app.py
pinned: false
license: mit
short_description: 'RAG on documentation for your agent'
---

# Doc-MCP 📚

> Transform GitHub documentation repositories into accessible MCP (Model Context Protocol) servers for AI agents

**Hackathon Track**: `mcp-server-track`

## 🎯 What is Doc-MCP?

Doc-MCP ingests markdown documentation from GitHub repositories and exposes it through an MCP server, giving AI agents direct access to documentation context. Point it at any GitHub repo with markdown docs and you get a Q&A interface powered by vector search.

## 🛠️ Available MCP Tools

### 📋 Documentation Query Tools

#### `get_available_docs_repo`

List all ingested repositories.

- **Returns**: Array of repository names that have been processed and are available for querying
- **Usage**: Get the list of documentation repositories before making queries

#### `make_query`

Search documentation with AI-powered semantic search.

- **Parameters**:
  - `repo` (string): Repository name to search in
  - `mode` (string): Search strategy, one of "default", "text_search", or "hybrid"
  - `query` (string): Natural language question about the documentation
- **Returns**: AI-generated response with source citations and metadata
- **Usage**: Ask questions about a specific documentation repository

### 📁 GitHub File Operations Tools

#### `list_repository_files`

Scan and list files in a GitHub repository.

- **Parameters**:
  - `repo_url` (string): GitHub repository URL or owner/repo format
  - `branch` (string, optional): Branch name (default: "main")
  - `extensions` (string, optional): Comma-separated file extensions (default: ".md,.mdx")
- **Returns**: JSON with file list and repository metadata
- **Usage**: Discover available documentation files before ingestion

#### `get_single_file`

Retrieve the content of a specific file from GitHub.

- **Parameters**:
  - `repo_url` (string): GitHub repository URL or owner/repo format
  - `file_path` (string): Path to the specific file in the repository
  - `branch` (string, optional): Branch name (default: "main")
- **Returns**: JSON with file content, metadata, and GitHub URLs
- **Usage**: Fetch individual documentation files for processing or review

#### `get_multiple_files`

Retrieve multiple files from GitHub in one request.

- **Parameters**:
  - `repo_url` (string): GitHub repository URL or owner/repo format
  - `file_paths_str` (string): Comma-separated list of file paths
  - `branch` (string, optional): Branch name (default: "main")
- **Returns**: JSON with all file contents, success/failure counts, and metadata
- **Usage**: Batch-fetch multiple documentation files in a single call

## ✨ Key Features

- **GitHub Integration**: Fetch markdown files directly from any GitHub repository
- **Vector Search**: Uses MongoDB Atlas with Nebius AI embeddings for semantic search
- **MCP Server**: Exposes documentation as MCP endpoints for AI agents
- **Smart Q&A**: Ask questions about documentation and get answers with source citations
- **Repository Management**: Track multiple repositories and their statistics

### 🎯 MCP Server Configuration

Add this configuration to your MCP client (Cursor, Windsurf, Cline):

```json
{
  "mcpServers": {
    "doc-mcp": {
      "url": "https://agents-mcp-hackathon-doc-mcp.hf.space/gradio_api/mcp/sse"
    }
  }
}
```

## 🚀 Quick Start

1. **Setup Environment**:

   ```bash
   # Clone and install
   git clone https://github.com/yourusername/doc-mcp.git
   cd doc-mcp
   uv sync

   # Configure environment
   cp .env.example .env
   # Add your GITHUB_API_KEY, NEBIUS_API_KEY and MONGODB_URI
   ```

2. **Run the App**:

   ```bash
   python main.py
   # Open http://localhost:7860
   ```

3. **Ingest Documentation**:
   - Enter a GitHub repo URL (e.g., `gradio-app/gradio`)
   - Select markdown files to process
   - Load files and generate vector embeddings

4. **Query Documentation**:
   - Select your repository
   - Ask questions about the documentation
   - Get answers with source citations

## Workflow

- Input GitHub URL
- Scan for markdown files
- Select files to process
- Generate embeddings and store them in the vector database
- Ask questions
- Search similar content
- Generate contextual answers
- Show sources and citations

## 🛠️ Technology Stack

- **Interface**: Gradio
- **Vector Store**: MongoDB Atlas with vector search
- **Embeddings**: Nebius AI (BAAI/bge-en-icl)
- **LLM**: Nebius LLM (Llama-3.3-70B-Instruct)
- **Document Processing**: LlamaIndex

## 📹 Demo Video

[Doc-MCP Demo - Transform GitHub Docs into AI-Accessible Knowledge](https://youtu.be/kVzUk4Y6tDA)

*Follow the link above to watch the full demo on YouTube*

---

**Transform your documentation into intelligent, accessible knowledge for AI agents!** 🚀
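
As a rough illustration of the tool parameters documented above, the sketch below prepares argument payloads for `make_query` and `get_multiple_files` on the client side. It does not contact the server. The helper names (`normalize_repo`, `make_query_args`, `get_multiple_files_args`) are hypothetical and not part of Doc-MCP; only the parameter names, the three search modes, and the "main" branch default come from the tool descriptions.

```python
from urllib.parse import urlparse

# Search strategies listed for the `mode` parameter of `make_query`.
VALID_MODES = {"default", "text_search", "hybrid"}


def normalize_repo(repo_url: str) -> str:
    """Reduce a GitHub URL or an owner/repo string to 'owner/repo' (hypothetical helper)."""
    text = repo_url.strip()
    if "://" in text:
        # Full URL: keep only the path component, e.g. "/owner/repo.git".
        text = urlparse(text).path
    elif text.startswith("github.com/"):
        text = text[len("github.com/"):]
    parts = [p for p in text.strip("/").split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected a GitHub URL or owner/repo, got {repo_url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{owner}/{repo}"


def make_query_args(repo: str, query: str, mode: str = "default") -> dict:
    """Build the argument payload for `make_query`, rejecting unknown modes."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    return {"repo": repo, "mode": mode, "query": query}


def get_multiple_files_args(repo_url: str, file_paths: list[str], branch: str = "main") -> dict:
    """Build the payload for `get_multiple_files`; paths are joined with commas."""
    return {
        "repo_url": normalize_repo(repo_url),
        "file_paths_str": ",".join(path.strip() for path in file_paths),
        "branch": branch,
    }


if __name__ == "__main__":
    print(make_query_args("gradio-app/gradio", "How do I launch an app?", mode="hybrid"))
    print(get_multiple_files_args("https://github.com/gradio-app/gradio", ["README.md", "guides/intro.md"]))
```

Validating `mode` and joining file paths before the call keeps malformed requests from reaching the server. Whether the server applies the same checks is not stated here, so treat this as a client-side convenience only.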