# Product Requirements Document: Guile Home Lab Tool with MCP Integration ## Executive Summary Migrate the existing Bash-based home lab management tool to GNU Guile Scheme and implement a Model Context Protocol (MCP) server for seamless VS Code/GitHub Copilot integration. This migration will provide improved maintainability, error handling, and advanced AI-assisted operations through direct IDE integration. ## Project Scope ### Primary Objectives 1. **Core Tool Migration**: Replace the current Bash `lab` command with a functional equivalent written in GNU Guile Scheme 2. **MCP Server Implementation**: Create a Model Context Protocol server that exposes home lab operations to AI assistants 3. **VS Code Extension**: Develop a TypeScript extension that integrates the MCP server with VS Code and GitHub Copilot 4. **Enhanced Functionality**: Add real-time monitoring, infrastructure discovery, and AI-assisted operations 5. **NixOS Integration**: Ensure seamless integration with existing NixOS infrastructure and deployment workflows ### Secondary Objectives 1. **Advanced Error Handling**: Implement robust error recovery and reporting 2. **Concurrent Operations**: Support parallel deployment and monitoring tasks 3. **Web Interface**: Optional web dashboard for infrastructure overview 4. **Plugin Architecture**: Extensible system for adding new capabilities ## Current System Analysis ### Existing Infrastructure - **Machines**: congenital-optimist (local), sleeper-service (NFS), grey-area (Git), reverse-proxy - **Services**: Ollama AI (8B/7B models), Forgejo, Jellyfin, Calibre-web, SearXNG, NFS - **Deployment**: NixOS flakes with manual SSH/rsync deployment - **Monitoring**: Basic connectivity checking via SSH ### Current Tool Capabilities - Multi-machine deployment (boot/test/switch modes) - Infrastructure status monitoring - SSH-based remote operations - Flake updates and hybrid deployment strategies - Color-coded logging and error reporting ## Technical Requirements ### Architecture Requirements 1. **Functional Programming**: Pure functions for core logic, side effects isolated 2. **Module Structure**: Clean separation of concerns (lab/, mcp/, utils/) 3. **Data-Driven Design**: Configuration and state as immutable data structures 4. **Protocol Compliance**: Full MCP 2024-11-05 specification support 5. **Error Resilience**: Graceful degradation and automatic recovery ### Performance Requirements 1. **Startup Time**: < 500ms for basic commands 2. **Deployment Speed**: Maintain or improve current deployment times 3. **Memory Usage**: < 50MB baseline, < 200MB during operations 4. **Concurrent Operations**: Support 3+ parallel machine deployments 5. **Response Time**: MCP requests < 100ms for simple operations ### Compatibility Requirements 1. **NixOS Integration**: Native integration with NixOS flakes and services 2. **Existing Workflows**: Drop-in replacement for current `lab` command 3. **SSH Configuration**: Use existing SSH keys and connection patterns 4. **Tool Dependencies**: Leverage existing system tools (nixos-rebuild, ssh, etc.) ## Functional Specifications ### Core Tool Features #### 1. Machine Management - **Discovery**: Automatic detection of home lab machines - **Health Monitoring**: Real-time status checking and metrics collection - **Deployment**: Multiple deployment strategies (local, SSH, deploy-rs) - **Configuration**: Machine-specific configurations and role definitions #### 2. Service Operations - **Service Discovery**: Automatic detection of running services - **Status Monitoring**: Health checks and performance metrics - **Log Management**: Centralized log collection and analysis - **Backup Coordination**: Automated backup and restore operations #### 3. Infrastructure Operations - **Network Topology**: Discovery and visualization of network structure - **Security Scanning**: Automated security checks and vulnerability assessment - **Resource Monitoring**: CPU, memory, disk, and network utilization - **Performance Analysis**: Historical metrics and trend analysis ### MCP Server Features #### 1. Core MCP Tools - `deploy_machine`: Deploy configurations to specific machines - `check_infrastructure`: Comprehensive infrastructure health check - `monitor_services`: Real-time service monitoring and alerting - `update_system`: System updates with rollback capability - `backup_data`: Automated backup operations - `restore_system`: System restore from backups #### 2. Resource Endpoints - `homelab://machines/{machine}`: Machine configuration and status - `homelab://services/{service}`: Service details and logs - `homelab://network/topology`: Network structure and connectivity - `homelab://metrics/{type}`: Performance and monitoring data - `homelab://logs/{service}`: Centralized log access #### 3. AI Assistant Integration - **Context Awareness**: AI understands current infrastructure state - **Intelligent Suggestions**: Proactive recommendations for optimization - **Natural Language Operations**: Execute commands via natural language - **Documentation Integration**: Automatic documentation generation ### VS Code Extension Features #### 1. MCP Client Implementation - **Connection Management**: Robust MCP server connection handling - **Request Routing**: Efficient request/response handling - **Error Recovery**: Automatic reconnection and retry logic - **Performance Monitoring**: Track MCP server performance metrics #### 2. User Interface - **Status Bar Integration**: Real-time infrastructure status display - **Command Palette**: Quick access to home lab operations - **Explorer Integration**: File tree integration for configuration files - **Output Channels**: Structured output display for operations #### 3. GitHub Copilot Integration - **Context Enhancement**: Provide infrastructure context to Copilot - **Code Suggestions**: Infrastructure-aware code completions - **Documentation**: Automated documentation generation - **Best Practices**: Enforce home lab coding standards ## Implementation Architecture ### Module Structure ``` lab/ # Core home lab functionality ├── core/ # Essential operations ├── machines/ # Machine-specific operations ├── deployment/ # Deployment strategies ├── monitoring/ # Health and performance monitoring └── config/ # Configuration management mcp/ # Model Context Protocol implementation ├── server/ # Core MCP server ├── tools/ # MCP tool implementations └── resources/ # MCP resource endpoints utils/ # Shared utilities ├── ssh/ # SSH operations ├── json/ # JSON processing ├── logging/ # Logging and output └── config/ # Configuration parsing ``` ### Key Libraries and Dependencies 1. **Tier 1 (Essential)**: - `guile-ssh`: SSH operations and remote execution - `guile-json`: JSON processing for MCP protocol - `scheme-json-rpc`: JSON-RPC implementation - `guile-webutils`: HTTP server functionality 2. **Tier 2 (Enhanced Features)**: - `guile-websocket`: WebSocket support for real-time updates - `artanis`: Web framework for dashboard - `guile-curl`: HTTP client operations - `guile-config`: Advanced configuration management 3. **Tier 3 (Future Enhancements)**: - `guile-daemon`: Background process management - `guile-ncurses`: Terminal user interface - `g-wrap`: C library integration - `guile-dbi`: Database connectivity ## Quality Requirements ### Testing Strategy 1. **Unit Testing**: Test individual functions with srfi-64 2. **Integration Testing**: Test module interactions 3. **End-to-End Testing**: Full workflow validation 4. **Performance Testing**: Benchmark against current tool 5. **MCP Compliance**: Protocol conformance testing ### Security Requirements 1. **SSH Key Management**: Secure handling of authentication credentials 2. **Input Validation**: Comprehensive input sanitization 3. **Privilege Separation**: Minimal privilege operations 4. **Audit Logging**: Complete operation audit trail 5. **Secure Communication**: TLS for MCP protocol when needed ### Documentation Requirements 1. **API Documentation**: Complete MCP tool and resource documentation 2. **User Guide**: Comprehensive usage instructions 3. **Developer Guide**: Architecture and extension documentation 4. **Migration Guide**: Transition instructions from Bash tool 5. **Troubleshooting**: Common issues and solutions ## Success Criteria ### Functional Success - [ ] Complete feature parity with existing Bash tool - [ ] MCP server passes all protocol compliance tests - [ ] VS Code extension successfully integrates with GitHub Copilot - [ ] 99.9% uptime for MCP server operations - [ ] Zero data loss during migration ### Performance Success - [ ] 20% improvement in deployment speed - [ ] 50% reduction in error rates - [ ] 30% faster infrastructure status checking - [ ] Sub-second response times for common operations - [ ] Support for 5+ concurrent operations ### User Experience Success - [ ] Seamless transition for existing users - [ ] Intuitive VS Code integration - [ ] Comprehensive error messages and recovery suggestions - [ ] Self-documenting configuration options - [ ] Minimal learning curve for basic operations ## Risk Assessment ### Technical Risks 1. **Learning Curve**: Guile Scheme adoption may slow initial development 2. **Library Maturity**: Some Guile libraries may lack features or have bugs 3. **Performance**: Interpreted language may impact performance for intensive operations 4. **Integration Complexity**: MCP protocol implementation complexity ### Mitigation Strategies 1. **Incremental Migration**: Gradual replacement of Bash functionality 2. **Fallback Mechanisms**: Ability to call existing Bash tools when needed 3. **Performance Monitoring**: Continuous benchmarking and optimization 4. **Community Support**: Leverage Guile community resources and documentation ## Timeline and Milestones ### Phase 1: Foundation (Weeks 1-2) - Core module structure implementation - Basic SSH and deployment functionality - Initial testing framework setup ### Phase 2: Core Features (Weeks 3-4) - Complete machine management implementation - Service monitoring and health checks - Configuration management system ### Phase 3: MCP Integration (Weeks 5-6) - MCP server protocol implementation - Core MCP tools development - Resource endpoint implementation ### Phase 4: VS Code Extension (Weeks 7-8) - TypeScript extension development - MCP client implementation - GitHub Copilot integration ### Phase 5: Enhancement (Weeks 9-10) - Advanced monitoring features - Web dashboard implementation - Performance optimization ### Phase 6: Production (Weeks 11-12) - Comprehensive testing and validation - Documentation completion - Migration and deployment ## Maintenance and Support ### Ongoing Requirements 1. **Security Updates**: Regular security patch integration 2. **Library Updates**: Keep dependencies current 3. **Feature Enhancement**: Continuous improvement based on usage 4. **Bug Fixes**: Rapid response to issues 5. **Documentation**: Keep documentation current with changes ### Long-term Evolution 1. **Plugin Ecosystem**: Support for third-party extensions 2. **Cloud Integration**: Support for cloud-based infrastructure 3. **Multi-User Support**: Team collaboration features 4. **AI Enhancement**: Advanced AI-assisted operations 5. **Mobile Support**: Mobile access to infrastructure management