home-lab/machines/grey-area/services/ollama.nix
Geir Okkenhaug Jerstad cf11d447f4 🤖 Implement RAG + MCP + Task Master AI Integration for Intelligent Development Environment
MAJOR INTEGRATION: Foundation implementation of a Retrieval Augmented Generation (RAG) + Model Context Protocol (MCP) + Claude Task Master AI system for the NixOS home lab, creating an intelligent development environment with AI-powered fullstack web development assistance.

🏗️ ARCHITECTURE & CORE SERVICES:
• modules/services/rag-taskmaster.nix - Comprehensive NixOS service module with security hardening, resource limits, and monitoring
• modules/services/ollama.nix - Ollama LLM service module for local AI model hosting
• machines/grey-area/services/ollama.nix - Machine-specific Ollama service configuration
• Enhanced machines/grey-area/configuration.nix with Ollama service enablement

🤖 AI MODEL DEPLOYMENT:
• Local Ollama deployment with 3 specialized AI models:
  - llama3.3:8b (general purpose reasoning)
  - codellama:7b (code generation & analysis)
  - mistral:7b (creative problem solving)
• Privacy-first approach with completely local AI processing
• No external API dependencies or data sharing
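• Example local-only request once the service is up (a sketch; the model must already be pulled):
    curl -s http://127.0.0.1:11434/api/generate \
      -d '{"model": "mistral:7b", "prompt": "Hello, world!", "stream": false}'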

📚 COMPREHENSIVE DOCUMENTATION:
• research/RAG-MCP.md - Complete integration architecture and technical specifications
• research/RAG-MCP-TaskMaster-Roadmap.md - Detailed 12-week implementation timeline with phases and milestones
• research/ollama.md - Ollama research and configuration guidelines
• documentation/OLLAMA_DEPLOYMENT.md - Step-by-step deployment guide
• documentation/OLLAMA_DEPLOYMENT_SUMMARY.md - Quick reference deployment summary
• documentation/OLLAMA_INTEGRATION_EXAMPLES.md - Practical integration examples and use cases

🛠️ MANAGEMENT & MONITORING TOOLS:
• scripts/ollama-cli.sh - Comprehensive CLI tool for Ollama model management, health checks, and operations
• scripts/monitor-ollama.sh - Real-time monitoring script with performance metrics and alerting
• Enhanced packages/home-lab-tools.nix with AI tool references and utilities

👤 USER ENVIRONMENT ENHANCEMENTS:
• modules/users/geir.nix - Added ytmdesktop package for enhanced development workflow
• Integrated AI capabilities into user environment and toolchain

🎯 KEY CAPABILITIES IMPLEMENTED:
• Intelligent code analysis and generation across multiple languages
• Infrastructure-aware AI that understands NixOS home lab architecture
• Context-aware assistance for fullstack web development workflows
• Privacy-preserving local AI processing with enterprise-grade security
• Automated project management and task orchestration
• Real-time monitoring and health checks for AI services
• Scalable architecture supporting future AI model additions

🔒 SECURITY & PRIVACY FEATURES:
• Complete local processing - no external API calls
• Security hardening with restricted user permissions
• Resource limits and isolation for AI services
• Comprehensive logging and monitoring for security audit trails

📈 IMPLEMENTATION ROADMAP:
• Phase 1: Foundation & Core Services (Weeks 1-3) - COMPLETED
• Phase 2: RAG Integration (Weeks 4-6) - Ready for implementation
• Phase 3: MCP Integration (Weeks 7-9) - Architecture defined
• Phase 4: Advanced Features (Weeks 10-12) - Roadmap established

This integration transforms the home lab into an intelligent development environment where AI understands infrastructure, manages complex projects, and provides expert assistance while maintaining complete privacy through local processing.

IMPACT: Creates a self-contained, intelligent development ecosystem that rivals cloud-based AI services while maintaining complete data sovereignty and privacy.
2025-06-13 08:44:40 +02:00


# Ollama Service Configuration for Grey Area
#
# This service configuration deploys Ollama on the grey-area application server.
# Ollama provides local LLM hosting with an OpenAI-compatible API for development
# assistance, code review, and general AI tasks.
{
  config,
  lib,
  pkgs,
  ...
}: {
  # Import the home lab Ollama module
  imports = [
    ../../../modules/services/ollama.nix
  ];

  # Enable Ollama service with appropriate configuration for grey-area
  services.homelab-ollama = {
    enable = true;

    # Network configuration - localhost only for security by default
    host = "127.0.0.1";
    port = 11434;

    # Environment variables for optimal performance
    environmentVariables = {
      # Allow CORS from local network (adjust as needed)
      OLLAMA_ORIGINS = "http://localhost,http://127.0.0.1,http://grey-area.lan,http://grey-area";

      # Larger context window for development tasks
      OLLAMA_CONTEXT_LENGTH = "4096";

      # Allow multiple parallel requests
      OLLAMA_NUM_PARALLEL = "2";

      # Increase queue size for multiple users
      OLLAMA_MAX_QUEUE = "256";

      # Enable debug logging initially for troubleshooting
      OLLAMA_DEBUG = "1";
    };

    # Automatically download essential models
    models = [
      # General purpose model - good balance of size and capability
      "llama3.3:8b"

      # Code-focused model for development assistance
      "codellama:7b"

      # Fast, efficient model for quick queries
      "mistral:7b"
    ];
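
    # Models beyond this list can be pulled ad hoc from a shell on grey-area
    # without a rebuild (illustrative; any tag from the Ollama library works):
    #
    #   ollama pull <model>:<tag>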

    # Resource limits to prevent impact on other services
    resourceLimits = {
      # Limit memory usage to prevent OOM issues with Jellyfin/other services
      maxMemory = "12G";

      # Limit CPU usage to maintain responsiveness for other services
      maxCpuPercent = 75;
    };
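
    # Note: how these limits are enforced is defined by the homelab-ollama
    # module (typically systemd resource controls such as MemoryMax /
    # CPUQuota); the effective values can be checked on the running unit, e.g.:
    #   systemctl show <ollama unit> -p MemoryMax -p CPUQuota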

    # Enable monitoring and health checks
    monitoring = {
      enable = true;
      healthCheckInterval = "60s";
    };

    # Enable backup for custom models and configuration
    backup = {
      enable = true;
      destination = "/var/backup/ollama";
      schedule = "weekly"; # Weekly backup is sufficient for models
    };

    # Don't open firewall by default - use reverse proxy if external access needed
    openFirewall = false;

    # GPU acceleration (enable if grey-area has a compatible GPU)
    enableGpuAcceleration = false; # Set to true if NVIDIA/AMD GPU available
  };

  # Create backup directory with proper permissions
  systemd.tmpfiles.rules = [
    "d /var/backup/ollama 0755 root root -"
  ];

  # Optional: Create a simple web interface using a lightweight tool
  # This could be added later if desired for easier model management

  # Add useful packages for AI development
  environment.systemPackages = with pkgs; [
    # CLI clients for testing
    curl
    jq

    # Python packages for AI development (optional)
    (python3.withPackages (ps:
      with ps; [
        requests
        openai # For OpenAI-compatible API testing
      ]))
  ];
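
  # The bundled python3 environment can exercise Ollama through its
  # OpenAI-compatible endpoint. A minimal sketch (assumes the openai 1.x
  # client and Ollama's /v1 compatibility layer; the api_key value is
  # ignored by Ollama but required by the client):
  #
  #   from openai import OpenAI
  #   client = OpenAI(base_url="http://127.0.0.1:11434/v1", api_key="ollama")
  #   reply = client.chat.completions.create(
  #       model="mistral:7b",
  #       messages=[{"role": "user", "content": "Hello, world!"}],
  #   )
  #   print(reply.choices[0].message.content)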

  # Create a simple script for testing Ollama
  environment.etc."ollama-test.sh" = {
    text = ''
      #!/usr/bin/env bash
      # Simple test script for the Ollama service

      echo "Testing Ollama service..."

      # Test basic connectivity
      if curl -s http://localhost:11434/api/tags >/dev/null; then
        echo "OK: Ollama API is responding"
      else
        echo "ERROR: Ollama API is not responding"
        exit 1
      fi

      # List available models
      echo "Available models:"
      curl -s http://localhost:11434/api/tags | jq -r '.models[]?.name // "No models found"'

      # Simple generation test if models are available
      if curl -s http://localhost:11434/api/tags | jq -e '.models | length > 0' >/dev/null; then
        echo "Testing text generation..."
        model=$(curl -s http://localhost:11434/api/tags | jq -r '.models[0].name')
        response=$(curl -s -X POST http://localhost:11434/api/generate \
          -H "Content-Type: application/json" \
          -d "{\"model\": \"$model\", \"prompt\": \"Hello, world!\", \"stream\": false}" | \
          jq -r '.response // "No response"')
        echo "Response from $model: $response"
      else
        echo "No models available for testing"
      fi
    '';
    mode = "0755";
  };
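
  # The script above is installed as /etc/ollama-test.sh; run it after a
  # rebuild to sanity-check the deployment:
  #   bash /etc/ollama-test.sh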

  # Add logging configuration to help with debugging
  # (only applied if services.rsyslogd.enable = true)
  services.rsyslogd.extraConfig = ''
    # Ollama service logs
    if $programname == 'ollama' then /var/log/ollama.log
    & stop
  '';
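
  # Independently of rsyslog, the service logs remain available via journald;
  # filtering by the syslog identifier is usually sufficient, e.g.:
  #   journalctl -t ollama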

  # Firewall rule comments for documentation
  # To enable external access later, you would:
  # 1. Set services.homelab-ollama.openFirewall = true;
  # 2. Or configure a reverse proxy (recommended for production)

  # Example reverse proxy configuration (commented out):
  /*
  services.nginx = {
    enable = true;
    virtualHosts."ollama.grey-area.lan" = {
      listen = [
        { addr = "0.0.0.0"; port = 8080; }
      ];
      locations."/" = {
        proxyPass = "http://127.0.0.1:11434";
        proxyWebsockets = true;
        extraConfig = ''
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
        '';
      };
    };
  };
  */
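
  # With a proxy like the example above enabled (and port 8080 allowed through
  # the firewall), a LAN client could exercise the API via, e.g. (hostname and
  # port taken from the hypothetical vhost above):
  #   curl -s http://ollama.grey-area.lan:8080/api/tags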
}