geir/home-lab

Commit graph

Author	SHA1	Message	Date

Geir Okkenhaug Jerstad

9d8952c4ce

feat: Complete Ollama CPU optimization for TaskMaster AI

- Optimize Ollama service configuration for maximum CPU performance
  - Increase OLLAMA_NUM_PARALLEL from 2 to 4 workers
  - Increase OLLAMA_CONTEXT_LENGTH from 4096 to 8192 tokens
  - Add OLLAMA_KV_CACHE_TYPE=q8_0 for memory efficiency
  - Set OLLAMA_LLM_LIBRARY=cpu_avx2 for optimal CPU performance
  - Configure OpenMP threading with 8 threads and core binding
  - Add comprehensive systemd resource limits and CPU quotas
  - Remove incompatible NUMA policy setting

- Upgrade TaskMaster AI model ecosystem
  - Main model: qwen3:4b → qwen2.5-coder:7b (specialized coding model)
  - Research model: deepseek-r1:1.5b → deepseek-r1:7b (enhanced reasoning)
  - Fallback model: gemma3:4b-it-qat → llama3.3:8b (reliable general purpose)

- Create comprehensive optimization and management scripts
  - Add ollama-optimize.sh for system optimization and benchmarking
  - Add update-taskmaster-models.sh for TaskMaster configuration management
  - Include model installation, performance testing, and system info functions

- Update TaskMaster AI configuration
  - Configure optimized models with grey-area:11434 endpoint
  - Set performance parameters for 8192 context window
  - Add connection timeout and retry settings

- Fix flake configuration issues
  - Remove nested packages attribute in packages/default.nix
  - Fix package references in modules/users/geir.nix
  - Clean up obsolete package files

- Add comprehensive documentation
  - Document complete optimization process and results
  - Include performance benchmarking results
  - Provide deployment instructions and troubleshooting guide

Successfully deployed via deploy-rs, with an estimated 3-4x performance improvement.
All optimizations tested and verified on grey-area server (24-core Xeon, 31GB RAM).

2025-06-18 13:08:24 +02:00
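
For orientation, the Ollama settings listed in commit 9d8952c4ce could be expressed as a NixOS module along the following lines. This is a minimal sketch, not the repository's actual module: the four OLLAMA_* values and the 8 OpenMP threads come from the commit message, while the listen address, the OMP_PROC_BIND / OMP_PLACES choices, and the systemd limit values are assumptions made for illustration.

```nix
# Hypothetical NixOS module fragment for the Ollama service on grey-area.
# Values marked "assumed" do not appear in the commit message.
{ ... }:
{
  services.ollama = {
    enable = true;
    host = "0.0.0.0"; # assumed: listen on all interfaces so grey-area:11434 is reachable
    port = 11434;

    environmentVariables = {
      # Settings named in the commit message
      OLLAMA_NUM_PARALLEL = "4";       # raised from 2
      OLLAMA_CONTEXT_LENGTH = "8192";  # raised from 4096
      OLLAMA_KV_CACHE_TYPE = "q8_0";
      OLLAMA_LLM_LIBRARY = "cpu_avx2";

      # OpenMP threading: 8 threads with core binding
      OMP_NUM_THREADS = "8";
      OMP_PROC_BIND = "close"; # assumed binding policy
      OMP_PLACES = "cores";    # assumed placement
    };
  };

  # Resource limits applied to the generated systemd unit
  systemd.services.ollama.serviceConfig = {
    CPUQuota = "800%"; # assumed: roughly 8 cores of the 24-core host
    MemoryMax = "20G"; # assumed: leaves headroom on a 31GB machine
  };
}
```

Keeping the tuning in `environmentVariables` rather than in a wrapper script means the values are versioned with the host configuration and roll back together with it.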

Geir Okkenhaug Jerstad

bc9869cb67

feat: Add deploy-rs integration with basic configuration

- Add deploy-rs as flake input
- Configure deploy.nodes for all 4 machines (sleeper-service, grey-area, reverse-proxy, congenital-optimist)
- Include safety features: autoRollback, magicRollback, activation timeouts
- Add deploy-rs checks for validation
- Successfully tested dry-run deployment

This completes Tasks 1-3 from the deploy-rs integration roadmap.

2025-06-15 10:03:56 +02:00
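
The deploy-rs setup described in commit bc9869cb67 might look roughly like the flake below. It is a sketch under stated assumptions and shows only one of the four nodes: the node name and the autoRollback / magicRollback / timeout features are taken from the commit message, whereas the nixpkgs channel, the module path, the SSH user, and the timeout values are hypothetical.

```nix
# Hypothetical flake.nix excerpt; only grey-area is shown.
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable"; # assumed channel
    deploy-rs.url = "github:serokell/deploy-rs";
  };

  outputs = { self, nixpkgs, deploy-rs, ... }: {
    nixosConfigurations.grey-area = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./machines/grey-area/configuration.nix ]; # hypothetical path
    };

    deploy.nodes.grey-area = {
      hostname = "grey-area";
      profiles.system = {
        user = "root";
        sshUser = "geir"; # assumed
        path = deploy-rs.lib.x86_64-linux.activate.nixos
          self.nixosConfigurations.grey-area;

        # Safety features named in the commit message
        autoRollback = true;     # roll back if activation fails
        magicRollback = true;    # roll back if the host cannot confirm the deploy
        activationTimeout = 240; # seconds, assumed value
        confirmTimeout = 30;     # seconds, assumed value
      };
    };

    # Validation checks provided by deploy-rs
    checks = builtins.mapAttrs
      (system: deployLib: deployLib.deployChecks self.deploy)
      deploy-rs.lib;
  };
}
```

With a layout like this, a dry run comparable to the one mentioned in the commit could be started with `deploy --dry-activate .#grey-area`, and `nix flake check` would evaluate the deploy-rs checks.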

Geir Okkenhaug Jerstad

13114d7868

Configure Claude Task Master AI for VS Code MCP integration

- Updated .cursor/mcp.json to use local Nix-built Task Master binary
- Configured Task Master to use local Ollama models via OpenAI-compatible API
- Set up three models: qwen3:4b (main), deepseek-r1:1.5b (research), gemma3:4b-it-qat (fallback)
- Created comprehensive integration status documentation
- Task Master successfully running as MCP server with 23+ available tools
- Ready for VS Code/Cursor AI chat integration

2025-06-14 16:35:09 +02:00

3 commits