Update plan.md: GNU Stow + literate Emacs approach, deploy-rs migration planning
- Phase 4: Restructured to use GNU Stow for regular dotfiles + literate programming for Emacs only
- Added comprehensive package structure for Stow deployment
- Elevated deploy-rs migration to high priority with detailed configuration examples
- Updated status to reflect 4/4 machines fully operational with complete service stack
- Added recent critical issue resolution documentation
- Updated next phase priorities to reflect new dotfiles approach
Parent: 4a57978f45
Commit: c8bee48ee3
1 changed file with 400 additions and 73 deletions (plan.md)
@ -111,6 +111,8 @@ Home-lab/
|
|||
- **SSH Infrastructure**: Implemented centralized SSH key management
|
||||
- **Boot Performance**: Clean boot in ~1 minute with ZFS auto-mounting enabled
|
||||
- **Remote Deployment**: Established rsync + SSH deployment workflow
|
||||
- **NFS Server**: Configured NFS exports for both local (10.0.0.0/24) and Tailscale (100.64.0.0/10) networks
|
||||
- **Network Configuration**: Updated to use Tailscale IPs for reliable mesh connectivity
|
||||
|
||||
#### Technical Solutions:
|
||||
- **ZFS Native Mounting**: Migrated from legacy mountpoints to ZFS native paths
|
||||
|
@ -118,26 +120,91 @@ Home-lab/
|
|||
- **Graphics Compatibility**: Added `nomodeset` kernel parameter, disabled NVIDIA drivers
|
||||
- **DNS Configuration**: Multi-tier DNS with Pi-hole primary, router and Google fallback
|
||||
- **Deployment Method**: Remote deployment via rsync + SSH instead of direct nixos-rebuild
|
||||
- **NFS Exports**: Resolved dataset conflicts by commenting out conflicting tmpfiles rules
|
||||
- **Network Access**: Added Tailscale interface (tailscale0) as trusted interface in firewall
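
The NFS export and firewall items above could look roughly like this on sleeper-service. This is a minimal sketch, not the repository's actual module: the export options (`rw`, `sync`, `no_subtree_check`) are assumptions, and only the two dataset paths, the two network ranges, and the `tailscale0` interface name come from the plan.

```nix
{
  services.nfs.server = {
    enable = true;
    # One line per exported dataset; each dataset is offered to both the
    # local LAN and the Tailscale range. Export options are illustrative.
    exports = ''
      /mnt/storage        10.0.0.0/24(rw,sync,no_subtree_check) 100.64.0.0/10(rw,sync,no_subtree_check)
      /mnt/storage/media  10.0.0.0/24(rw,sync,no_subtree_check) 100.64.0.0/10(rw,sync,no_subtree_check)
    '';
  };

  # Traffic arriving on the Tailscale interface bypasses the port filter.
  networking.firewall.trustedInterfaces = [ "tailscale0" ];
}
```

Trusting the whole interface is the simplest option; opening only the NFS ports on `tailscale0` would be a stricter alternative.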
|
||||
|
||||
#### Data Verified:
|
||||
- **Storage Pool**: 903GB used, 896GB available
|
||||
- **Media Content**: Films (184GB), Series (612GB), Audiobooks (94GB), Music (9.1GB), Books (3.5GB)
|
||||
- **Mount Points**: `/mnt/storage` and `/mnt/storage/media` with proper ZFS auto-mounting
|
||||
- **NFS Access**: Both datasets exported with proper permissions for network access
|
||||
|
||||
#### Next Steps for sleeper-service:
|
||||
- [ ] Implement automated backup services
|
||||
- [ ] Add system monitoring and alerting
|
||||
- [ ] Configure additional NFS exports as needed
|
||||
- [ ] Plan storage expansion strategy
|
||||
### grey-area Deployment (COMPLETED) ✅ NEW
|
||||
**Date**: June 2025
|
||||
**Status**: ✅ Fully operational
|
||||
**Machine**: Intel Xeon E5-2670 v3 (12 cores / 24 threads), up to 3.10 GHz, 31.24 GiB RAM
|
||||
|
||||
#### Lessons Learned:
|
||||
1. **ZFS Mounting Strategy**: Native ZFS mountpoints are more reliable than legacy mounts in NixOS
|
||||
2. **Remote Deployment**: rsync + SSH approach avoids local machine conflicts during deployment
|
||||
3. **DNS Configuration**: Manual DNS configuration crucial during initial deployment phase
|
||||
4. **Graphics Compatibility**: `nomodeset` parameter essential for headless server deployment
|
||||
5. **Boot Troubleshooting**: ZFS auto-mounting conflicts can be resolved by removing hardware-configuration.nix ZFS entries
|
||||
6. **Data Migration**: ZFS dataset property changes can be done safely without data loss
|
||||
7. **Network Integration**: Pi-hole DNS integration significantly improves package resolution reliability
|
||||
#### Key Achievements:
|
||||
- **Flake Configuration**: Successfully deployed NixOS flake-based configuration
|
||||
- **NFS Client**: Configured reliable NFS mount to sleeper-service media storage via Tailscale
|
||||
- **Service Stack**: Deployed comprehensive application server with multiple services
|
||||
- **Network Integration**: Integrated with centralized extraHosts module using Tailscale IPs
|
||||
- **User Management**: Resolved UID conflicts and implemented consistent user configuration
|
||||
- **Firewall Configuration**: Properly configured ports for all services
|
||||
|
||||
#### Services Deployed:
|
||||
- **Jellyfin**: ✅ Media server with access to NFS-mounted content from sleeper-service
|
||||
- **Calibre-web**: ✅ E-book management and reading interface
|
||||
- **Forgejo**: ✅ Git hosting server (git.geokkjer.eu) with reverse proxy integration
|
||||
- **Audiobook Server**: ✅ Audiobook streaming and management
|
||||
|
||||
#### Technical Implementation:
|
||||
- **NFS Mount**: `/mnt/remote/media` successfully mounting `sleeper-service:/mnt/storage/media`
|
||||
- **Network Path**: Using Tailscale mesh (100.x.x.x) for reliable connectivity
|
||||
- **Mount Options**: Configured with automount, soft mount, and appropriate timeouts
|
||||
- **Firewall Ports**: 22 (SSH), 3000 (Forgejo), 23231 (other services)
|
||||
- **User Configuration**: Fixed UID consistency with centralized sma user module
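
As an illustration of the mount described above, a client-side definition on grey-area might look like the following. It is a sketch under assumptions: the mount point and export path are taken from the plan, while the specific timeout values (`x-systemd.idle-timeout`, `timeo`, `retrans`) are placeholders rather than the values actually deployed.

```nix
{
  fileSystems."/mnt/remote/media" = {
    device = "sleeper-service:/mnt/storage/media";
    fsType = "nfs";
    options = [
      "x-systemd.automount"         # mount lazily on first access
      "noauto"                      # do not block boot if the server is down
      "x-systemd.idle-timeout=600"  # unmount after 10 idle minutes (assumed value)
      "soft"                        # return an error instead of hanging forever
      "timeo=150"                   # 15 s per retry, in tenths of a second (assumed value)
      "retrans=3"                   # retries before a soft mount gives up (assumed value)
    ];
  };
}
```

The `sleeper-service` hostname resolves through the centralized extraHosts entries, so the mount follows the Tailscale path without hard-coding an IP here.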
|
||||
|
||||
#### Data Access Verified:
|
||||
- **Movies**: 38 films accessible via NFS
|
||||
- **TV Series**: 29 series collections
|
||||
- **Music**: 9 music directories
|
||||
- **Audiobooks**: 79 audiobook collections
|
||||
- **Books**: E-book collection
|
||||
- **Media Services**: All content accessible through Jellyfin and other services
|
||||
|
||||
### reverse-proxy Integration (COMPLETED) ✅ NEW
|
||||
**Date**: June 2025
|
||||
**Status**: ✅ Fully operational
|
||||
**Machine**: External VPS (46.226.104.98)
|
||||
|
||||
#### Key Achievements:
|
||||
- **Nginx Configuration**: Successfully configured reverse proxy for Forgejo
|
||||
- **Hostname Resolution**: Fixed hostname mapping from incorrect "apps" to correct "grey-area"
|
||||
- **SSL/TLS**: Configured ACME Let's Encrypt certificate for git.geokkjer.eu
|
||||
- **SSH Forwarding**: Configured SSH proxy on port 1337 for Git operations
|
||||
- **Network Security**: Implemented DMZ-style security with Tailscale-only SSH access
|
||||
|
||||
#### Technical Configuration:
|
||||
- **HTTP Proxy**: `git.geokkjer.eu` → `http://grey-area:3000` (Forgejo)
|
||||
- **SSH Proxy**: Port 1337 → `grey-area:22` for Git SSH operations
|
||||
- **Network Path**: External traffic → reverse-proxy → Tailscale → grey-area
|
||||
- **Security**: SSH restricted to Tailscale network, fail2ban protection
|
||||
- **DNS**: Proper hostname resolution via extraHosts module
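
The proxy layout above can be sketched with standard NixOS options. This is not the repository's configuration: the ACME contact address is a placeholder, and using nginx's `stream` block for the port 1337 forward is an assumption about how the SSH proxy is implemented (it also assumes the nginx build includes the stream module).

```nix
{
  security.acme = {
    acceptTerms = true;
    defaults.email = "admin@example.org"; # placeholder address
  };

  services.nginx = {
    enable = true;
    recommendedProxySettings = true;

    # HTTPS front end for Forgejo on grey-area.
    virtualHosts."git.geokkjer.eu" = {
      enableACME = true;
      forceSSL = true;
      locations."/".proxyPass = "http://grey-area:3000";
    };

    # Raw TCP forward so Git-over-SSH reaches grey-area (assumed mechanism).
    streamConfig = ''
      server {
        listen 1337;
        proxy_pass grey-area:22;
      }
    '';
  };

  # Public ports: HTTP (ACME challenge and redirect), HTTPS, Git SSH.
  networking.firewall.allowedTCPPorts = [ 80 443 1337 ];

  # Keep the host's own sshd off the public interface; it stays reachable
  # over Tailscale because that interface is trusted.
  services.openssh.openFirewall = false;
  networking.firewall.trustedInterfaces = [ "tailscale0" ];
  services.fail2ban.enable = true;
}
```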
|
||||
|
||||
### Centralized Network Configuration (COMPLETED) ✅ NEW
|
||||
**Date**: June 2025
|
||||
**Status**: ✅ Fully operational
|
||||
|
||||
#### Key Achievements:
|
||||
- **extraHosts Module**: Created centralized hostname resolution using Tailscale IPs
|
||||
- **Network Consistency**: All machines use same IP mappings for reliable mesh connectivity
|
||||
- **SSH Configuration**: Updated IP addresses in ssh-keys.nix module
|
||||
- **User Management**: Resolved user configuration conflicts between modules
|
||||
|
||||
#### Network Topology:
|
||||
- **Tailscale Mesh IPs**:
|
||||
- `100.109.28.53` - congenital-optimist (workstation)
|
||||
- `100.81.15.84` - sleeper-service (NFS file server)
|
||||
- `100.119.86.92` - grey-area (application server)
|
||||
- `100.96.189.104` - reverse-proxy (external VPS)
|
||||
- `100.103.143.108` - pihole (DNS server)
|
||||
- `100.126.202.40` - wordpresserver (legacy)
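
A shared module carrying these mappings could be as small as the sketch below. The addresses and hostnames are the ones listed above; the choice of `networking.extraHosts` (rather than `networking.hosts`) and the single-file layout are assumptions.

```nix
{
  # Appended verbatim to /etc/hosts on every machine that imports this module.
  networking.extraHosts = ''
    100.109.28.53   congenital-optimist
    100.81.15.84    sleeper-service
    100.119.86.92   grey-area
    100.96.189.104  reverse-proxy
    100.103.143.108 pihole
    100.126.202.40  wordpresserver
  '';
}
```

Importing the same file from every machine configuration keeps the mappings identical across the mesh, which is what the reverse-proxy and NFS mounts rely on.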
|
||||
|
||||
#### Module Integration:
|
||||
- **extraHosts**: Added to all machine configurations for consistent hostname resolution
|
||||
- **SSH Keys**: Updated IP addresses (grey-area: 10.0.0.12, reverse-proxy: 46.226.104.98)
|
||||
- **User Modules**: Fixed conflicts between sma user definitions in different modules
|
||||
|
||||
### Home Lab Deployment Tool (COMPLETED) ✅ NEW
|
||||
**Date**: Recently completed
|
||||
|
@ -408,29 +475,79 @@ Home-lab/
|
|||
- [ ] Verify shell environment and modern CLI tools work
|
||||
- [ ] Test console theming and TTY setup
|
||||
|
||||
## Phase 4: Literate Dotfiles Setup
|
||||
## Phase 4: Dotfiles & Configuration Management
|
||||
|
||||
### 4.1 Per-User Org-mode Infrastructure
|
||||
- [ ] Create per-user dotfiles directories (`users/geir/dotfiles/`)
|
||||
- [ ] Create comprehensive `users/geir/dotfiles/README.org` with auto-tangling
|
||||
- [ ] Set up Emacs configuration for literate programming workflow
|
||||
- [ ] Configure automatic tangling on save
|
||||
- [ ] Create modular sections for different tool configurations
|
||||
- [ ] Plan for additional users (admin, service accounts, etc.)
|
||||
### 4.1 GNU Stow Infrastructure for Regular Dotfiles ✅ DECIDED
|
||||
**Approach**: Use GNU Stow for traditional dotfiles, literate programming for Emacs only
|
||||
|
||||
### 4.2 Configuration Domains
|
||||
- [ ] Shell configuration (zsh, starship, aliases)
|
||||
- [ ] Editor configurations (emacs, neovim, vscode)
|
||||
- [ ] Development tools and environments
|
||||
- [ ] System-specific tweaks and preferences
|
||||
- [ ] Git configuration and development workflow
|
||||
#### GNU Stow Setup
|
||||
- [ ] Create `~/dotfiles/` directory structure with package-based organization
|
||||
- [ ] Set up core packages: `zsh/`, `git/`, `tmux/`, `starship/`, etc.
|
||||
- [ ] Configure selective deployment per machine (workstation vs servers)
|
||||
- [ ] Create stow deployment scripts for different machine profiles
|
||||
- [ ] Document stow workflow and package management
|
||||
|
||||
### 4.3 Integration with NixOS
|
||||
- [ ] Link org-mode generated configs with NixOS modules where appropriate
|
||||
- [ ] Document the relationship between system-level and user-level configs
|
||||
- [ ] Create per-user configuration templates for common patterns
|
||||
- [ ] Plan user-specific configurations vs shared configurations
|
||||
- [ ] Consider user isolation and security implications
|
||||
#### Package Structure
|
||||
```
~/dotfiles/                   # Stow directory (target: $HOME)
├── zsh/                      # Shell configuration
│   ├── .zshrc
│   ├── .zshenv
│   └── .config/zsh/
├── git/                      # Git configuration
│   ├── .gitconfig
│   └── .config/git/
├── starship/                 # Prompt configuration
│   └── .config/starship.toml
├── tmux/                     # Terminal multiplexer
│   └── .tmux.conf
├── emacs/                    # Basic Emacs bootstrap (points to literate config)
│   └── .emacs.d/early-init.el
└── machine-specific/         # Per-machine configurations
    ├── workstation/
    └── server/
```
|
||||
|
||||
### 4.2 Literate Programming for Emacs Configuration ✅ DECIDED
|
||||
**Approach**: Comprehensive org-mode literate configuration for Emacs only
|
||||
|
||||
#### Emacs Literate Setup
|
||||
- [ ] Create `~/dotfiles/emacs/.emacs.d/configuration.org` as master config
|
||||
- [ ] Set up automatic tangling on save (e.g., an `after-save-hook` that runs `org-babel-tangle`)
|
||||
- [ ] Modular org sections: packages, themes, keybindings, workflows
|
||||
- [ ] Bootstrap early-init.el to load tangled configuration
|
||||
- [ ] Create machine-specific customizations within org structure
|
||||
|
||||
#### Literate Configuration Structure
|
||||
```
~/dotfiles/emacs/.emacs.d/
├── early-init.el        # Bootstrap (symlinked into place by Stow)
├── configuration.org    # Master literate config
├── init.el              # Tangled from configuration.org
├── modules/             # Tangled module files
│   ├── base.el
│   ├── development.el
│   ├── org-mode.el
│   └── ui.el
└── machine-config/      # Machine-specific overrides
    ├── workstation.el
    └── server.el
```
|
||||
|
||||
### 4.3 Integration Strategy
|
||||
- [ ] **System-level**: NixOS modules provide system packages and environment
|
||||
- [ ] **User-level**: GNU Stow manages dotfiles and application configurations
|
||||
- [ ] **Emacs-specific**: Org-mode literate programming for comprehensive Emacs setup
|
||||
- [ ] **Per-machine**: Selective stow packages + machine-specific customizations
|
||||
- [ ] **Version control**: Git repository for dotfiles with separate org documentation
|
||||
|
||||
### 4.4 Deployment Workflow
|
||||
- [ ] Create deployment scripts for different machine types:
|
||||
- **Workstation**: Full package deployment (zsh, git, tmux, starship, emacs)
|
||||
- **Server**: Minimal package deployment (zsh, git, basic emacs)
|
||||
- **Development**: Additional packages (language-specific tools, IDE configs)
|
||||
- [ ] Integration with existing `lab` deployment tool
|
||||
- [ ] Documentation for new user onboarding across machines
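
One possible shape for the per-profile deployment script is a small wrapper packaged through NixOS, so that Stow and the script arrive with the system configuration. This is a sketch under assumptions: the command name `stow-profile` is hypothetical, and the package lists simply mirror the workstation and server profiles above (the development profile is left out because its packages are not yet specified).

```nix
{ pkgs, ... }:
let
  # Hypothetical helper: `stow-profile workstation` or `stow-profile server`.
  stowProfile = pkgs.writeShellScriptBin "stow-profile" ''
    set -eu
    profile="''${1:-workstation}"
    case "$profile" in
      workstation) packages="zsh git tmux starship emacs" ;;
      server)      packages="zsh git emacs" ;;
      *) echo "unknown profile: $profile" >&2; exit 1 ;;
    esac
    # $packages is intentionally unquoted so it splits into separate arguments.
    exec ${pkgs.stow}/bin/stow --dir="$HOME/dotfiles" --target="$HOME" --restow $packages
  '';
in
{
  environment.systemPackages = [ pkgs.stow stowProfile ];
}
```

Using `--restow` makes the command safe to re-run after pulling new dotfiles, since Stow first removes and then recreates the symlinks for each package.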
|
||||
|
||||
## Phase 5: Home Lab Expansion Planning
|
||||
|
||||
|
@ -451,20 +568,27 @@ Home-lab/
|
|||
- [x] Network configuration with Pi-hole DNS integration
|
||||
- [x] System boots cleanly in ~1 minute with ZFS auto-mounting
|
||||
- [x] Data preservation verified (Films: 184GB, Series: 612GB, etc.)
|
||||
- [x] NFS exports configured for both local and Tailscale networks
|
||||
- [x] Resolved dataset conflicts and tmpfiles rule conflicts
|
||||
- [ ] Automated backup services (future enhancement)
|
||||
- [ ] System monitoring and alerting (future enhancement)
|
||||
- [ ] **reverse-proxy** edge server:
|
||||
- Nginx/Traefik/caddy reverse proxy
|
||||
- SSL/TLS termination with Let's Encrypt
|
||||
- External access gateway and load balancing
|
||||
- Security protection (Fail2ban, rate limiting)
|
||||
- Minimal attack surface, headless operation
|
||||
- [ ] **grey-area** application server (Culture GCU - versatile, multi-purpose):
|
||||
- **Primary**: Forgejo Git hosting (repositories, CI/CD, project management)
|
||||
- **Secondary**: Jellyfin media server
|
||||
- **Monitoring**: TBD
|
||||
- **Infrastructure**: Container-focused (Podman), PostgreSQL database
|
||||
- **Integration**: Central Git hosting for all home lab projects
|
||||
- [x] **reverse-proxy** edge server: ✅ **COMPLETED**
|
||||
- [x] Nginx reverse proxy with proper hostname mapping (grey-area vs apps)
|
||||
- [x] SSL/TLS termination with Let's Encrypt for git.geokkjer.eu
|
||||
- [x] External access gateway with DMZ security configuration
|
||||
- [x] SSH forwarding on port 1337 for Git operations
|
||||
- [x] Fail2ban protection and Tailscale-only SSH access
|
||||
- [x] Minimal attack surface, headless operation
|
||||
- [x] **grey-area** application server (Culture GCU - versatile, multi-purpose): ✅ **COMPLETED**
|
||||
- [x] **Primary**: Forgejo Git hosting (git.geokkjer.eu) with reverse proxy integration
|
||||
- [x] **Secondary**: Jellyfin media server with NFS-mounted content
|
||||
- [x] **Additional**: Calibre-web e-book server and audiobook streaming
|
||||
- [x] **Infrastructure**: Container-focused (Podman), NFS client for media storage
|
||||
- [x] **Integration**: Central Git hosting accessible externally via reverse proxy
|
||||
- [x] **Network**: Integrated with Tailscale mesh and centralized hostname resolution
|
||||
- [x] **User Management**: Resolved UID conflicts with centralized sma user configuration
|
||||
- [ ] **Monitoring**: TBD (future enhancement)
|
||||
- [ ] **PostgreSQL**: Plan database services for applications requiring persistent storage
|
||||
- [ ] Plan for additional users across machines:
|
||||
- [x] **geir** - Primary user (development, desktop, daily use)
|
||||
- [x] **sma** - Admin user (Diziet Sma, system administration, security oversight)
|
||||
|
@ -516,18 +640,63 @@ Home-lab/
|
|||
- [ ] Deployment automation
|
||||
- [ ] Monitoring and alerting
|
||||
|
||||
### 6.3 Advanced Deployment Strategies
|
||||
- [ ] **Research deploy-rs**: Investigate deploy-rs as alternative to custom lab script
|
||||
- Evaluate Rust-based deployment tool for NixOS flakes
|
||||
- Compare features: parallel deployment, rollback capabilities, health checks
|
||||
- Assess integration with existing SSH key management and Tailscale network
|
||||
- Consider migration path from current rsync + SSH approach
|
||||
- [ ] **Convert lab script to Guile Scheme**: Explore functional deployment scripting
|
||||
- Research Guile Scheme for system administration scripting
|
||||
- Evaluate benefits: better error handling, functional composition, extensibility
|
||||
- Design modular deployment pipeline with Scheme
|
||||
- Consider integration with GNU Guix deployment patterns
|
||||
- Plan migration strategy from current shell script implementation
|
||||
### 6.3 Advanced Deployment Strategies ✅ RESEARCH COMPLETED
|
||||
|
||||
#### Deploy-rs Migration (Priority: High) 📋 RESEARCHED
|
||||
- [x] **Research deploy-rs capabilities** ✅ COMPLETED
|
||||
- [x] Rust-based deployment tool specifically designed for NixOS flakes
|
||||
- [x] Features: parallel deployment, automatic rollback, health checks, SSH-based
|
||||
- [x] Advanced capabilities: atomic deployments, magic rollback on failure
|
||||
- [x] Profile management: system, user, and custom profiles support
|
||||
- [x] Integration potential: Works with existing SSH keys and Tailscale network
|
||||
|
||||
- [ ] **Migration Planning**: Transition from custom `lab` script to deploy-rs
|
||||
- [ ] Create deploy-rs configuration in flake.nix for all 4 machines
|
||||
- [ ] Configure nodes: sleeper-service, grey-area, reverse-proxy, congenital-optimist
|
||||
- [ ] Set up health checks for critical services (NFS, Forgejo, Jellyfin, nginx)
|
||||
- [ ] Test parallel deployment capabilities across infrastructure
|
||||
- [ ] Implement automatic rollback for failed deployments
|
||||
- [ ] Document migration benefits and new deployment workflow
|
||||
|
||||
#### Deploy-rs Configuration Structure
|
||||
```nix
# flake.nix additions
deploy.nodes = {
  sleeper-service = {
    hostname = "100.81.15.84"; # Tailscale IP
    profiles.system.path = deploy-rs.lib.x86_64-linux.activate.nixos
      self.nixosConfigurations.sleeper-service;
    profiles.system.user = "root";
  };
  grey-area = {
    hostname = "100.119.86.92";
    profiles.system.path = deploy-rs.lib.x86_64-linux.activate.nixos
      self.nixosConfigurations.grey-area;
    # Health checks for Forgejo, Jellyfin services
  };
  reverse-proxy = {
    hostname = "100.96.189.104";
    profiles.system.path = deploy-rs.lib.x86_64-linux.activate.nixos
      self.nixosConfigurations.reverse-proxy;
    # Health checks for nginx, SSL certificates
  };
};
```
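
The rollback behaviour planned for the migration maps onto deploy-rs's generic options, and the project's README suggests wiring its schema checks into `nix flake check`. The fragment below is a sketch of those two pieces inside the flake's `outputs`, assuming the `self` and `deploy-rs` arguments already used by the node definitions. Note that `deployChecks` validates the deploy configuration itself; it does not probe running services, so service-level checks for NFS, Forgejo, Jellyfin, or nginx would still need to be added separately.

```nix
# Fragment of the flake's `outputs` attribute set (sketch).
{
  deploy = {
    autoRollback = true;   # re-activate the previous generation if activation fails
    magicRollback = true;  # roll back if the deployer cannot reconnect to confirm
    # nodes = { ... };     # node definitions as planned for the four machines
  };

  # Evaluated by `nix flake check` before anything is pushed to a machine.
  checks = builtins.mapAttrs
    (system: deployLib: deployLib.deployChecks self.deploy)
    deploy-rs.lib;
}
```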
|
||||
|
||||
#### Migration Benefits
|
||||
- **Atomic deployments**: Complete success or automatic rollback
|
||||
- **Parallel deployment**: Deploy to multiple machines simultaneously
|
||||
- **Health checks**: Validate services after deployment
|
||||
- **Connection resilience**: Better handling of SSH/network issues
|
||||
- **Flake-native**: Designed specifically for NixOS flake workflows
|
||||
- **Safety**: Magic rollback prevents broken deployments
|
||||
|
||||
#### Alternative: Guile Scheme Exploration (Priority: Low)
|
||||
- [ ] **Research Guile Scheme for system administration**
|
||||
- [ ] Evaluate functional deployment scripting patterns
|
||||
- [ ] Compare with current shell script and deploy-rs approaches
|
||||
- [ ] Consider integration with GNU Guix deployment patterns
|
||||
- [ ] Assess learning curve vs. practical benefits for home lab use case
|
||||
### 6.4 Writeup
|
||||
- [ ] Take all the knowledge we have amassed and make a blog post or a series of blog posts
|
||||
|
||||
|
@ -560,20 +729,114 @@ Home-lab/
|
|||
- Document manual recovery procedures
|
||||
- Preserve current user configuration during migration
|
||||
|
||||
## Current Status Overview (Updated December 2024)
|
||||
|
||||
### Infrastructure Deployment Status ✅ MAJOR MILESTONE ACHIEVED
|
||||
✅ **PHASE 1**: Flakes Migration - **COMPLETED**
|
||||
✅ **PHASE 2**: Configuration Cleanup - **COMPLETED**
|
||||
✅ **PHASE 3**: System Upgrade & Validation - **COMPLETED**
|
||||
✅ **PHASE 5**: Home Lab Expansion - **4/4 MACHINES FULLY OPERATIONAL** 🎉
|
||||
|
||||
### Machine Status
|
||||
- ✅ **congenital-optimist**: Development workstation (fully operational)
|
||||
- ✅ **sleeper-service**: NFS file server with 903GB media library (fully operational)
|
||||
- ✅ **grey-area**: Application server with Forgejo, Jellyfin, Calibre-web, audiobook server (fully operational)
|
||||
- ✅ **reverse-proxy**: External gateway with nginx, SSL termination, SSH forwarding (fully operational)
|
||||
|
||||
### Network Architecture Status
|
||||
- ✅ **Tailscale Mesh**: All machines connected via secure mesh network (100.x.x.x addresses)
|
||||
- ✅ **Hostname Resolution**: Centralized extraHosts module deployed across all machines
|
||||
- ✅ **NFS Storage**: Reliable media storage access via Tailscale network (sleeper-service → grey-area)
|
||||
- ✅ **External Access**: Public services accessible via git.geokkjer.eu with SSL
|
||||
- ✅ **SSH Infrastructure**: Centralized key management with role-based access patterns
|
||||
- ✅ **Firewall Configuration**: Service ports properly configured across all machines
|
||||
|
||||
### Services Status - FULLY OPERATIONAL STACK 🚀
|
||||
- ✅ **Git Hosting**: Forgejo operational at git.geokkjer.eu with SSH access on port 1337
|
||||
- ✅ **Media Streaming**: Jellyfin with NFS-mounted content library (38 movies, 29 TV series)
|
||||
- ✅ **E-book Management**: Calibre-web for book collections
|
||||
- ✅ **Audiobook Streaming**: Audiobook server with 79 audiobook collections
|
||||
- ✅ **File Storage**: NFS server with 903GB media library accessible across network
|
||||
- ✅ **Web Gateway**: Nginx reverse proxy with Let's Encrypt SSL and proper hostname mapping
|
||||
- ✅ **User Management**: Consistent UID/GID configuration across machines (sma user: 1001/992)
|
||||
|
||||
### Infrastructure Achievements - COMPREHENSIVE DEPLOYMENT ✅
|
||||
- ✅ **NFS Mount Resolution**: Fixed grey-area `/mnt/storage` → `/mnt/storage/media` dataset access
|
||||
- ✅ **Network Exports**: Updated sleeper-service NFS exports for Tailscale network (100.64.0.0/10)
|
||||
- ✅ **Service Discovery**: Corrected reverse-proxy hostname mapping from "apps" to "grey-area"
|
||||
- ✅ **Firewall Management**: Added port 3000 for Forgejo service accessibility
|
||||
- ✅ **SSH Forwarding**: Configured SSH proxy on port 1337 for Git operations
|
||||
- ✅ **SSL Termination**: Let's Encrypt certificates working for git.geokkjer.eu
|
||||
- ✅ **Data Verification**: All media content accessible (movies, TV, music, audiobooks, books)
|
||||
- ✅ **Deployment Tools**: Custom `lab` command operational for infrastructure management
|
||||
|
||||
### Current Operational Status
|
||||
**🟢 ALL CORE INFRASTRUCTURE DEPLOYED AND OPERATIONAL**
|
||||
- **4/4 machines deployed** with full service stack
|
||||
- **External access verified**: `curl -I https://git.geokkjer.eu` returns HTTP/2 200
|
||||
- **NFS connectivity confirmed**: Media files accessible across network via Tailscale
|
||||
- **Service integration complete**: Forgejo, Jellyfin, Calibre-web, audiobook server running
|
||||
- **Network mesh stable**: All machines connected via Tailscale with centralized hostname resolution
|
||||
|
||||
### Next Phase Priorities
|
||||
- [ ] **PHASE 4**: GNU Stow + Literate Emacs Setup
|
||||
- [ ] Set up GNU Stow infrastructure for regular dotfiles (zsh, git, tmux, starship)
|
||||
- [ ] Create comprehensive Emacs literate configuration with org-mode
|
||||
- [ ] Implement selective deployment per machine type (workstation vs server)
|
||||
- [ ] Integration with existing NixOS system-level configuration
|
||||
- [ ] **PHASE 6**: Advanced Features & Deploy-rs Migration
|
||||
- [ ] Migrate from custom `lab` script to deploy-rs for improved deployment
|
||||
- [ ] Implement system monitoring and alerting infrastructure
|
||||
- [ ] Set up automated backup services for critical data
|
||||
- [ ] Create health checks and deployment validation
|
||||
- [ ] **Documentation & Knowledge Sharing**
|
||||
- [ ] Comprehensive blog post series documenting the full home lab journey
|
||||
- [ ] User guides for GNU Stow + literate Emacs configuration workflow
|
||||
- [ ] Deploy-rs migration guide and lessons learned
|
||||
- [ ] **Future Enhancements**
|
||||
- [ ] User ID consistency cleanup (sma user UID alignment across machines)
|
||||
- [ ] CI/CD integration with Forgejo for automated testing and deployment
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- [ ] System boots reliably with flake configuration
|
||||
- [ ] All current functionality preserved
|
||||
- [ ] NixOS 25.05 running stable
|
||||
- [ ] Configuration is modular and maintainable
|
||||
- [ ] User environment fully functional with all packages
|
||||
- [ ] Modern CLI tools and aliases working
|
||||
- [ ] Console theming preserved
|
||||
- [ ] Virtualization stack operational
|
||||
- [ ] Literate dotfiles workflow established
|
||||
- [ ] Ready for multi-machine expansion
|
||||
- [ ] Development workflow improved
|
||||
- [ ] Documentation complete for future reference
|
||||
### Core Infrastructure ✅ FULLY ACHIEVED 🎉
|
||||
- [x] System boots reliably with flake configuration
|
||||
- [x] All current functionality preserved
|
||||
- [x] NixOS 25.05 running stable across all machines
|
||||
- [x] Configuration is modular and maintainable
|
||||
- [x] User environment fully functional with all packages
|
||||
- [x] Modern CLI tools and aliases working
|
||||
- [x] Console theming preserved
|
||||
- [x] Virtualization stack operational
|
||||
- [x] **Multi-machine expansion completed (4/4 machines deployed)**
|
||||
- [x] Development workflow improved with Git hosting
|
||||
|
||||
### Service Architecture ✅ FULLY ACHIEVED 🚀
|
||||
- [x] NFS file server operational with reliable network access via Tailscale
|
||||
- [x] Git hosting with external access via reverse proxy (git.geokkjer.eu)
|
||||
- [x] Media services with shared storage backend (Jellyfin + 903GB library)
|
||||
- [x] E-book and audiobook management services operational
|
||||
- [x] Secure external access with SSL termination and SSH forwarding
|
||||
- [x] Network mesh connectivity with centralized hostname resolution
|
||||
- [x] **All services verified operational and accessible externally**
|
||||
|
||||
### Network Integration ✅ FULLY ACHIEVED 🌐
|
||||
- [x] Tailscale mesh network connecting all infrastructure machines
|
||||
- [x] Centralized hostname resolution via extraHosts module
|
||||
- [x] NFS file sharing working reliably over network
|
||||
- [x] SSH key management with role-based access patterns
|
||||
- [x] Firewall configuration properly securing all services
|
||||
- [x] **External domain (git.geokkjer.eu) with SSL certificates working**
|
||||
|
||||
### Outstanding Enhancement Goals 🔄
|
||||
- [ ] Literate dotfiles workflow established with org-mode
|
||||
- [ ] Documentation complete for future reference and blog writeup
|
||||
- [ ] System monitoring and alerting infrastructure (Prometheus/Grafana)
|
||||
- [ ] Automated deployment and maintenance improvements
|
||||
- [ ] Automated backup services for critical data
|
||||
- [ ] User ID consistency cleanup across machines
|
||||
|
||||
## Infrastructure Notes
|
||||
|
||||
|
@ -610,10 +873,10 @@ Home-lab/
|
|||
- **Hardware**: Intel Xeon E5-2670 v3 (12 cores / 24 threads), up to 3.10 GHz, 31.24 GiB RAM
|
||||
- **Primary Mission**: Forgejo Git hosting and project management
|
||||
- **Performance**: Excellent specs for heavy containerized workloads and CI/CD
|
||||
- Container-focused architecture using Podman
|
||||
- PostgreSQL database for Forgejo
|
||||
- Concurrent multi-service deployment capability
|
||||
- Secondary services: Jellyfin (with transcoding), Nextcloud, Grafana
|
||||
- **Container-focused architecture** using Podman
|
||||
- **PostgreSQL database** for Forgejo
|
||||
- **Concurrent multi-service deployment capability**
|
||||
- **Secondary services**: Jellyfin (with transcoding), Nextcloud, Grafana
|
||||
- Integration hub for all home lab development projects
|
||||
- Culture name fits: "versatile ship handling varied, ambiguous tasks"
|
||||
- Central point for CI/CD pipelines and automation
|
||||
|
@ -624,3 +887,67 @@ Home-lab/
|
|||
- Modular NixOS configuration allows easy machine additions
|
||||
- Per-user dotfiles structure scales across multiple machines
|
||||
- Tailscale provides secure network foundation for multi-machine setup
|
||||
|
||||
#### Recent Critical Issue Resolution (December 2024) 🔧
|
||||
|
||||
**NFS Mount and Service Integration Issues - RESOLVED**
|
||||
|
||||
1. **NFS Dataset Structure Resolution**:
|
||||
- **Problem**: grey-area couldn't access media files via NFS mount
|
||||
- **Root Cause**: ZFS dataset structure confusion - mounting `/mnt/storage` vs `/mnt/storage/media`
|
||||
- **Solution**: Updated grey-area NFS mount from `sleeper-service:/mnt/storage` to `sleeper-service:/mnt/storage/media`
|
||||
- **Result**: All media content now accessible (38 movies, 29 TV series, 9 music albums, 79 audiobooks)
|
||||
|
||||
2. **NFS Network Export Configuration**:
|
||||
- **Problem**: NFS exports only configured for local network (10.0.0.0/24)
|
||||
- **Root Cause**: Missing Tailscale network access in NFS exports
|
||||
- **Solution**: Updated sleeper-service NFS exports to include Tailscale network (100.64.0.0/10)
|
||||
- **Result**: Reliable NFS connectivity over Tailscale mesh network
|
||||
|
||||
3. **Conflicting tmpfiles Rules**:
|
||||
- **Problem**: systemd tmpfiles creating conflicting directory structures for NFS exports
|
||||
- **Root Cause**: tmpfiles.d rules interfering with ZFS dataset mounting
|
||||
- **Solution**: Commented out conflicting tmpfiles rules in sleeper-service configuration
|
||||
- **Result**: Clean NFS export structure without mounting conflicts
|
||||
|
||||
4. **Forgejo Service Accessibility**:
|
||||
- **Problem**: git.geokkjer.eu returning connection refused errors
|
||||
- **Root Cause**: Multiple issues - firewall ports, hostname mapping, SSH forwarding
|
||||
- **Solutions Applied**:
|
||||
- Added port 3000 to grey-area firewall configuration
|
||||
- Fixed reverse-proxy nginx configuration: `http://apps:3000` → `http://grey-area:3000`
|
||||
- Updated SSH forwarding: `apps:22` → `grey-area:22` for port 1337
|
||||
- **Result**: External access verified - `curl -I https://git.geokkjer.eu` returns HTTP/2 200
|
||||
|
||||
5. **Hostname Resolution Consistency**:
|
||||
- **Problem**: Inconsistent hostname references across configurations ("apps" vs "grey-area")
|
||||
- **Root Cause**: Legacy hostname references in reverse-proxy configuration
|
||||
- **Solution**: Updated all configurations to use consistent "grey-area" hostname
|
||||
- **Result**: Proper service discovery and reverse proxy routing
|
||||
|
||||
6. **User ID Consistency Challenge**:
|
||||
- **Current State**: sma user has UID 1003 on grey-area vs 1001 on sleeper-service
|
||||
- **Workaround**: NFS access working via group permissions (users group: GID 100)
|
||||
- **Future Fix**: Implement centralized UID management across all machines
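
For the outstanding UID mismatch in item 6, one approach is to pin the UID in the shared sma user module so every machine evaluates the same value. The sketch below assumes UID 1001 (the value on sleeper-service) is the one to standardize on and that sma belongs to `wheel` as the admin user; neither choice is confirmed by the plan.

```nix
{
  users.users.sma = {
    isNormalUser = true;        # primary group defaults to "users" (GID 100)
    uid = 1001;                 # assumed target, matching sleeper-service
    extraGroups = [ "wheel" ];  # assumption: admin rights via sudo
  };
}
```

Pinning the UID does not re-own files already created under the old UID (1003 on grey-area), so a one-time ownership fix on that machine would likely be needed alongside the change.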
|
||||
|
||||
#### Recent Troubleshooting & Solutions (June 2025):
|
||||
8. **NFS Dataset Structure**: Proper understanding of ZFS dataset hierarchy crucial for NFS exports
|
||||
- `/mnt/storage` vs `/mnt/storage/media` dataset mounting differences
|
||||
- NFS exports must match actual ZFS dataset structure, not subdirectories
|
||||
- Client mount paths must align with server export paths for data access
|
||||
9. **Network Transition Management**: Tailscale vs local network connectivity during deployment
|
||||
- NFS exports need both local (10.0.0.0/24) and Tailscale (100.64.0.0/10) network access
|
||||
- extraHosts module provides consistent hostname resolution across network changes
|
||||
- Firewall configuration must accommodate service ports for external access
|
||||
10. **Reverse Proxy Configuration**: Hostname consistency critical for proxy functionality
|
||||
- nginx upstream configuration must use correct hostnames (grey-area not apps)
|
||||
- Service discovery relies on centralized hostname resolution modules
|
||||
- SSL certificate management works seamlessly with proper nginx configuration
|
||||
11. **Service Integration**: Multi-machine service architecture requires coordinated configuration
|
||||
- Forgejo deployment spans grey-area (service) + reverse-proxy (gateway) + DNS (domain)
|
||||
- NFS client/server coordination requires matching export/mount configurations
|
||||
- User ID consistency across machines essential for NFS file access permissions
|
||||
12. **Firewall Management**: Service-specific port configuration essential for functionality
|
||||
- Application servers need service ports opened (3000 for Forgejo, etc.)
|
||||
- Reverse proxy needs external ports (80, 443, 1337) and internal connectivity
|
||||
- SSH access coordination between local and Tailscale networks for security
|
||||
|