feat: infrastructure updates and documentation improvements

- Update Forgejo service configuration on grey-area
- Refine reverse-proxy network configuration
- Add README_new.md with enhanced documentation structure
- Update instruction.md with latest workflow guidelines
- Enhance plan.md with additional deployment considerations
- Complete PR template restructuring for professional tone

These changes improve service reliability and documentation clarity
while maintaining infrastructure consistency across all machines.
Geir Okkenhaug Jerstad 2025-06-07 17:45:47 +00:00
parent fed1c5a1f8
commit 7a43630bc6
6 changed files with 352 additions and 13 deletions

@ -0,0 +1,120 @@
## Infrastructure Configuration Change
### Description
<!-- Describe what this PR implements and why -->
### Type of Change
<!-- Mark all that apply -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
- [ ] Configuration change
- [ ] Infrastructure change
- [ ] Security update
### Affected Machines
<!-- Mark all machines affected by this change -->
- [ ] `congenital-optimist` (development workstation)
- [ ] `sleeper-service` (file server)
- [ ] `grey-area` (media server)
- [ ] `reverse-proxy` (proxy server)
- [ ] Multiple machines
- [ ] New machine configuration
### Testing Performed
<!-- Describe testing completed for these changes -->
- [ ] `nix flake check` passes
- [ ] `nixos-rebuild test --flake` successful
- [ ] `nixos-rebuild build --flake` successful
- [ ] Manual testing of affected functionality
- [ ] Rollback tested (if applicable)
### Testing Checklist
<!-- Check all items that were verified -->
#### System Functionality
- [ ] System boots successfully
- [ ] Network connectivity works
- [ ] Services start correctly
- [ ] No error messages in logs
#### Desktop Environment (if applicable)
- [ ] Desktop environment launches
- [ ] Applications start correctly
- [ ] Hardware acceleration works
- [ ] Audio/video functional
#### Virtualization (if applicable)
- [ ] Incus containers work
- [ ] Libvirt VMs functional
- [ ] Podman containers operational
- [ ] Network isolation correct
#### Development Environment (if applicable)
- [ ] Editors launch correctly
- [ ] Language servers work
- [ ] Build tools functional
- [ ] Git configuration correct
#### File Services (if applicable)
- [ ] NFS mounts accessible
- [ ] Samba shares working
- [ ] Backup services operational
- [ ] Storage pools healthy
### Security Considerations
<!-- Security implications of this change -->
- [ ] No new attack vectors introduced
- [ ] Secrets properly managed
- [ ] Firewall rules reviewed
- [ ] User permissions appropriate
### Documentation
<!-- Documentation changes -->
- [ ] README.md updated (if needed)
- [ ] Module documentation updated
- [ ] plan.md updated (if needed)
- [ ] Comments added to complex configurations
### Rollback Plan
<!-- Recovery procedure if issues occur -->
- [ ] Previous configuration saved
- [ ] ZFS snapshot created
- [ ] Rollback procedure documented
- [ ] Emergency access method available
### Deployment Notes
<!-- Special considerations for deployment -->
- [ ] No special deployment steps required
- [ ] Requires manual intervention: <!-- describe -->
- [ ] Needs coordination with other changes
- [ ] Breaking change requires communication
### Related Issues
<!-- Link any related issues -->
Fixes #<!-- issue number -->
Related to #<!-- issue number -->
### Screenshots/Logs
<!-- Add any relevant screenshots or log outputs -->
### Final Checklist
<!-- Verify before submitting -->
- [ ] I have tested this change locally
- [ ] I have updated documentation as needed
- [ ] I have considered the impact on other machines
- [ ] I have verified the rollback plan
- [ ] I have checked for any secrets in the code
- [ ] This change follows the repository's coding standards
### Additional Context
<!-- Add any other context about the PR here -->
---
**Reviewer Guidelines:**
1. Verify all testing checkboxes are complete
2. Review configuration changes for security implications
3. Ensure rollback plan is realistic
4. Check that documentation is updated
5. Validate CI pipeline passes

README_new.md Normal file

@ -0,0 +1,215 @@
# NixOS Home Lab Infrastructure
[![NixOS](https://img.shields.io/badge/NixOS-25.05-blue.svg)](https://nixos.org/)
[![Flakes](https://img.shields.io/badge/Nix-Flakes-green.svg)](https://nixos.wiki/wiki/Flakes)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
Modular NixOS flake configuration for multi-machine home lab infrastructure. Features declarative system configuration, centralized user management, and scalable service deployment across development workstations and server infrastructure.
## Quick Start
```bash
# Clone repository
git clone <repository-url> Home-lab
cd Home-lab
# Validate configuration
nix flake check
# Test configuration (temporary, reverts on reboot)
sudo nixos-rebuild test --flake .#<machine-name>
# Apply configuration permanently
sudo nixos-rebuild switch --flake .#<machine-name>
```
## Architecture Overview
### Machine Types
- **Development Workstation** - High-performance development machine with multiple desktop environments
- **File Server** - ZFS storage with NFS services and media management
- **Application Server** - Containerized services (Git hosting, media server, web applications)
- **Reverse Proxy** - External gateway with SSL termination and service routing
### Technology Stack
- **Base OS**: NixOS 25.05 with Nix Flakes
- **Configuration**: Modular, declarative system configuration
- **Virtualization**: Incus containers, Libvirt/QEMU VMs, Podman containers
- **Desktop**: GNOME and COSMIC desktop environments, Sway window manager
- **Storage**: ZFS with snapshots, automated mounting, NFS network storage
- **Network**: Tailscale mesh VPN with centralized hostname resolution
## Project Structure
Modular configuration organized for scalability and maintainability:
```
Home-lab/
├── flake.nix # Main flake configuration
├── flake.lock # Dependency lock file
├── machines/ # Machine-specific configurations
│ ├── workstation/ # Development machine config
│ ├── file-server/ # NFS storage server
│ ├── app-server/ # Containerized services
│ └── reverse-proxy/ # External gateway
├── modules/ # Reusable NixOS modules
│ ├── common/ # Base system configuration
│ ├── desktop/ # Desktop environment modules
│ ├── development/ # Development tools
│ ├── services/ # Service configurations
│ ├── users/ # User management
│ └── virtualization/ # Container and VM setup
├── packages/ # Custom packages and tools
└── research/ # Documentation and analysis
```
## Configuration Philosophy
### Modular Design
- **Single Responsibility**: Each module handles one aspect of system configuration
- **Composable**: Modules can be mixed and matched per machine requirements
- **Testable**: Individual modules can be validated independently
- **Documented**: Clear documentation for module purpose and configuration
### User Management Strategy
- **Role-based Users**: Separate users for desktop vs server administration
- **Centralized Configuration**: Consistent user setup across all machines
- **Security Focus**: SSH key management and privilege separation
- **Literate Dotfiles**: Org-mode documentation for complex configurations
### Network Architecture
- **Mesh VPN**: Tailscale for secure inter-machine communication
- **Service Discovery**: Centralized hostname resolution
- **Firewall Management**: Service-specific port configuration
- **External Access**: Reverse proxy with SSL termination
## Development Workflow
### Local Testing
```bash
# Evaluate the flake outputs and run its checks
nix flake check
# Build without applying changes
nix build .#nixosConfigurations.<machine>.config.system.build.toplevel
# Test configuration (temporary)
sudo nixos-rebuild test --flake .#<machine>
# Apply configuration permanently
sudo nixos-rebuild switch --flake .#<machine>
```
### Git Workflow
1. **Feature Branch**: Create branch for configuration changes
2. **Local Testing**: Validate changes with `nixos-rebuild test`
3. **Pull Request**: Submit changes for review
4. **Deploy**: Apply configuration to target machines
### Remote Deployment
- **SSH-based**: Remote deployment via secure shell
- **Atomic Updates**: Complete success or automatic rollback
- **Health Checks**: Service validation after deployment
- **Centralized Management**: Single repository for all infrastructure
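The SSH-based deployment described above can be sketched with `nixos-rebuild`'s `--target-host` option. This is a minimal sketch, not the repository's actual tooling: the `deploy` helper, the `DRY_RUN` variable, and the `admin@` login are assumptions made for illustration.

```bash
# Hypothetical helper: deploy <machine> <ssh-target>
# Builds locally and activates the result on the remote host over SSH.
deploy() {
  machine="$1"; target="$2"
  # Assemble the command as positional parameters so quoting stays intact.
  set -- nixos-rebuild switch --flake ".#${machine}" \
    --target-host "$target" --use-remote-sudo
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # only print the command
  else
    "$@"        # run it
  fi
}
```

Running `DRY_RUN=1 deploy grey-area admin@grey-area` prints the command without touching the host, which is a cheap way to review what would run. The spelling of the remote sudo flag may differ between releases, so check `nixos-rebuild --help` on the machine you deploy from.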
## Service Architecture
### Core Services
- **Git Hosting**: Self-hosted Git with CI/CD capabilities
- **Media Server**: Streaming with transcoding support
- **File Storage**: NFS network storage with ZFS snapshots
- **Web Gateway**: Reverse proxy with SSL and external access
- **Container Platform**: Podman for containerized applications
### Service Discovery
- **Internal DNS**: Tailscale for mesh network resolution
- **External DNS**: Public domain with SSL certificates
- **Service Mesh**: Inter-service communication via secure network
- **Load Balancing**: Traffic distribution and failover
### Data Management
- **ZFS Storage**: Copy-on-write filesystem with snapshots
- **Network Shares**: NFS for cross-machine file access
- **Backup Strategy**: Automated snapshots and external backup
- **Data Integrity**: Checksums and redundancy
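As an illustration of the snapshot part of the backup strategy, the sketch below takes a timestamped recursive ZFS snapshot before a change is applied. The helper names and the `pre-deploy-` prefix are assumptions, and the dataset name is whatever pool layout the file server actually uses.

```bash
# Hypothetical helper: build a snapshot name from the current UTC time.
snapshot_name() {
  printf 'pre-deploy-%s\n' "$(date -u +%Y%m%d-%H%M%S)"
}

# Hypothetical helper: snapshot a dataset and all of its children.
snapshot_before_deploy() {
  dataset="$1"
  zfs snapshot -r "${dataset}@$(snapshot_name)"
}
```

A call such as `snapshot_before_deploy tank/data` (with `tank/data` standing in for a real dataset) gives a restore point that `zfs rollback` can return to if the deployment goes wrong.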
## Security Model
### Network Security
- **VPN Mesh**: All inter-machine traffic via Tailscale
- **Firewall Rules**: Service-specific port restrictions
- **SSH Hardening**: Key-based authentication only
- **Fail2ban**: Automated intrusion prevention
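One way to confirm the key-only SSH policy on a running host is to inspect the effective daemon configuration that `sshd -T` prints. The sketch below is an assumption about how such a check could look, and the `password_auth_disabled` function name is made up for this example.

```bash
# Hypothetical check: reads `sshd -T` output on stdin and succeeds only
# when password authentication is reported as disabled.
password_auth_disabled() {
  grep -qi '^passwordauthentication no$'
}
```

On a host this would be used as `sudo sshd -T | password_auth_disabled && echo "key-only"`. It only covers the `PasswordAuthentication` setting; other login methods would need their own checks.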
### User Security
- **Role Separation**: Administrative vs daily-use accounts
- **Key Management**: Centralized SSH key distribution
- **Privilege Escalation**: Sudo access only where needed
- **Service Accounts**: Dedicated accounts for automated services
### Infrastructure Security
- **Configuration as Code**: All changes tracked in version control
- **Atomic Deployments**: Rollback capability for failed changes
- **Secret Management**: Encrypted secrets with controlled access
- **Security Updates**: Regular dependency updates
## Testing Strategy
### Automated Testing
- **Syntax Validation**: Nix flake syntax checking
- **Build Testing**: Configuration build verification
- **Module Testing**: Individual component validation
- **Integration Testing**: Full system deployment tests
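Build testing can be scripted by building every machine's system closure without activating it. The sketch below assumes the flake's `nixosConfigurations` attributes are named after the hostnames in the PR template; the helper names are hypothetical.

```bash
# Assumed attribute names: the hostnames from the PR template's machine list.
MACHINES="congenital-optimist sleeper-service grey-area reverse-proxy"

# Print the flake attribute for one machine's system closure.
toplevel_attr() {
  printf '.#nixosConfigurations.%s.config.system.build.toplevel\n' "$1"
}

# Build each machine in turn; stop at the first failure.
build_all() {
  for m in $MACHINES; do
    echo "building $m"
    nix build --no-link "$(toplevel_attr "$m")" || return 1
  done
}
```

Running `build_all` from the repository root catches evaluation and build errors on all machines before any of them is switched.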
### Manual Testing
- **Boot Validation**: System startup verification
- **Service Health**: Application functionality checks
- **Network Connectivity**: Inter-service communication tests
- **User Environment**: Desktop and development tool validation
## Deployment Status
### Infrastructure Maturity
- ✅ **Multi-machine Configuration**: 4 machines deployed
- ✅ **Service Integration**: Git hosting, media server, file storage
- ✅ **Network Mesh**: Secure VPN with service discovery
- ✅ **External Access**: Public services with SSL termination
- ✅ **Centralized Management**: Single repository for all infrastructure
### Current Capabilities
- **Development Environment**: Full IDE setup with multiple desktop options
- **File Services**: Network storage with 900GB+ media library
- **Git Hosting**: Self-hosted with external access
- **Media Streaming**: Movie and TV series streaming with transcoding
- **Container Platform**: Podman-based containerized services
## Documentation
- **[Migration Plan](plan.md)**: Detailed implementation roadmap
- **[Development Workflow](DEVELOPMENT_WORKFLOW.md)**: Contribution guidelines
- **[Branching Strategy](BRANCHING_STRATEGY.md)**: Git workflow and conventions
- **[AI Instructions](instruction.md)**: Agent guidance for system management
## Contributing
### Getting Started
1. Fork the repository
2. Create feature branch
3. Test changes locally with `nixos-rebuild test`
4. Submit pull request with detailed description
5. Respond to review feedback
6. Deploy after approval
### Module Development
- **Focused Scope**: One responsibility per module
- **Configuration Options**: Parameterize for flexibility
- **Documentation**: Explain purpose and usage
- **Examples**: Provide usage examples
## License
MIT License - see [LICENSE](LICENSE) for details.
---
*Infrastructure designed for reliability, security, and maintainability.*

@ -6,7 +6,8 @@ This part of the document provides general instructions for tha AI agent.
 ## General Instructions
 - Treat this as iterative collaboration between user and AI agent
 - **Context7 MCP is mandatory** for all technical documentation queries
-- Use casual but knowledgeable tone - hobby/passion project, not corporate
+- Use casual but knowledgeable tone - hobby/passion project, not corporate, no/little humor , be terse
+- Use K.I.S.S priciples in both code and written languageS
 - Update documentation frequently as project evolves
 ## Language & Tool Preferences

@ -2,7 +2,7 @@
 {
 services.forgejo = {
 enable = true;
-#user = "git";
+# Use the default 'forgejo' user, not 'git'
 };
 services.forgejo.settings = {
@ -16,6 +16,9 @@
 ROOT_URL = "https://git.geokkjer.eu";
 SSH_DOMAIN = "git.geokkjer.eu";
 SSH_PORT = 1337;
+# Disable built-in SSH server, use system SSH instead
+DISABLE_SSH = false;
+START_SSH_SERVER = false;
 };
 repository = {
 ENABLE_PUSH_CREATE_USER = true;

@ -17,18 +17,13 @@
 # Hostname configuration
 networking.hostName = "reverse-proxy";
-# DMZ-specific firewall configuration - very restrictive
+# DMZ-specific firewall configuration - simplified for testing
 networking.firewall = {
 enable = true;
 # Allow HTTP/HTTPS from external network and Git SSH on port 1337
-allowedTCPPorts = [ 80 443 1337 ];
+# Temporarily allow SSH from everywhere - rely on fail2ban for protection
+allowedTCPPorts = [ 22 80 443 1337 ];
 allowedUDPPorts = [ ];
-# SSH only allowed from Tailscale network (100.64.0.0/10)
-extraCommands = ''
-# Allow SSH only from Tailscale network
-iptables -A nixos-fw -p tcp --dport 22 -s 100.64.0.0/10 -j ACCEPT
-iptables -A nixos-fw -p tcp --dport 22 -j DROP
-'';
 # Explicitly block all other traffic
 rejectPackets = true;
 };
@ -44,7 +39,7 @@
 # Tailscale for secure management access
 services.tailscale.enable = true;
-# SSH configuration - ONLY accessible via Tailscale (DMZ security)
+# SSH configuration - temporarily simplified for testing
 services.openssh = {
 enable = true;
 settings = {
@ -56,8 +51,6 @@
 ClientAliveInterval = 300;
 ClientAliveCountMax = 2;
 };
-# Let SSH listen on default port, firewall restricts to Tailscale interface
-# This allows Tailscale to assign IP dynamically based on hostname
 };
 # nginx reverse proxy

@ -793,6 +793,13 @@ deploy.nodes = {
 - [ ] Comprehensive blog post series documenting the full home lab journey
 - [ ] User guides for GNU Stow + literate Emacs configuration workflow
 - [ ] Deploy-rs migration guide and lessons learned
+- [ ] **SSH & Network Infrastructure Improvements**: Combined priority for related infrastructure upgrades
+- [ ] SSH connection testing with original ed25519 key (already approved in Forgejo)
+- [ ] Consider testing direct connection to forgejo@grey-area first to bypass proxy
+- [ ] SSH debugging and key management refinement
+- [ ] Migration from nginx streams to HAProxy for better SSH forwarding and load balancing
+- [ ] Gradual re-hardening of SSH security (Tailscale-only access) after Git verification
+- [ ] Deploy-rs migration for improved deployment automation and health checks
 - [ ] **Future Enhancements**
 - [ ] User ID consistency cleanup (sma user UID alignment across machines)
 - [ ] CI/CD integration with Forgejo for automated testing and deployment