feat: infrastructure updates and documentation improvements

- Update Forgejo service configuration on grey-area
- Refine reverse-proxy network configuration
- Add README_new.md with enhanced documentation structure
- Update instruction.md with latest workflow guidelines
- Enhance plan.md with additional deployment considerations
- Complete PR template restructuring for professional tone

These changes improve service reliability and documentation clarity while maintaining infrastructure consistency across all machines.

parent fed1c5a1f8
commit 7a43630bc6
6 changed files with 352 additions and 13 deletions
120
.github/PULL_REQUEST_TEMPLATE/home-lab-config.md
vendored
@@ -0,0 +1,120 @@
+## Infrastructure Configuration Change
+
+### Description
+<!-- Describe what this PR implements and why -->
+
+### Type of Change
+<!-- Mark all that apply -->
+- [ ] Bug fix (non-breaking change that fixes an issue)
+- [ ] New feature (non-breaking change that adds functionality)
+- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+- [ ] Documentation update
+- [ ] Configuration change
+- [ ] Infrastructure change
+- [ ] Security update
+
+### Affected Machines
+<!-- Mark all machines affected by this change -->
+- [ ] `congenital-optimist` (development workstation)
+- [ ] `sleeper-service` (file server)
+- [ ] `grey-area` (media server)
+- [ ] `reverse-proxy` (proxy server)
+- [ ] Multiple machines
+- [ ] New machine configuration
+
+### Testing Performed
+<!-- Describe testing completed for these changes -->
+- [ ] `nix flake check` passes
+- [ ] `nixos-rebuild test --flake` successful
+- [ ] `nixos-rebuild build --flake` successful
+- [ ] Manual testing of affected functionality
+- [ ] Rollback tested (if applicable)
+
+### Testing Checklist
+<!-- Check all items that were verified -->
+#### System Functionality
+- [ ] System boots successfully
+- [ ] Network connectivity works
+- [ ] Services start correctly
+- [ ] No error messages in logs
+
+#### Desktop Environment (if applicable)
+- [ ] Desktop environment launches
+- [ ] Applications start correctly
+- [ ] Hardware acceleration works
+- [ ] Audio/video functional
+
+#### Virtualization (if applicable)
+- [ ] Incus containers work
+- [ ] Libvirt VMs functional
+- [ ] Podman containers operational
+- [ ] Network isolation correct
+
+#### Development Environment (if applicable)
+- [ ] Editors launch correctly
+- [ ] Language servers work
+- [ ] Build tools functional
+- [ ] Git configuration correct
+
+#### File Services (if applicable)
+- [ ] NFS mounts accessible
+- [ ] Samba shares working
+- [ ] Backup services operational
+- [ ] Storage pools healthy
+
+### Security Considerations
+<!-- Security implications of this change -->
+- [ ] No new attack vectors introduced
+- [ ] Secrets properly managed
+- [ ] Firewall rules reviewed
+- [ ] User permissions appropriate
+
+### Documentation
+<!-- Documentation changes -->
+- [ ] README.md updated (if needed)
+- [ ] Module documentation updated
+- [ ] plan.md updated (if needed)
+- [ ] Comments added to complex configurations
+
+### Rollback Plan
+<!-- Recovery procedure if issues occur -->
+- [ ] Previous configuration saved
+- [ ] ZFS snapshot created
+- [ ] Rollback procedure documented
+- [ ] Emergency access method available
+
+### Deployment Notes
+<!-- Special considerations for deployment -->
+- [ ] No special deployment steps required
+- [ ] Requires manual intervention: <!-- describe -->
+- [ ] Needs coordination with other changes
+- [ ] Breaking change requires communication
+
+### Related Issues
+<!-- Link any related issues -->
+Fixes #<!-- issue number -->
+Related to #<!-- issue number -->
+
+### Screenshots/Logs
+<!-- Add any relevant screenshots or log outputs -->
+
+### Final Checklist
+<!-- Verify before submitting -->
+- [ ] I have tested this change locally
+- [ ] I have updated documentation as needed
+- [ ] I have considered the impact on other machines
+- [ ] I have verified the rollback plan
+- [ ] I have checked for any secrets in the code
+- [ ] This change follows the repository's coding standards
+
+### Additional Context
+<!-- Add any other context about the PR here -->
+
+---
+
+**Reviewer Guidelines:**
+1. Verify all testing checkboxes are complete
+2. Review configuration changes for security implications
+3. Ensure rollback plan is realistic
+4. Check that documentation is updated
+5. Validate CI pipeline passes

215
README_new.md
Normal file
@@ -0,0 +1,215 @@
+# NixOS Home Lab Infrastructure
+
+[](https://nixos.org/)
+[](https://nixos.wiki/wiki/Flakes)
+[](LICENSE)
+
+Modular NixOS flake configuration for multi-machine home lab infrastructure. Features declarative system configuration, centralized user management, and scalable service deployment across development workstations and server infrastructure.
+
+## Quick Start
+
+```bash
+# Clone repository
+git clone <repository-url> Home-lab
+cd Home-lab
+
+# Validate configuration
+nix flake check
+
+# Test configuration (temporary, reverts on reboot)
+sudo nixos-rebuild test --flake .#<machine-name>
+
+# Apply configuration permanently
+sudo nixos-rebuild switch --flake .#<machine-name>
+```
+
+## Architecture Overview
+
+### Machine Types
+- **Development Workstation** - High-performance development environment with desktop environments
+- **File Server** - ZFS storage with NFS services and media management
+- **Application Server** - Containerized services (Git hosting, media server, web applications)
+- **Reverse Proxy** - External gateway with SSL termination and service routing
+
+### Technology Stack
+- **Base OS**: NixOS 25.05 with Nix Flakes
+- **Configuration**: Modular, declarative system configuration
+- **Virtualization**: Incus containers, Libvirt/QEMU VMs, Podman containers
+- **Desktop**: GNOME, Cosmic, Sway window managers
+- **Storage**: ZFS with snapshots, automated mounting, NFS network storage
+- **Network**: Tailscale mesh VPN with centralized hostname resolution
+
+## Project Structure
+
+Modular configuration organized for scalability and maintainability:
+
+```
+Home-lab/
+├── flake.nix                 # Main flake configuration
+├── flake.lock                # Dependency lock file
+├── machines/                 # Machine-specific configurations
+│   ├── workstation/          # Development machine config
+│   ├── file-server/          # NFS storage server
+│   ├── app-server/           # Containerized services
+│   └── reverse-proxy/        # External gateway
+├── modules/                  # Reusable NixOS modules
+│   ├── common/               # Base system configuration
+│   ├── desktop/              # Desktop environment modules
+│   ├── development/          # Development tools
+│   ├── services/             # Service configurations
+│   ├── users/                # User management
+│   └── virtualization/       # Container and VM setup
+├── packages/                 # Custom packages and tools
+└── research/                 # Documentation and analysis
+```
+
+## Configuration Philosophy
+
+### Modular Design
+- **Single Responsibility**: Each module handles one aspect of system configuration
+- **Composable**: Modules can be mixed and matched per machine requirements
+- **Testable**: Individual modules can be validated independently
+- **Documented**: Clear documentation for module purpose and configuration
+
+### User Management Strategy
+- **Role-based Users**: Separate users for desktop vs server administration
+- **Centralized Configuration**: Consistent user setup across all machines
+- **Security Focus**: SSH key management and privilege separation
+- **Literate Dotfiles**: Org-mode documentation for complex configurations
+
+### Network Architecture
+- **Mesh VPN**: Tailscale for secure inter-machine communication
+- **Service Discovery**: Centralized hostname resolution
+- **Firewall Management**: Service-specific port configuration
+- **External Access**: Reverse proxy with SSL termination
+
+## Development Workflow
+
+### Local Testing
+```bash
+# Validate configuration syntax
+nix flake check
+
+# Build without applying changes
+nix build .#nixosConfigurations.<machine>.config.system.build.toplevel
+
+# Test configuration (temporary)
+sudo nixos-rebuild test --flake .#<machine>
+
+# Apply configuration permanently
+sudo nixos-rebuild switch --flake .#<machine>
+```
+
+### Git Workflow
+1. **Feature Branch**: Create branch for configuration changes
+2. **Local Testing**: Validate changes with `nixos-rebuild test`
+3. **Pull Request**: Submit changes for review
+4. **Deploy**: Apply configuration to target machines
+
+### Remote Deployment
+- **SSH-based**: Remote deployment via secure shell
+- **Atomic Updates**: Complete success or automatic rollback
+- **Health Checks**: Service validation after deployment
+- **Centralized Management**: Single repository for all infrastructure
+
+## Service Architecture
+
+### Core Services
+- **Git Hosting**: Self-hosted Git with CI/CD capabilities
+- **Media Server**: Streaming with transcoding support
+- **File Storage**: NFS network storage with ZFS snapshots
+- **Web Gateway**: Reverse proxy with SSL and external access
+- **Container Platform**: Podman for containerized applications
+
+### Service Discovery
+- **Internal DNS**: Tailscale for mesh network resolution
+- **External DNS**: Public domain with SSL certificates
+- **Service Mesh**: Inter-service communication via secure network
+- **Load Balancing**: Traffic distribution and failover
+
+### Data Management
+- **ZFS Storage**: Copy-on-write filesystem with snapshots
+- **Network Shares**: NFS for cross-machine file access
+- **Backup Strategy**: Automated snapshots and external backup
+- **Data Integrity**: Checksums and redundancy
+
+## Security Model
+
+### Network Security
+- **VPN Mesh**: All inter-machine traffic via Tailscale
+- **Firewall Rules**: Service-specific port restrictions
+- **SSH Hardening**: Key-based authentication only
+- **Fail2ban**: Automated intrusion prevention
+
+### User Security
+- **Role Separation**: Administrative vs daily-use accounts
+- **Key Management**: Centralized SSH key distribution
+- **Privilege Escalation**: Sudo access only where needed
+- **Service Accounts**: Dedicated accounts for automated services
+
+### Infrastructure Security
+- **Configuration as Code**: All changes tracked in version control
+- **Atomic Deployments**: Rollback capability for failed changes
+- **Secret Management**: Encrypted secrets with controlled access
+- **Security Updates**: Regular dependency updates
+
+## Testing Strategy
+
+### Automated Testing
+- **Syntax Validation**: Nix flake syntax checking
+- **Build Testing**: Configuration build verification
+- **Module Testing**: Individual component validation
+- **Integration Testing**: Full system deployment tests
+
+### Manual Testing
+- **Boot Validation**: System startup verification
+- **Service Health**: Application functionality checks
+- **Network Connectivity**: Inter-service communication tests
+- **User Environment**: Desktop and development tool validation
+
+## Deployment Status
+
+### Infrastructure Maturity
+- ✅ **Multi-machine Configuration**: 4 machines deployed
+- ✅ **Service Integration**: Git hosting, media server, file storage
+- ✅ **Network Mesh**: Secure VPN with service discovery
+- ✅ **External Access**: Public services with SSL termination
+- ✅ **Centralized Management**: Single repository for all infrastructure
+
+### Current Capabilities
+- **Development Environment**: Full IDE setup with multiple desktop options
+- **File Services**: Network storage with 900GB+ media library
+- **Git Hosting**: Self-hosted with external access
+- **Media Streaming**: Movie and TV series streaming with transcoding
+- **Container Platform**: Podman-based containerized services
+
+## Documentation
+
+- **[Migration Plan](plan.md)**: Detailed implementation roadmap
+- **[Development Workflow](DEVELOPMENT_WORKFLOW.md)**: Contribution guidelines
+- **[Branching Strategy](BRANCHING_STRATEGY.md)**: Git workflow and conventions
+- **[AI Instructions](instruction.md)**: Agent guidance for system management
+
+## Contributing
+
+### Getting Started
+1. Fork the repository
+2. Create feature branch
+3. Test changes locally with `nixos-rebuild test`
+4. Submit pull request with detailed description
+5. Respond to review feedback
+6. Deploy after approval
+
+### Module Development
+- **Focused Scope**: One responsibility per module
+- **Configuration Options**: Parameterize for flexibility
+- **Documentation**: Explain purpose and usage
+- **Examples**: Provide usage examples
+
+## License
+
+MIT License - see [LICENSE](LICENSE) for details.
+
+---
+
+*Infrastructure designed for reliability, security, and maintainability.*

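
The project structure in README_new.md, combined with its `nixos-rebuild ... --flake .#<machine>` commands, implies one `nixosConfigurations` entry per directory under `machines/`. A minimal sketch of what such a `flake.nix` could look like follows. It is not the repository's actual flake: the `mkMachine` helper, the `configuration.nix` file name, the `x86_64-linux` system, and the presence of a `default.nix` in `modules/common/` are all assumptions made for illustration. Only the directory names and the NixOS 25.05 release come from the README.

```nix
{
  description = "Home lab (illustrative sketch, not the repository's flake.nix)";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.05";

  outputs = { self, nixpkgs, ... }:
    let
      # Hypothetical helper: build one NixOS system from machines/<dir>/.
      mkMachine = dir: nixpkgs.lib.nixosSystem {
        system = "x86_64-linux"; # assumed architecture
        modules = [
          ./modules/common # assumes modules/common/default.nix exists
          (./machines + "/${dir}/configuration.nix") # assumed file name
        ];
      };
    in {
      nixosConfigurations = {
        workstation = mkMachine "workstation";
        file-server = mkMachine "file-server";
        app-server = mkMachine "app-server";
        reverse-proxy = mkMachine "reverse-proxy";
      };
    };
}
```

With a flake shaped like this, `sudo nixos-rebuild test --flake .#reverse-proxy` would select the matching attribute by name.
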
@@ -6,7 +6,8 @@ This part of the document provides general instructions for the AI agent.
 ## General Instructions
 - Treat this as iterative collaboration between user and AI agent
 - **Context7 MCP is mandatory** for all technical documentation queries
-- Use casual but knowledgeable tone - hobby/passion project, not corporate
+- Use casual but knowledgeable tone - hobby/passion project, not corporate, no/little humor, be terse
+- Use K.I.S.S principles in both code and written language
 - Update documentation frequently as project evolves
 
 ## Language & Tool Preferences

@@ -2,7 +2,7 @@
 {
   services.forgejo = {
     enable = true;
-    #user = "git";
+    # Use the default 'forgejo' user, not 'git'
   };
 
   services.forgejo.settings = {
@@ -16,6 +16,9 @@
       ROOT_URL = "https://git.geokkjer.eu";
       SSH_DOMAIN = "git.geokkjer.eu";
       SSH_PORT = 1337;
+      # Disable built-in SSH server, use system SSH instead
+      DISABLE_SSH = false;
+      START_SSH_SERVER = false;
     };
     repository = {
       ENABLE_PUSH_CREATE_USER = true;

@@ -17,18 +17,13 @@
   # Hostname configuration
   networking.hostName = "reverse-proxy";
 
-  # DMZ-specific firewall configuration - very restrictive
+  # DMZ-specific firewall configuration - simplified for testing
   networking.firewall = {
     enable = true;
     # Allow HTTP/HTTPS from external network and Git SSH on port 1337
-    allowedTCPPorts = [ 80 443 1337 ];
+    # Temporarily allow SSH from everywhere - rely on fail2ban for protection
+    allowedTCPPorts = [ 22 80 443 1337 ];
     allowedUDPPorts = [ ];
-    # SSH only allowed from Tailscale network (100.64.0.0/10)
-    extraCommands = ''
-      # Allow SSH only from Tailscale network
-      iptables -A nixos-fw -p tcp --dport 22 -s 100.64.0.0/10 -j ACCEPT
-      iptables -A nixos-fw -p tcp --dport 22 -j DROP
-    '';
     # Explicitly block all other traffic
     rejectPackets = true;
   };
@@ -44,7 +39,7 @@
   # Tailscale for secure management access
   services.tailscale.enable = true;
 
-  # SSH configuration - ONLY accessible via Tailscale (DMZ security)
+  # SSH configuration - temporarily simplified for testing
   services.openssh = {
     enable = true;
     settings = {
@@ -56,8 +51,6 @@
       ClientAliveInterval = 300;
       ClientAliveCountMax = 2;
     };
-    # Let SSH listen on default port, firewall restricts to Tailscale interface
-    # This allows Tailscale to assign IP dynamically based on hostname
   };
 
   # nginx reverse proxy

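
The hunks above remove the iptables rules that limited port 22 to the Tailscale range and open it on every interface for testing. Once the Git-over-SSH checks are done, one possible way to restore Tailscale-only management access is an interface-scoped firewall rule instead of the removed `extraCommands`. This is a sketch under stated assumptions, not the author's planned change: it assumes the Tailscale interface keeps its default name `tailscale0`, and it swaps the source-range match (`100.64.0.0/10`) for an interface match.

```nix
{
  networking.firewall = {
    enable = true;
    # Public ports only: HTTP, HTTPS, and Git SSH on 1337
    allowedTCPPorts = [ 80 443 1337 ];
    allowedUDPPorts = [ ];
    # Port 22 is accepted only on the Tailscale interface
    # (assumes the default interface name "tailscale0")
    interfaces."tailscale0".allowedTCPPorts = [ 22 ];
    rejectPackets = true;
  };

  services.openssh = {
    enable = true;
    # Keep the OpenSSH module from opening port 22 on every interface
    openFirewall = false;
  };
}
```

Setting `openFirewall = false` matters here because the OpenSSH module otherwise adds its port to the global allow list, which would undo the interface restriction.
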
7
plan.md
@@ -793,6 +793,13 @@ deploy.nodes = {
 - [ ] Comprehensive blog post series documenting the full home lab journey
 - [ ] User guides for GNU Stow + literate Emacs configuration workflow
 - [ ] Deploy-rs migration guide and lessons learned
+- [ ] **SSH & Network Infrastructure Improvements**: Combined priority for related infrastructure upgrades
+- [ ] SSH connection testing with original ed25519 key (already approved in Forgejo)
+- [ ] Consider testing direct connection to forgejo@grey-area first to bypass proxy
+- [ ] SSH debugging and key management refinement
+- [ ] Migration from nginx streams to HAProxy for better SSH forwarding and load balancing
+- [ ] Gradual re-hardening of SSH security (Tailscale-only access) after Git verification
+- [ ] Deploy-rs migration for improved deployment automation and health checks
 - [ ] **Future Enhancements**
 - [ ] User ID consistency cleanup (sma user UID alignment across machines)
 - [ ] CI/CD integration with Forgejo for automated testing and deployment

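
The two SSH tests added to plan.md (through the proxy, then directly against `forgejo@grey-area`) can be written down as a small helper that prints the commands for review before anyone runs them. This is a rough sketch: the `build_ssh_test_cmd` function is hypothetical, and the default key path and port 22 for the direct connection are assumptions. The `forgejo` user, the `git.geokkjer.eu` domain, and port 1337 come from the Forgejo settings in this commit.

```bash
#!/usr/bin/env bash
# Sketch: print (not run) the ssh commands for the two connection tests.
# Assumptions, not taken from the commit: the default key path and
# port 22 for the direct connection to grey-area.

# build_ssh_test_cmd HOST PORT [KEY_PATH]
build_ssh_test_cmd() {
  local host="$1" port="$2"
  local key="${3:-$HOME/.ssh/id_ed25519}"
  # -T: no pseudo-terminal; IdentitiesOnly: offer only the named key
  printf 'ssh -T -p %s -i %s -o IdentitiesOnly=yes forgejo@%s\n' \
    "$port" "$key" "$host"
}

# Through the reverse proxy (SSH_DOMAIN and SSH_PORT from the Forgejo settings)
build_ssh_test_cmd git.geokkjer.eu 1337

# Directly against grey-area, bypassing the proxy
build_ssh_test_cmd grey-area 22
```

If the direct command authenticates but the proxied one does not, the problem likely sits in the port 1337 forwarding rather than in the key or the Forgejo account.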