rick-infra/docs/deployment-guide.md
Joakim ecbeb07ba2 Migrate sigvild-gallery to production environment
- Add multi-environment architecture (homelab + production)
- Create production environment (mini-vps) for client projects
- Create homelab playbook for arch-vps services
- Create production playbook for mini-vps services
- Move sigvild-gallery from homelab to production
- Restructure variables: group_vars/production + host_vars/arch-vps
- Add backup-sigvild.yml playbook with auto-restore functionality
- Fix restore logic to check for data before creating directories
- Add manual variable loading workaround for Ansible 2.20
- Update all documentation for multi-environment setup
- Add ADR-007 documenting multi-environment architecture decision
2025-12-15 16:33:33 +01:00


# Deployment Guide
This guide explains how to deploy the rick-infra infrastructure, including core services, authentication, and applications.
## Overview
The rick-infra deployment system provides:
- **Native Infrastructure**: PostgreSQL, Valkey, Podman, Caddy managed by systemd
- **Authentication Services**: Authentik SSO with forward auth integration
- **Application Services**: Containerized services with Unix socket IPC
- **Security First**: All services deployed with security hardening by default
## Architecture Overview
```
┌───────────────────────────────────────────────────────────┐
│              rick-infra Infrastructure Stack              │
│                                                           │
│ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────┐ │
│ │ Applications    │ │ Authentication  │ │ Reverse Proxy │ │
│ │                 │ │                 │ │               │ │
│ │ Gitea (Git)     │ │ Authentik (SSO) │ │ Caddy (HTTPS) │ │
│ │ Gallery (Media) │ │ Forward Auth    │ │ Auto TLS      │ │
│ │ Custom Services │ │ OAuth2/OIDC     │ │ Load Balance  │ │
│ └─────────────────┘ └─────────────────┘ └───────────────┘ │
│         │                    │                    │       │
│         └────────────────────┼────────────────────┘       │
│                              │                            │
│ ┌───────────────────────────────────────────────────────┐ │
│ │       Infrastructure Services (Native systemd)        │ │
│ │                                                       │ │
│ │ ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐ │ │
│ │ │ PostgreSQL  │  │ Valkey      │  │ Podman          │ │ │
│ │ │ (Database)  │  │ (Cache)     │  │ (Containers)    │ │ │
│ │ │Unix Sockets │  │Unix Sockets │  │ Rootless        │ │ │
│ │ └─────────────┘  └─────────────┘  └─────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│                                                           │
│ ┌───────────────────────────────────────────────────────┐ │
│ │                  Security Foundation                  │ │
│ │     SSH Hardening • Firewall • Fail2ban • Updates     │ │
│ └───────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘
```
## Infrastructure Overview
Rick-infra now manages **two separate environments**:
### Homelab (arch-vps)
Personal services and experimentation platform at **jnss.me**:
- PostgreSQL, Valkey, Podman infrastructure
- Caddy reverse proxy with auto-HTTPS
- Nextcloud (cloud.jnss.me)
- Authentik SSO (auth.jnss.me)
- Gitea (git.jnss.me)
### Production (mini-vps)
Client projects requiring high uptime:
- Caddy reverse proxy with auto-HTTPS
- Sigvild Gallery (sigvild.no, api.sigvild.no)
## Available Deployments
### 1. `site.yml` - Deploy All Environments
Deploys both homelab and production infrastructure.
```bash
ansible-playbook site.yml --ask-vault-pass
```
### 2. Environment-Specific Deployments
```bash
# Deploy only homelab services
ansible-playbook playbooks/homelab.yml --ask-vault-pass
# Deploy only production services
ansible-playbook playbooks/production.yml --ask-vault-pass
# Or use site.yml with limits
ansible-playbook site.yml -l homelab --ask-vault-pass
ansible-playbook site.yml -l production --ask-vault-pass
```
### 3. Service-Specific Deployments
Deploy individual components using tags:
```bash
# Deploy only infrastructure services
ansible-playbook site.yml --tags postgresql,valkey,podman,caddy --ask-vault-pass
# Deploy only authentication
ansible-playbook site.yml --tags authentik --ask-vault-pass
# Deploy only applications
ansible-playbook site.yml --tags gitea,sigvild-gallery --ask-vault-pass
```
### 4. Security-Only Deployment
```bash
# Deploy only security hardening
ansible-playbook playbooks/security.yml --ask-vault-pass
```
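The environment-specific commands above can be wrapped in a small helper so that an environment name selects the playbook. This is a minimal sketch, not part of rick-infra: the function names `playbook_for_env` and `deploy_env` are hypothetical, and the sketch assumes it is run from the repository root.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of rick-infra): map an environment name
# to the playbook path used in this guide.
playbook_for_env() {
    case "$1" in
        homelab)    echo "playbooks/homelab.yml" ;;
        production) echo "playbooks/production.yml" ;;
        all)        echo "site.yml" ;;
        *)          echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Run the matching playbook. Extra arguments (for example --tags or
# --check) are passed through to ansible-playbook unchanged.
deploy_env() {
    local env="$1" playbook
    shift
    playbook="$(playbook_for_env "$env")" || return 1
    ansible-playbook "$playbook" --ask-vault-pass "$@"
}
```

With these functions sourced, `deploy_env production --tags sigvild-gallery` would run `playbooks/production.yml` limited to that tag.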
## Deployment Patterns
### First-Time Complete Deployment
⚠️ **Important**: First-time deployments include security hardening that may require a system reboot.
#### Prerequisites
1. **VPS Setup**: Fresh Arch Linux VPS with root access
2. **DNS Configuration**: Domain pointed to VPS IP address
3. **Vault Variables**: Required secrets configured (see [Vault Setup](#vault-variables))
4. **SSH Access**: Key-based authentication configured
#### Step-by-Step First Deployment
```bash
# 1. Test connectivity
ansible arch-vps -m ping
# 2. Deploy complete infrastructure stack
ansible-playbook -i inventory/hosts.yml site.yml --ask-vault-pass
# 3. Verify deployment
curl -I https://auth.jnss.me/
curl -I https://git.jnss.me/
```
**What happens during first deployment:**
1. **Security Hardening** (1-2 minutes)
   - System package updates
   - SSH configuration hardening
   - nftables firewall setup
   - fail2ban intrusion detection
   - Kernel security parameters
   - *May require reboot if kernel updated*
2. **Infrastructure Services** (2-3 minutes)
   - PostgreSQL database with Unix sockets
   - Valkey cache with Unix sockets
   - Podman rootless container setup
   - Caddy reverse proxy with auto-HTTPS
3. **Authentication Service** (3-5 minutes)
   - Authentik container deployment
   - Database and cache integration
   - Forward auth configuration
   - Admin user initialization
4. **Application Services** (2-4 minutes)
   - Gitea Git service
   - Gallery media service
   - Service-specific configurations
**Total deployment time**: 8-14 minutes for complete stack
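The verification step can be scripted so that a failed endpoint is reported explicitly. The following is a minimal sketch with hypothetical function names. It assumes that any 2xx or 3xx response counts as reachable, since a protected service may answer with a redirect to the login page.

```bash
#!/usr/bin/env bash
# Hypothetical post-deployment check (not part of rick-infra).

# Succeeds for 2xx and 3xx status codes. Treating redirects as healthy
# is an assumption made for this sketch.
is_healthy_status() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print only the HTTP status code of a URL. curl prints 000 when the
# connection itself fails.
http_status() {
    curl -s -o /dev/null -I -w '%{http_code}' --max-time 10 "$1"
}

# Check every URL given as an argument and report the result.
# Returns non-zero if any endpoint is unhealthy.
check_endpoints() {
    local url code failed=0
    for url in "$@"; do
        code="$(http_status "$url")" || true
        if is_healthy_status "$code"; then
            echo "OK   $code $url"
        else
            echo "FAIL $code $url"
            failed=1
        fi
    done
    return "$failed"
}
```

For the homelab endpoints used in this guide, the call would be `check_endpoints https://auth.jnss.me/ https://git.jnss.me/`.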
### Incremental Deployment
For subsequent deployments or updates:
```bash
# Deploy only changed components
ansible-playbook site.yml --tags authentik --ask-vault-pass
# Deploy with check mode first (dry run)
ansible-playbook site.yml --check --ask-vault-pass
# Deploy with verbose output for debugging
ansible-playbook site.yml -v --ask-vault-pass
```
## Authentication Deployment
### Authentik SSO Server
Rick-infra includes comprehensive authentication via Authentik. For detailed deployment instructions, see the [Authentik Deployment Guide](authentik-deployment-guide.md).
#### Quick Authentik Deployment
```bash
# Deploy authentik and all dependencies
ansible-playbook site.yml --tags authentik --ask-vault-pass
# Verify authentik deployment
curl -I https://auth.jnss.me/
ssh root@your-vps "systemctl --user -M authentik@ status authentik-pod"
```
#### Authentik Integration Pattern
All services in rick-infra use forward authentication through Caddy:
```caddyfile
# Example service with authentik protection
myservice.jnss.me {
    forward_auth https://auth.jnss.me {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers Remote-User Remote-Name Remote-Email Remote-Groups
    }
    reverse_proxy localhost:8080
}
```
#### Post-Deployment Authentik Setup
1. **Access Admin Interface**: `https://auth.jnss.me/if/admin/`
2. **Default Admin**: Email from `authentik_default_admin_email` variable
3. **Initial Configuration**: Follow [Authentik Deployment Guide](authentik-deployment-guide.md#post-deployment-configuration)
For complete authentik architecture details, see [Architecture Decisions](architecture-decisions.md#adr-004-forward-authentication-security-model).
## Configuration Management
### Variable Organization
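Rick-infra splits variables by scope: the production environment is configured through group variables, while the homelab host is configured through host variables: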
**Group Variables** (`group_vars/`):
- `production/main.yml` - Production environment configuration
- `production/vault.yml` - Production secrets (encrypted)
**Host Variables** (`host_vars/`):
- `arch-vps/main.yml` - Homelab configuration
- `arch-vps/vault.yml` - Homelab secrets (encrypted)
**Note:** Due to variable loading issues in Ansible 2.20, playbooks manually load variables using `include_vars`. This ensures reliable variable resolution during execution.
### Example: Homelab Configuration
Core infrastructure settings in `host_vars/arch-vps/main.yml`:
```yaml
# TLS Configuration
caddy_tls_enabled: true
caddy_domain: "jnss.me"
caddy_tls_email: "{{ vault_caddy_tls_email }}"
# DNS Challenge
caddy_dns_provider: "cloudflare"
cloudflare_api_token: "{{ vault_cloudflare_api_token }}"
# API Configuration
caddy_api_enabled: true
caddy_server_name: "main"
# Logging
caddy_log_level: "INFO"
caddy_log_format: "json"
caddy_systemd_security: true
```
### Vault Variables
Sensitive data in `host_vars/arch-vps/vault.yml` (encrypted):
```yaml
# Infrastructure secrets
vault_caddy_tls_email: "admin@jnss.me"
vault_cloudflare_api_token: "your-cloudflare-token"
# Database passwords
vault_postgresql_password: "secure-postgres-password"
vault_valkey_password: "secure-valkey-password"
# Authentik authentication secrets
vault_authentik_secret_key: "base64-encoded-secret-key"
vault_authentik_admin_password: "secure-admin-password"
vault_authentik_db_password: "authentik-database-password"
# Application secrets
vault_gitea_secret_key: "gitea-secret-key"
vault_gitea_jwt_secret: "gitea-jwt-secret"
```
#### Generate Vault Secrets
```bash
# Generate secure passwords and keys
openssl rand -base64 32 # For secret keys
openssl rand -base64 20 # For passwords
# Encrypt vault file
ansible-vault encrypt host_vars/arch-vps/vault.yml
# Edit vault file
ansible-vault edit host_vars/arch-vps/vault.yml
```
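Files encrypted by `ansible-vault` begin with a header line starting with `$ANSIBLE_VAULT;`, which makes it possible to check for plaintext vault files before committing. The sketch below uses hypothetical function names and assumes vault files are always named `vault.yml`, as in the layout described above.

```bash
#!/usr/bin/env bash
# Hypothetical guard against committing plaintext secrets
# (not part of rick-infra).

# Succeeds if the first line of the file carries the ansible-vault header.
is_vault_encrypted() {
    head -n 1 "$1" 2>/dev/null | grep -q '^\$ANSIBLE_VAULT;'
}

# Print every vault.yml below the given directories that is NOT encrypted.
# Empty output means all vault files found are encrypted.
find_plaintext_vaults() {
    find "$@" -type f -name 'vault.yml' 2>/dev/null | while IFS= read -r file; do
        is_vault_encrypted "$file" || echo "$file"
    done
}
```

Run from the repository root, `find_plaintext_vaults group_vars host_vars` should print nothing when every vault file is encrypted.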
## Monitoring and Verification
### Service Health Checks
```bash
# Infrastructure services
ssh root@your-vps "systemctl status postgresql valkey caddy podman"
# Authentication services
ssh root@your-vps "systemctl --user -M authentik@ status authentik-pod"
# Application services
ssh root@your-vps "systemctl --user -M gitea@ status gitea"
# Web service accessibility
curl -I https://auth.jnss.me/
curl -I https://git.jnss.me/
```
### Log Monitoring
```bash
# Infrastructure logs
ssh root@your-vps "journalctl -u caddy -f"
ssh root@your-vps "journalctl -u postgresql -f"
ssh root@your-vps "journalctl -u valkey -f"
# Authentication logs
ssh root@your-vps "journalctl --user -M authentik@ -u authentik-server -f"
# Application logs
ssh root@your-vps "journalctl --user -M gitea@ -u gitea -f"
```
### Performance Monitoring
```bash
# Container resource usage (one-shot snapshot instead of a live stream)
ssh root@your-vps "sudo -u authentik podman stats --no-stream"
ssh root@your-vps "sudo -u gitea podman stats --no-stream"
# Database performance
ssh root@your-vps "sudo -u postgres psql -h /var/run/postgresql -c 'SELECT * FROM pg_stat_activity;'"
# System resources (htop is interactive, so allocate a TTY with -t)
ssh -t root@your-vps "htop"
ssh root@your-vps "df -h"
ssh root@your-vps "free -h"
```
## Security Best Practices
### Deployment Security
- **Always use vault**: Never commit secrets to version control
- **Verify connectivity**: Test Ansible connection before deployment
- **Monitor logs**: Watch deployment logs for errors or warnings
- **Validate services**: Verify all services start correctly after deployment
- **Check certificates**: Ensure HTTPS certificates are issued correctly
### Operational Security
- **Regular updates**: Keep system packages and container images updated
- **Access monitoring**: Monitor authentication logs for suspicious activity
- **Backup procedures**: Regular backups of databases and configurations
- **Certificate monitoring**: Monitor TLS certificate expiration
- **Security scans**: Regular security assessments of deployed services
### Network Security
- **Firewall verification**: Ensure only ports 80/443 (plus the SSH port used for administration) are exposed externally
- **Unix socket usage**: Verify database/cache communications use Unix sockets
- **TLS encryption**: Confirm all external communications use HTTPS
- **Access logs**: Monitor Caddy access logs for unusual patterns
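One way to approach the firewall and Unix socket checks above is to list which TCP ports listen on a non-loopback address. The sketch below parses `ss -H -tln` output. The function name is hypothetical, and it assumes the local address is the fourth column, which is the layout `ss` prints when only TCP sockets are requested. A listening port is not necessarily reachable from outside, because the nftables rules may still drop the traffic, so treat the result as a first approximation.

```bash
#!/usr/bin/env bash
# Hypothetical audit helper (not part of rick-infra).

# Read `ss -H -tln` output on stdin and print the ports that listen on a
# non-loopback address, one per line, sorted numerically.
non_loopback_ports() {
    awk '{
        addr = $4                    # local address:port column
        port = addr
        sub(/.*:/, "", port)         # keep the text after the last colon
        host = substr(addr, 1, length(addr) - length(port) - 1)
        if (host !~ /^127\./ && host != "[::1]") print port
    }' | sort -n -u
}
```

A possible invocation is `ssh root@your-vps "ss -H -tln" | non_loopback_ports`. PostgreSQL and Valkey should not appear in the output if they are reachable only through Unix sockets.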
## Troubleshooting Common Issues
### Deployment Failures
#### Issue: Ansible vault decryption fails
```bash
# Solution: Verify vault password and file encryption
ansible-vault view host_vars/arch-vps/vault.yml
ansible-vault decrypt host_vars/arch-vps/vault.yml # Edit, then re-encrypt
ansible-vault encrypt host_vars/arch-vps/vault.yml
```
#### Issue: SSH connection failures
```bash
# Solution: Verify SSH configuration and keys
ssh-add -l # List SSH keys
ssh root@your-vps # Test direct connection
ansible arch-vps -m ping # Test Ansible connectivity
```
#### Issue: Container services fail to start
```bash
# Check container user sessions
ssh root@your-vps "loginctl list-users"
ssh root@your-vps "systemctl --user -M authentik@ status"
# Verify container images are available
ssh root@your-vps "sudo -u authentik podman images"
# Check container logs
ssh root@your-vps "sudo -u authentik podman logs authentik-server"
```
### Service Connectivity Issues
#### Issue: Database connection failures
```bash
# Verify PostgreSQL socket exists and is accessible
ssh root@your-vps "ls -la /var/run/postgresql/"
ssh root@your-vps "sudo -u authentik psql -h /var/run/postgresql -U authentik -c 'SELECT 1;'"
# Check PostgreSQL service status
ssh root@your-vps "systemctl status postgresql"
```
#### Issue: Authentication not working
```bash
# Check authentik service status
ssh root@your-vps "systemctl --user -M authentik@ status authentik-pod authentik-server"
# Test authentik HTTP endpoint
ssh root@your-vps "curl -I http://127.0.0.1:9000/"
# Check Caddy forward auth configuration
ssh root@your-vps "caddy validate --config /etc/caddy/Caddyfile"
```
#### Issue: HTTPS certificate problems
```bash
# Check certificate status
curl -vI https://auth.jnss.me/ 2>&1 | grep -E "(certificate|expire)"
# Check Caddy certificate management
ssh root@your-vps "journalctl -u caddy | grep -i cert"
# Verify DNS records
dig +short auth.jnss.me
```
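Certificate expiry can also be checked without reading verbose `curl` output, using the `-checkend` option of `openssl x509`. The sketch below uses hypothetical function names, and the 14-day default threshold is an assumption rather than a rick-infra setting.

```bash
#!/usr/bin/env bash
# Hypothetical certificate expiry check (not part of rick-infra).

# Convert a number of days to seconds, the unit -checkend expects.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Succeeds if the certificate served on host:443 remains valid for at
# least the given number of days (default 14, an assumed threshold).
cert_valid_for_days() {
    local host="$1" days="${2:-14}"
    openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
        | openssl x509 -noout -checkend "$(days_to_seconds "$days")" >/dev/null
}
```

For example, `cert_valid_for_days auth.jnss.me 14 || echo "certificate expires soon or could not be read"` reports both an upcoming expiry and a failed connection.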
### Performance Issues
#### Issue: Slow database queries
```bash
# Monitor active connections
ssh root@your-vps "sudo -u postgres psql -h /var/run/postgresql -c 'SELECT * FROM pg_stat_activity;'"
# Check database performance
ssh root@your-vps "sudo -u postgres psql -h /var/run/postgresql -c 'SELECT * FROM pg_stat_database;'"
```
#### Issue: High memory usage
```bash
# Check system memory
ssh root@your-vps "free -h"
# Check container memory usage (one-shot snapshot)
ssh root@your-vps "sudo -u authentik podman stats --no-stream"
# Check service memory usage and configured limit
ssh root@your-vps "systemctl --user -M authentik@ show authentik-server --property=MemoryCurrent,MemoryMax"
```
## Maintenance Procedures
### Regular Maintenance Tasks
#### System Updates
```bash
# Update system packages
ansible arch-vps -m pacman -a "update_cache=yes upgrade=yes" --become
# Update container images
ansible-playbook site.yml --tags containers,image-pull --ask-vault-pass
```
#### Certificate Renewal
```bash
# Certificates renew automatically, but monitor logs
ssh root@your-vps "journalctl -u caddy | grep -i 'certificate\|renewal'"
# Reload Caddy to re-apply its configuration (a reload alone does not force renewal of a still-valid certificate)
ssh root@your-vps "systemctl reload caddy"
```
#### Database Maintenance
```bash
# PostgreSQL maintenance
ssh root@your-vps "sudo -u postgres psql -h /var/run/postgresql -c 'VACUUM ANALYZE;'"
# Database backups
ssh root@your-vps "sudo -u postgres pg_dumpall -h /var/run/postgresql > /backup/postgres-backup-\$(date +%Y%m%d).sql"
```
### Backup Procedures
#### Configuration Backup
```bash
# Backup Ansible configurations (run from control machine)
tar -czf rick-infra-backup-$(date +%Y%m%d).tar.gz \
  inventory/ group_vars/ host_vars/ playbooks/ roles/ docs/ *.yml
# Backup vault files separately, keeping them encrypted
# (ansible-vault view would write the secrets to disk in plaintext)
cp host_vars/arch-vps/vault.yml secure-backup/vault-$(date +%Y%m%d).yml
```
#### Data Backup
```bash
# Database backup (\$ is escaped so the date is evaluated on the VPS, as in the pg_dumpall example)
ssh root@your-vps "sudo -u postgres pg_dump -h /var/run/postgresql authentik > /backup/authentik-\$(date +%Y%m%d).sql"
ssh root@your-vps "sudo -u postgres pg_dump -h /var/run/postgresql gitea > /backup/gitea-\$(date +%Y%m%d).sql"
# Application data backup
ssh root@your-vps "tar -czf /backup/authentik-media-\$(date +%Y%m%d).tar.gz -C /opt/authentik media"
ssh root@your-vps "tar -czf /backup/gitea-data-\$(date +%Y%m%d).tar.gz -C /opt/gitea data"
```
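Because the commands above write one dated file per run, the backup directory grows until old files are removed. The sketch below is one possible pruning step. The function name and the 14-day default retention are assumptions, not rick-infra settings, and it relies on GNU `find` as shipped with Arch Linux.

```bash
#!/usr/bin/env bash
# Hypothetical pruning helper (not part of rick-infra).

# Delete .sql and .tar.gz backups directly inside a directory that were
# last modified more than N days ago (default 14, an assumed retention).
# Each deleted path is printed.
prune_backups() {
    local dir="$1" days="${2:-14}"
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.sql' -o -name '*.tar.gz' \) \
        -mtime "+$days" -print -delete
}
```

On the VPS this would be called as `prune_backups /backup 14`. Replacing `-print -delete` with `-print` alone gives a dry run.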
## Documentation References
### Core Documentation
- **[Setup Guide](setup-guide.md)** - Initial VPS and Ansible setup
- **[Authentik Deployment Guide](authentik-deployment-guide.md)** - Detailed authentik deployment
- **[Architecture Decisions](architecture-decisions.md)** - Technical decision rationale
- **[Security Hardening](security-hardening.md)** - Security configuration details
### Service Integration
- **[Service Integration Guide](service-integration-guide.md)** - Adding new services
- **[Caddy Configuration](caddy-service-configuration.md)** - Reverse proxy patterns
### Role Documentation
- **[Authentik Role](../roles/authentik/README.md)** - Authentication service details
- **[PostgreSQL Role](../roles/postgresql/README.md)** - Database service details
- **[Valkey Role](../roles/valkey/README.md)** - Cache service details
- **[Caddy Role](../roles/caddy/README.md)** - Reverse proxy details
---
This guide covers deploying and maintaining the complete rick-infra stack, which is built on native database services and Unix socket IPC, with an emphasis on security and performance.