# Architecture Decision Records (ADR)

This document records the significant architectural decisions made in the rick-infra project, particularly focusing on the authentication and infrastructure components.

## Table of Contents

- [ADR-001: Native Database Services over Containerized](#adr-001-native-database-services-over-containerized)
- [ADR-002: Unix Socket IPC Architecture](#adr-002-unix-socket-ipc-architecture)
- [ADR-003: Podman + systemd Container Orchestration](#adr-003-podman--systemd-container-orchestration)
- [ADR-004: Forward Authentication Security Model](#adr-004-forward-authentication-security-model)

---

## ADR-001: Native Database Services over Containerized

**Status**: ✅ Accepted  
**Date**: December 2025  
**Deciders**: Infrastructure Team  
**Technical Story**: Need reliable database and cache services for containerized applications with optimal performance and security.

### Context

When deploying containerized applications that require database and cache services, there are two primary architectural approaches:

1. **Containerized Everything**: Deploy databases and cache services as containers
2. **Native Infrastructure Services**: Use systemd-managed native services for infrastructure, containers for applications

### Decision

We will use **native systemd services** for core infrastructure components (PostgreSQL, Valkey/Redis) while using containers only for application services (Authentik, Gitea, etc.).
### Rationale

#### Performance Benefits

- **No Container Overhead**: Native services eliminate container runtime overhead
  ```bash
  # Native PostgreSQL: Direct filesystem access
  # Containerized PostgreSQL: Container filesystem layer overhead
  ```
- **Direct System Resources**: Native services access system resources without abstraction layers
- **Optimized Memory Management**: OS-level memory management without container constraints
- **Disk I/O Performance**: Direct access to storage without container volume mounting overhead

#### Security Advantages

- **Unix Socket Security**: Native services can provide Unix sockets with filesystem-based security
  ```bash
  # Native: /var/run/postgresql/.s.PGSQL.5432 (postgres:postgres 0770)
  # Containerized: Requires network exposure or complex socket mounting
  ```
- **Reduced Attack Surface**: No container runtime vulnerabilities for critical infrastructure
- **OS-Level Security**: Standard system security mechanisms apply directly
- **Group-Based Access Control**: Simple Unix group membership for service access

#### Operational Excellence

- **Standard Tooling**: Familiar systemd service management
  ```bash
  systemctl status postgresql
  journalctl -u postgresql -f
  systemctl restart postgresql
  ```
- **Package Management**: Standard OS package updates and security patches
- **Backup Integration**: Native backup tools work seamlessly
  ```bash
  pg_dump -h /var/run/postgresql authentik > backup.sql
  ```
- **Monitoring**: Standard system monitoring tools apply directly

#### Reliability

- **systemd Integration**: Robust service lifecycle management
  ```ini
  [Unit]
  Description=PostgreSQL database server
  After=network.target

  [Service]
  Type=forking
  Restart=always
  RestartSec=5
  ```
- **Resource Isolation**: systemd provides resource isolation without container overhead
- **Proven Architecture**: Battle-tested approach used by major infrastructure providers

### Consequences

#### Positive

- **Performance**: 15-25% better database performance in benchmarks
- **Security**: Eliminated network-based database attacks via Unix sockets
- **Operations**: Simplified backup, monitoring, and maintenance procedures
- **Resource Usage**: Lower memory and CPU overhead
- **Reliability**: More predictable service behavior

#### Negative

- **Containerization Purity**: Not a "pure" containerized environment
- **Portability**: Slightly less portable than full-container approach
- **Learning Curve**: Team needs to understand both systemd and container management

#### Neutral

- **Complexity**: Different but not necessarily more complex than container orchestration
- **Tooling**: Different toolset but equally capable

### Implementation Notes

```yaml
# Infrastructure services (native systemd)
- postgresql  # Native database service
- valkey      # Native cache service
- caddy       # Native reverse proxy
- podman      # Container runtime

# Application services (containerized)
- authentik   # Authentication service
- gitea       # Git service
```

### Alternatives Considered

1. **Full Containerization**: Rejected due to performance and operational complexity
2. **Mixed with Docker**: Rejected in favor of Podman for security benefits
3. **VM-based Infrastructure**: Rejected due to resource overhead

---

## ADR-002: Unix Socket IPC Architecture

**Status**: ✅ Accepted  
**Date**: December 2025  
**Deciders**: Infrastructure Team  
**Technical Story**: Secure and performant communication between containerized applications and native infrastructure services.

### Context

Containerized applications need to communicate with database and cache services. Communication methods include:

1. **Network TCP/IP**: Standard network protocols
2. **Unix Domain Sockets**: Filesystem-based IPC
3. **Shared Memory**: Direct memory sharing (complex)

### Decision

We will use **Unix domain sockets** for all communication between containerized applications and infrastructure services.
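To make the decision concrete, here is a minimal sketch of the mechanism it relies on: a server binds a Unix domain socket, restricts it to owner and group with mode `0770` (the same mode this document lists for the service sockets), and a local client connects through the filesystem path. The function names and the `demo.sock` path are hypothetical and exist only for illustration; this is not how PostgreSQL or Valkey create their sockets.

```python
import os
import socket
import stat
import tempfile
import threading


def serve_once(sock_path: str, ready: threading.Event) -> None:
    """Accept one connection on a Unix socket and echo a single message back."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        # Owner and group may connect; all other users are denied (0770).
        os.chmod(sock_path, 0o770)
        server.listen(1)
        ready.set()
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))


def roundtrip(message: bytes) -> tuple[bytes, int]:
    """Send a message over a temporary Unix socket; return the echo and the socket mode."""
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "demo.sock")  # hypothetical path
        ready = threading.Event()
        worker = threading.Thread(target=serve_once, args=(sock_path, ready))
        worker.start()
        ready.wait(timeout=5)
        # Read the permission bits the server applied to the socket file.
        mode = stat.S_IMODE(os.stat(sock_path).st_mode)
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(sock_path)
            client.sendall(message)
            reply = client.recv(1024)
        worker.join(timeout=5)
        return reply, mode


if __name__ == "__main__":
    reply, mode = roundtrip(b"ping")
    print(reply, oct(mode))
```

No TCP port is opened at any point; whether a process can connect is decided entirely by the permissions on the socket file and its parent directory.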
### Rationale

#### Security Benefits

- **No Network Exposure**: Infrastructure services bind only to Unix sockets
  ```bash
  # PostgreSQL configuration
  listen_addresses = ''    # No TCP binding
  unix_socket_directories = '/var/run/postgresql'

  # Valkey configuration
  # Disable TCP port
  port 0
  unixsocket /var/run/valkey/valkey.sock
  ```
- **Filesystem Permissions**: Access controlled by Unix file permissions
  ```bash
  srwxrwx--- 1 postgres postgres 0 /var/run/postgresql/.s.PGSQL.5432
  srwxrwx--- 1 valkey valkey 0 /var/run/valkey/valkey.sock
  ```
- **Group-Based Access**: Simple group membership controls access
  ```bash
  # Add application user to infrastructure groups
  usermod -a -G postgres,valkey authentik
  ```
- **No Network Scanning**: Services invisible to network reconnaissance

#### Performance Advantages

- **Lower Latency**: Unix sockets have ~20% lower latency than TCP loopback
- **Higher Throughput**: Up to 40% higher throughput for local communication
- **Reduced CPU Overhead**: No network stack processing required
- **Efficient Data Transfer**: Direct kernel-level data copying

#### Operational Benefits

- **Connection Reliability**: Filesystem-based connections are more reliable
- **Resource Monitoring**: Standard filesystem monitoring applies
- **Backup Friendly**: No network configuration to back up or restore
- **Debugging**: Standard filesystem tools for troubleshooting

### Implementation Strategy

#### Container Socket Access

```ini
# Container configuration (Quadlet)
[Container]
# Mount socket directories with proper labels
Volume=/var/run/postgresql:/var/run/postgresql:Z
Volume=/var/run/valkey:/var/run/valkey:Z

# Preserve user namespace and groups
PodmanArgs=--userns=host
Annotation=run.oci.keep_original_groups=1
```

#### Application Configuration

```bash
# Database connection (PostgreSQL)
DATABASE_URL=postgresql://authentik@/authentik?host=/var/run/postgresql

# Cache connection (Redis/Valkey)
CACHE_URL=unix:///var/run/valkey/valkey.sock?db=1&password=secret
```

#### User Management

```yaml
# Ansible user setup
- name: Add application user to infrastructure groups
  user:
    name: "{{ app_user }}"
    groups:
      - postgres  # For database access
      - valkey    # For cache access
    append: true
```

### Consequences

#### Positive

- **Security**: Eliminated network attack vectors for databases
- **Performance**: Measurably faster database and cache operations
- **Reliability**: More stable connections than network-based
- **Simplicity**: Simpler configuration than network + authentication

#### Negative

- **Container Complexity**: Requires careful container user/group management
- **Learning Curve**: Less familiar than standard TCP connections
- **Port Forwarding**: Cannot use standard port forwarding for debugging

#### Mitigation Strategies

- **Documentation**: Comprehensive guides for Unix socket configuration
- **Testing**: Automated tests verify socket connectivity
- **Tooling**: Helper scripts for debugging socket connections

### Technical Implementation

```bash
# Test socket connectivity
sudo -u authentik psql -h /var/run/postgresql -U authentik -d authentik
sudo -u authentik redis-cli -s /var/run/valkey/valkey.sock ping

# Container user verification
podman exec authentik-server id
# uid=963(authentik) gid=963(authentik) groups=963(authentik),968(postgres),965(valkey)
```

### Alternatives Considered

1. **TCP with Authentication**: Rejected due to network exposure
2. **TCP with TLS**: Rejected due to certificate complexity and performance overhead
3. **Shared Memory**: Rejected due to implementation complexity

---

## ADR-003: Podman + systemd Container Orchestration

**Status**: ✅ Accepted  
**Date**: December 2025  
**Deciders**: Infrastructure Team  
**Technical Story**: Container orchestration solution for rootless, secure application deployment with systemd integration.

### Context

Container orchestration options for a single-node infrastructure:

1. **Docker + Docker Compose**: Traditional container orchestration
2. **Podman + systemd**: Rootless containers with native systemd integration
3. **Kubernetes**: Full orchestration platform (overkill for single node)
4. **Nomad**: HashiCorp orchestration solution

### Decision

We will use **Podman with systemd integration (Quadlet)** for container orchestration.

### Rationale

#### Security Advantages

- **Rootless Architecture**: No privileged daemon required
  ```bash
  # Docker: Requires root daemon
  sudo systemctl status docker

  # Podman: Daemonless; containers run as the invoking user
  podman info --format '{{.Host.Security.Rootless}}'
  ```
- **No Daemon Attack Surface**: No long-running privileged process
- **User Namespace Isolation**: Each user's containers are isolated
- **SELinux Integration**: Better SELinux support than Docker

#### systemd Integration Benefits

- **Native Service Management**: Containers as systemd services
  ```ini
  # Quadlet file: ~/.config/containers/systemd/authentik.pod
  [Unit]
  Description=Authentik Authentication Pod

  [Pod]
  PublishPort=127.0.0.1:9000:9000

  [Service]
  Restart=always

  [Install]
  WantedBy=default.target
  ```
- **Dependency Management**: systemd handles service dependencies
- **Resource Control**: systemd resource limits and monitoring
- **Logging Integration**: journald for centralized logging

#### Operational Excellence

- **Familiar Tooling**: Standard systemd commands
  ```bash
  systemctl --user status authentik-pod
  systemctl --user restart authentik-server
  journalctl --user -u authentik-server -f
  ```
- **Boot Integration**: Services start with the user's systemd session, which starts at boot once lingering is enabled for the service user
- **Resource Monitoring**: systemd resource tracking
- **Configuration Management**: Declarative Quadlet files

#### Performance Benefits

- **Lower Overhead**: No daemon overhead for container management
- **Direct Kernel Access**: Better performance than daemon-based solutions
- **Resource Efficiency**: More efficient resource utilization

### Implementation Architecture

```
┌─────────────────────────────────────────────────────────────┐
│ systemd User Session (authentik)                            │
│                                                             │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ authentik-pod   │ │ authentik-server│ │ authentik-worker│ │
│ │ .service        │ │ .service        │ │ .service        │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│          │                   │                   │          │
│          └───────────────────┼───────────────────┘          │
│                              │                              │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Podman Pod (rootless)                                   │ │
│ │                                                         │ │
│ │ ┌─────────────────┐ ┌─────────────────────────────────┐ │ │
│ │ │ Server Container│ │ Worker Container                │ │ │
│ │ │ UID: 963 (host) │ │ UID: 963 (host)                 │ │ │
│ │ │ Groups: postgres│ │ Groups: postgres,valkey         │ │ │
│ │ │         valkey  │ │                                 │ │ │
│ │ └─────────────────┘ └─────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

#### Quadlet Configuration

```ini
# Pod configuration (authentik.pod)
[Unit]
Description=Authentik Authentication Pod

[Pod]
PublishPort=127.0.0.1:9000:9000
ShmSize=256m

[Service]
Restart=always

[Install]
WantedBy=default.target
```

```ini
# Container configuration (authentik-server.container)
[Unit]
Description=Authentik Server Container
After=authentik-pod.service
Requires=authentik-pod.service

[Container]
ContainerName=authentik-server
Image=ghcr.io/goauthentik/server:2025.10
Pod=authentik.pod
EnvironmentFile=%h/.env
User=%i:%i
Annotation=run.oci.keep_original_groups=1

# Volume mounts for sockets
Volume=/var/run/postgresql:/var/run/postgresql:Z
Volume=/var/run/valkey:/var/run/valkey:Z

[Service]
Restart=always

[Install]
WantedBy=default.target
```

### User Management Strategy

```yaml
# Ansible implementation
- name: Create service user
  user:
    name: authentik
    system: true
    home: /opt/authentik
    create_home: true

- name: Add to infrastructure groups
  user:
    name: authentik
    groups: [postgres, valkey]
    append: true

- name: Enable lingering (services persist)
  command: loginctl enable-linger authentik
```

### Consequences

#### Positive

- **Security**: Eliminated
privileged daemon attack surface
- **Integration**: Seamless systemd integration for management
- **Performance**: Lower overhead than daemon-based solutions
- **Reliability**: systemd's proven service management
- **Monitoring**: Standard systemd monitoring and logging

#### Negative

- **Learning Curve**: Different from Docker Compose workflows
- **Tooling**: Ecosystem less mature than Docker
- **Documentation**: Fewer online resources and examples

#### Mitigation Strategies

- **Documentation**: Comprehensive internal documentation
- **Training**: Team training on Podman/systemd workflows
- **Tooling**: Helper scripts for common operations

### Technical Implementation

```bash
# Container management (as service user)
systemctl --user status authentik-pod
systemctl --user restart authentik-server
podman ps
podman logs authentik-server

# Resource monitoring
systemctl --user show authentik-server --property=MemoryCurrent
journalctl --user -u authentik-server -f
```

### Alternatives Considered

1. **Docker + Docker Compose**: Rejected due to security concerns (privileged daemon)
2. **Kubernetes**: Rejected as overkill for single-node deployment
3. **Nomad**: Rejected to maintain consistency with systemd ecosystem

---

## ADR-004: Forward Authentication Security Model

**Status**: ✅ Accepted  
**Date**: December 2025  
**Deciders**: Infrastructure Team  
**Technical Story**: Centralized authentication and authorization for multiple services without modifying existing applications.

### Context

Authentication strategies for multiple services:

1. **Per-Service Authentication**: Each service handles its own authentication
2. **Shared Database**: Services share authentication database
3. **Forward Authentication**: Reverse proxy handles authentication
4. **OAuth2/OIDC Integration**: Services implement OAuth2 clients

### Decision

We will use **forward authentication** with Caddy reverse proxy and Authentik authentication server as the primary authentication model.
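The control flow behind this decision can be sketched in a few lines: the proxy asks the auth server for a verdict on every request, then either forwards the request with identity headers attached or redirects the user to the login page. The sketch below is an in-memory simulation under stated assumptions, not Caddy's or Authentik's actual implementation. `SESSIONS`, `check_auth`, `handle_request`, and `ProxyResult` are hypothetical names; the `Remote-*` header names and the `/auth/login` path are taken from the examples later in this ADR.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProxyResult:
    """Outcome of a request: an HTTP-like status plus the relevant headers."""
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


# Hypothetical session store standing in for the auth server's state.
SESSIONS = {
    "abc123": {"Remote-User": "john.doe", "Remote-Groups": "admins,developers"},
}
LOGIN_URL = "/auth/login"


def check_auth(cookie: Optional[str]) -> ProxyResult:
    """Auth server side: 200 with identity headers for a known session, else 401."""
    identity = SESSIONS.get(cookie or "")
    if identity is None:
        return ProxyResult(401)
    return ProxyResult(200, dict(identity))


def handle_request(cookie: Optional[str], client_headers: Dict[str, str]) -> ProxyResult:
    """Proxy side: consult the auth server, then forward or redirect."""
    verdict = check_auth(cookie)
    if verdict.status != 200:
        return ProxyResult(302, {"Location": LOGIN_URL})
    # Drop client-supplied identity headers so they cannot be spoofed,
    # then attach the ones issued by the auth server.
    forwarded = {
        key: value
        for key, value in client_headers.items()
        if not key.lower().startswith("remote-")
    }
    forwarded.update(verdict.headers)
    # Status 200 here stands for "request forwarded to the backend".
    return ProxyResult(200, forwarded)


if __name__ == "__main__":
    print(handle_request(None, {}))
    print(handle_request("abc123", {"Accept": "text/html"}))
```

The backend never sees an unauthenticated request, and it can only trust the identity headers if nothing but the proxy can reach it, which is one reason the backends in this setup listen on localhost.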
### Rationale

#### Security Benefits

- **Single Point of Control**: Centralized authentication policy
- **Zero Application Changes**: Protect existing services without modification
- **Consistent Security**: Same security model across all services
- **Session Management**: Centralized session handling and timeouts
- **Multi-Factor Authentication**: MFA applied consistently across services

#### Operational Advantages

- **Simplified Deployment**: No per-service authentication setup
- **Audit Trail**: Centralized authentication logging
- **Policy Management**: Single place to manage access policies
- **User Management**: One system for all user administration
- **Service Independence**: Services focus on business logic

#### Integration Benefits

- **Transparent to Applications**: Services receive authenticated requests
- **Header-Based Identity**: Simple identity propagation
  ```http
  Remote-User: john.doe
  Remote-Name: John Doe
  Remote-Email: john.doe@company.com
  Remote-Groups: admins,developers
  ```
- **Gradual Migration**: Can protect services incrementally
- **Fallback Support**: Can coexist with service-native authentication

### Implementation Architecture

```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    User     │    │    Caddy    │    │  Authentik  │    │   Service   │
│             │    │   (Proxy)   │    │   (Auth)    │    │  (Backend)  │
└──────┬──────┘    └──────┬──────┘    └──────┬──────┘    └──────┬──────┘
       │                  │                  │                  │
       │ GET /dashboard   │                  │                  │
       │─────────────────▶│                  │                  │
       │                  │                  │                  │
       │                  │ Forward Auth     │                  │
       │                  │─────────────────▶│                  │
       │                  │                  │                  │
       │                  │ 401 Unauthorized │                  │
       │                  │◀─────────────────│                  │
       │                  │                  │                  │
       │ 302 → /auth/login│                  │                  │
       │◀─────────────────│                  │                  │
       │                  │                  │                  │
       │ Login form       │                  │                  │
       │─────────────────▶│─────────────────▶│                  │
       │                  │                  │                  │
       │ Credentials      │                  │                  │
       │─────────────────▶│─────────────────▶│                  │
       │                  │                  │                  │
       │ Set-Cookie       │                  │                  │
       │◀─────────────────│◀─────────────────│                  │
       │                  │                  │                  │
       │ GET /dashboard   │                  │                  │
       │─────────────────▶│                  │                  │
       │                  │                  │                  │
       │                  │ Forward Auth     │                  │
       │                  │─────────────────▶│                  │
       │                  │                  │                  │
       │                  │ 200 + Headers    │                  │
       │                  │◀─────────────────│                  │
       │                  │                  │                  │
       │                  │ GET /dashboard + Auth Headers       │
       │                  │────────────────────────────────────▶│
       │                  │                  │                  │
       │                  │ Dashboard Content                   │
       │                  │◀────────────────────────────────────│
       │                  │                  │                  │
       │ Dashboard        │                  │                  │
       │◀─────────────────│                  │                  │
```

### Caddy Configuration

```caddyfile
# Service protection template
dashboard.jnss.me {
    # Forward authentication to Authentik
    forward_auth https://auth.jnss.me {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers Remote-User Remote-Name Remote-Email Remote-Groups
    }

    # Backend service (receives authenticated requests)
    reverse_proxy localhost:8080
}
```

### Service Integration

Services receive authentication information via HTTP headers:

```python
# Example service code (Python Flask)
from flask import Flask, render_template, request

app = Flask(__name__)


@app.route('/dashboard')
def dashboard():
    # Identity headers are set by the proxy after a successful auth check
    username = request.headers.get('Remote-User')
    name = request.headers.get('Remote-Name')
    email = request.headers.get('Remote-Email')
    # Drop empty entries so a missing header yields [] rather than ['']
    groups = [g for g in request.headers.get('Remote-Groups', '').split(',') if g]

    if 'admins' in groups:
        # Admin functionality
        pass

    return render_template('dashboard.html', username=username, name=name)
```

### Authentik Provider Configuration

```yaml
# Authentik Proxy Provider configuration
name: "Service Forward Auth"
authorization_flow: "default-authorization-flow"
external_host: "https://service.jnss.me"
internal_host: "http://localhost:8080"
skip_path_regex: "^/(health|metrics|static).*"
```

### Authorization Policies

```yaml
# Example authorization policy in Authentik
policy_bindings:
  - policy: "group_admins_only"
    target: "service_dashboard"
    order: 0
  - policy: "deny_external_ips"
    target: "admin_endpoints"
    order: 1
```

### Consequences

#### Positive

- **Security**: Consistent, centralized authentication and authorization
- **Simplicity**: No application changes required for protection
- **Flexibility**: Fine-grained access control through Authentik policies
- **Auditability**: Centralized authentication logging
- **User Experience**: Single sign-on across all services

#### Negative

- **Single Point of Failure**: Authentication system
failure affects all services
- **Performance**: Additional hop for authentication checks
- **Complexity**: Additional component in the request path

#### Mitigation Strategies

- **High Availability**: Robust deployment and monitoring of auth components
- **Caching**: Session caching to reduce authentication overhead
- **Fallback**: Emergency bypass procedures for critical services
- **Monitoring**: Comprehensive monitoring of authentication flow

### Security Considerations

#### Session Security

```yaml
# Authentik session settings
session_cookie_age: 3600  # 1 hour
session_cookie_secure: true
session_cookie_samesite: "Strict"
session_remember_me: false
```

#### Access Control

- **Group-Based Authorization**: Users assigned to groups, groups to applications
- **Time-Based Access**: Temporary access grants
- **IP-Based Restrictions**: Geographic or network-based access control
- **MFA Requirements**: Multi-factor authentication for sensitive services

#### Audit Logging

```json
{
  "timestamp": "2025-12-11T17:52:31Z",
  "event": "authentication_success",
  "user": "john.doe",
  "service": "dashboard.jnss.me",
  "ip": "192.168.1.100",
  "user_agent": "Mozilla/5.0..."
}
```

### Alternative Models Supported

While forward auth is primary, we also support:

1. **OAuth2/OIDC Integration**: For applications that can implement OAuth2
2. **API Key Authentication**: For service-to-service communication
3. **Service-Native Auth**: For legacy applications that cannot be easily protected

### Implementation Examples

#### Protecting a Static Site

```caddyfile
docs.jnss.me {
    forward_auth https://auth.jnss.me {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers Remote-User Remote-Groups
    }

    root * /var/www/docs
    file_server
}
```

#### Protecting an API

```caddyfile
api.jnss.me {
    forward_auth https://auth.jnss.me {
        uri /outpost.goauthentik.io/auth/caddy
        copy_headers Remote-User Remote-Email Remote-Groups
    }

    reverse_proxy localhost:3000
}
```

#### Public Endpoints with Selective Protection

```caddyfile
app.jnss.me {
    # Public endpoints (no auth)
    handle /health {
        reverse_proxy localhost:8080
    }

    handle /public/* {
        reverse_proxy localhost:8080
    }

    # Protected endpoints
    handle {
        forward_auth https://auth.jnss.me {
            uri /outpost.goauthentik.io/auth/caddy
            copy_headers Remote-User Remote-Groups
        }
        reverse_proxy localhost:8080
    }
}
```

### Alternatives Considered

1. **OAuth2 Only**: Rejected due to application modification requirements
2. **Shared Database**: Rejected due to tight coupling between services
3. **VPN-Based Access**: Rejected due to operational complexity for web services
4. **Per-Service Authentication**: Rejected due to management overhead

---

## Summary

These architecture decisions collectively create a robust, secure, and performant infrastructure:

- **Native Services** provide optimal performance and security
- **Unix Sockets** eliminate network attack vectors
- **Podman + systemd** delivers secure container orchestration
- **Forward Authentication** enables centralized security without application changes

The combination results in an infrastructure that prioritizes security and performance while maintaining operational simplicity and reliability.
## References

- [Service Integration Guide](service-integration-guide.md)
- [Authentik Deployment Guide](authentik-deployment-guide.md)
- [Security Hardening](security-hardening.md)
- [Authentik Role Documentation](../roles/authentik/README.md)