From 3506e5501620aa28a49167c1be10d5476abb9cd8 Mon Sep 17 00:00:00 2001 From: Joakim Date: Sun, 14 Dec 2025 16:56:50 +0100 Subject: [PATCH] Migrate to rootful container architecture with infrastructure fact pattern MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Major architectural change from rootless user services to system-level (rootful) containers to enable group-based Unix socket access for containerized applications. Infrastructure Changes: - PostgreSQL: Export postgres-clients group GID as Ansible fact - Valkey: Export valkey-clients group GID as Ansible fact - Valkey: Add socket-fix service to maintain correct socket group ownership - Both: Set socket directories to 770 with client group ownership Authentik Role Refactoring: - Remove rootless container configuration (subuid/subgid, lingering, user systemd) - Deploy Quadlet files to /etc/containers/systemd/ (system-level) - Use dynamic GID facts in container PodmanArgs (--group-add) - Simplify user creation to system user with infrastructure group membership - Update handlers for system scope service management - Remove unnecessary container security options (no user namespace isolation) Container Template Changes: - Pod: Remove --userns args, change WantedBy to multi-user.target - Containers: Replace Annotation with PodmanArgs using dynamic GIDs - Remove /dev/shm mounts and SecurityLabelDisable (not needed for rootful) - Change WantedBy to multi-user.target for system services Documentation Updates: - Add ADR-005: Rootful Containers with Infrastructure Fact Pattern - Update ADR-003: Podman + systemd for system-level deployment - Update authentik-deployment-guide.md for system scope commands - Update service-integration-guide.md with rootful pattern examples - Document discarded rootless approach and rationale Why Rootful Succeeds: - Direct UID/GID mapping preserves supplementary groups - Container process groups match host socket group ownership - No user namespace 
remapping breaking permissions Why Rootless Failed (Discarded): - User namespace UID/GID remapping broke group-based socket access - Supplementary groups remapped into subgid range didn't match socket ownership - Even with --userns=host and keep_original_groups, permissions failed Pattern Established: - Infrastructure roles create client groups and export GID facts - Application roles validate facts and consume in container templates - Rootful containers run as dedicated users with --group-add for socket access - System-level deployment provides standard systemd service management Deployment Validated: - Services in /system.slice/ ✓ - Process groups: 961 (valkey-clients), 962 (postgres-clients), 966 (authentik) ✓ - Socket permissions: 770 with client groups ✓ - HTTP endpoint responding ✓ --- docs/architecture-decisions.md | 356 ++++++++++++++++-- docs/authentik-deployment-guide.md | 77 ++-- docs/service-integration-guide.md | 114 ++++-- roles/authentik/defaults/main.yml | 17 +- roles/authentik/handlers/main.yml | 34 +- roles/authentik/tasks/cache.yml | 11 +- roles/authentik/tasks/database.yml | 11 +- roles/authentik/tasks/main.yml | 40 +- roles/authentik/tasks/user.yml | 33 +- .../templates/authentik-server.container | 9 +- .../templates/authentik-worker.container | 9 +- roles/authentik/templates/authentik.pod | 3 +- roles/postgresql/defaults/main.yml | 4 + roles/postgresql/tasks/main.yml | 26 +- roles/postgresql/templates/postgresql.conf.j2 | 5 +- roles/valkey/defaults/main.yml | 16 +- roles/valkey/tasks/main.yml | 77 ++-- .../templates/valkey-socket-fix.service.j2 | 15 + roles/valkey/templates/valkey.conf.j2 | 11 +- roles/valkey/templates/valkey.service.j2 | 3 + site.yml | 4 +- 21 files changed, 587 insertions(+), 288 deletions(-) create mode 100644 roles/valkey/templates/valkey-socket-fix.service.j2 diff --git a/docs/architecture-decisions.md b/docs/architecture-decisions.md index a019537..f6b0f81 100644 --- a/docs/architecture-decisions.md +++ 
b/docs/architecture-decisions.md @@ -8,6 +8,7 @@ This document records the significant architectural decisions made in the rick-i - [ADR-002: Unix Socket IPC Architecture](#adr-002-unix-socket-ipc-architecture) - [ADR-003: Podman + systemd Container Orchestration](#adr-003-podman--systemd-container-orchestration) - [ADR-004: Forward Authentication Security Model](#adr-004-forward-authentication-security-model) +- [ADR-005: Rootful Containers with Infrastructure Fact Pattern](#adr-005-rootful-containers-with-infrastructure-fact-pattern) --- @@ -270,8 +271,9 @@ podman exec authentik-server id **Status**: ✅ Accepted **Date**: December 2025 +**Updated**: December 2025 (System-level deployment pattern) **Deciders**: Infrastructure Team -**Technical Story**: Container orchestration solution for rootless, secure application deployment with systemd integration. +**Technical Story**: Container orchestration solution for secure application deployment with systemd integration. ### Context @@ -284,40 +286,42 @@ Container orchestration options for a single-node infrastructure: ### Decision -We will use **Podman with systemd integration (Quadlet)** for container orchestration. +We will use **Podman with systemd integration (Quadlet)** for container orchestration, deployed as system-level services (rootful containers running as dedicated users). 
### Rationale #### Security Advantages -- **Rootless Architecture**: No privileged daemon required +- **No Daemon Required**: No privileged daemon attack surface ```bash # Docker: Requires root daemon sudo systemctl status docker - # Podman: Rootless operation - systemctl --user status podman + # Podman: Daemonless operation + podman ps # No daemon needed ``` -- **No Daemon Attack Surface**: No long-running privileged process -- **User Namespace Isolation**: Each user's containers are isolated +- **Dedicated Service Users**: Containers run as dedicated system users (not root) +- **Group-Based Access Control**: Unix group membership controls infrastructure access - **SELinux Integration**: Better SELinux support than Docker #### systemd Integration Benefits -- **Native Service Management**: Containers as systemd services +- **Native Service Management**: Containers as system-level systemd services ```ini - # Quadlet file: ~/.config/containers/systemd/authentik.pod + # Quadlet file: /etc/containers/systemd/authentik.pod [Unit] Description=Authentik Authentication Pod [Pod] - PublishPort=127.0.0.1:9000:9000 + PublishPort=0.0.0.0:9000:9000 + ShmSize=256m [Service] Restart=always + TimeoutStartSec=900 [Install] - WantedBy=default.target + WantedBy=multi-user.target ``` - **Dependency Management**: systemd handles service dependencies - **Resource Control**: systemd resource limits and monitoring @@ -327,11 +331,11 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr - **Familiar Tooling**: Standard systemd commands ```bash - systemctl --user status authentik-pod - systemctl --user restart authentik-server - journalctl --user -u authentik-server -f + systemctl status authentik-pod + systemctl restart authentik-server + journalctl -u authentik-server -f ``` -- **Boot Integration**: Services start automatically with user sessions +- **Boot Integration**: Services start automatically at system boot - **Resource Monitoring**: systemd resource 
tracking - **Configuration Management**: Declarative Quadlet files @@ -345,7 +349,7 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr ``` ┌─────────────────────────────────────────────────────────────┐ -│ systemd User Session (authentik) │ +│ systemd System Services (/system.slice/) │ │ │ │ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────┐ │ │ │ authentik-pod │ │ authentik-server│ │authentik-worker│ │ @@ -355,15 +359,23 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr │ └────────────────────┼────────────────────┘ │ │ │ │ │ ┌─────────────────────────────────────────────────────────┐ │ -│ │ Podman Pod (rootless) │ │ +│ │ Podman Pod (rootful, dedicated user) │ │ │ │ │ │ │ │ ┌─────────────────┐ ┌─────────────────────────────────┐ │ │ │ │ │ Server Container│ │ Worker Container │ │ │ -│ │ │ UID: 963 (host) │ │ UID: 963 (host) │ │ │ -│ │ │ Groups: postgres│ │ Groups: postgres,valkey │ │ │ -│ │ │ valkey │ │ │ │ │ +│ │ │ User: 966:966 │ │ User: 966:966 │ │ │ +│ │ │ Groups: 961,962 │ │ Groups: 961,962 │ │ │ +│ │ │ (valkey,postgres)│ │ (valkey,postgres) │ │ │ │ │ └─────────────────┘ └─────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + │ Group-based access to infrastructure + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Infrastructure Services │ +│ PostgreSQL: /var/run/postgresql (postgres:postgres-clients)│ +│ Valkey: /var/run/valkey (valkey:valkey-clients) │ └─────────────────────────────────────────────────────────────┘ ``` @@ -396,19 +408,22 @@ Requires=authentik-pod.service ContainerName=authentik-server Image=ghcr.io/goauthentik/server:2025.10 Pod=authentik.pod -EnvironmentFile=%h/.env -User=%i:%i -Annotation=run.oci.keep_original_groups=1 +EnvironmentFile=/opt/authentik/.env +User=966:966 +PodmanArgs=--group-add 962 --group-add 961 -# Volume mounts 
for sockets +# Volume mounts for sockets and data +Volume=/opt/authentik/media:/media +Volume=/opt/authentik/data:/data Volume=/var/run/postgresql:/var/run/postgresql:Z Volume=/var/run/valkey:/var/run/valkey:Z [Service] Restart=always +TimeoutStartSec=300 [Install] -WantedBy=default.target +WantedBy=multi-user.target ``` ### User Management Strategy @@ -418,20 +433,17 @@ WantedBy=default.target - name: Create service user user: name: authentik + group: authentik + groups: [postgres-clients, valkey-clients] system: true + shell: /bin/bash home: /opt/authentik create_home: true - -- name: Add to infrastructure groups - user: - name: authentik - groups: [postgres, valkey] append: true - -- name: Enable lingering (services persist) - command: loginctl enable-linger authentik ``` +**Note**: Infrastructure roles (PostgreSQL, Valkey) export client group GIDs as Ansible facts (`postgresql_client_group_gid`, `valkey_client_group_gid`), which application container templates consume to build dynamic `--group-add` arguments.
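As an illustration of the lookup those fact-gathering tasks perform, the same `getent`/`cut` pipeline can be exercised against the universally present `root` group (used here only so the sketch runs anywhere; the real tasks query `postgres-clients` and `valkey-clients`):

```shell
# Resolve a group's GID exactly as the infrastructure fact tasks do:
#   getent group <name> | cut -d: -f3
gid="$(getent group root | cut -d: -f3)"
echo "root GID: ${gid}"   # → root GID: 0
```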
+ ### Consequences #### Positive @@ -457,15 +469,20 @@ WantedBy=default.target ### Technical Implementation ```bash -# Container management (as service user) -systemctl --user status authentik-pod -systemctl --user restart authentik-server +# Container management (system scope) +systemctl status authentik-pod +systemctl restart authentik-server podman ps podman logs authentik-server # Resource monitoring -systemctl --user show authentik-server --property=MemoryCurrent -journalctl --user -u authentik-server -f +systemctl show authentik-server --property=MemoryCurrent +journalctl -u authentik-server -f + +# Verify container groups +ps aux | grep authentik-server | head -1 | awk '{print $2}' | \ + xargs -I {} cat /proc/{}/status | grep Groups +# Output: Groups: 961 962 966 ``` ### Alternatives Considered @@ -763,6 +780,268 @@ app.jnss.me { --- +## ADR-005: Rootful Containers with Infrastructure Fact Pattern + +**Status**: ✅ Accepted +**Date**: December 2025 +**Deciders**: Infrastructure Team +**Technical Story**: Enable containerized applications to access native infrastructure services (PostgreSQL, Valkey) via Unix sockets with group-based permissions. + +### Context + +Containerized applications need to access infrastructure services (PostgreSQL, Valkey) through Unix sockets with filesystem-based permission controls. The permission model requires: + +1. **Socket directories** owned by service groups (`postgres-clients`, `valkey-clients`) +2. **Application users** added to these groups for access +3. **Container processes** must preserve group membership to access sockets + +Two approaches were evaluated: + +1. **Rootless containers (user namespace)**: Containers run in user namespace with UID/GID remapping +2. 
**Rootful containers (system services)**: Containers run as dedicated system users without namespace isolation + +### Decision + +We will use **rootful containers deployed as system-level systemd services** with an **Infrastructure Fact Pattern** where infrastructure roles export client group GIDs as Ansible facts for application consumption. + +### Rationale + +#### Why Rootful Succeeds + +**Direct UID/GID Mapping**: +```bash +# Host: authentik user UID 966, groups: 966 (authentik), 961 (valkey-clients), 962 (postgres-clients) +# Container User=966:966 with PodmanArgs=--group-add 961 --group-add 962 + +# Inside container: +id +# uid=966(authentik) gid=966(authentik) groups=966(authentik),961(valkey-clients),962(postgres-clients) + +# Socket access works: +ls -l /var/run/postgresql/.s.PGSQL.5432 +# srwxrwx--- 1 postgres postgres-clients 0 ... /var/run/postgresql/.s.PGSQL.5432 +``` + +**Group membership preserved**: Container process has GIDs 961 and 962, matching socket group ownership. + +#### Why Rootless Failed (Discarded Approach) + +**User Namespace UID/GID Remapping**: +```bash +# Host: authentik user UID 100000, subuid range 200000-265535 +# Container User=%i:%i with --userns=host --group-add=keep-groups + +# User namespace remaps: +# Host UID 100000 → Container UID 100000 (root in namespace) +# Host GID 961 → Container GID 200961 (remapped into subgid range) +# Host GID 962 → Container GID 200962 (remapped into subgid range) + +# Socket ownership on host: +# srwxrwx--- 1 postgres postgres-clients (GID 962) + +# Container process groups: 200961, 200962 (remapped) +# Socket expects: GID 962 (not remapped) +# Result: Permission denied ❌ +``` + +**Root cause**: User namespace supplementary group remapping breaks group-based socket access even with `--userns=host`, `--group-add=keep-groups`, and `Annotation=run.oci.keep_original_groups=1`. 
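The failure described above reduces to simple arithmetic. A sketch using the GIDs from the example (the flat offset is an illustration of the subgid mapping shown here, not the actual `newgidmap` implementation):

```shell
# Per the example mapping above, a host supplementary GID N appears
# inside the user namespace as subgid_start + N:
subgid_start=200000
host_gid=962                           # postgres-clients on the host
mapped=$((subgid_start + host_gid))
echo "GID inside namespace: ${mapped}"   # → GID inside namespace: 200962
echo "GID the socket expects: ${host_gid}"
# 200962 != 962, so the kernel's group permission check on the socket fails.
```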
+ +### Infrastructure Fact Pattern + +#### Infrastructure Roles Export GIDs + +Infrastructure services create client groups and export their GIDs as Ansible facts: + +```yaml +# PostgreSQL role: roles/postgresql/tasks/main.yml +- name: Create PostgreSQL client access group + group: + name: postgres-clients + system: true + +- name: Get PostgreSQL client group GID + shell: "getent group postgres-clients | cut -d: -f3" + register: postgresql_client_group_lookup + changed_when: false + +- name: Set PostgreSQL client group GID as fact + set_fact: + postgresql_client_group_gid: "{{ postgresql_client_group_lookup.stdout }}" +``` + +```yaml +# Valkey role: roles/valkey/tasks/main.yml +- name: Create Valkey client access group + group: + name: valkey-clients + system: true + +- name: Get Valkey client group GID + shell: "getent group valkey-clients | cut -d: -f3" + register: valkey_client_group_lookup + changed_when: false + +- name: Set Valkey client group GID as fact + set_fact: + valkey_client_group_gid: "{{ valkey_client_group_lookup.stdout }}" +``` + +#### Application Roles Consume Facts + +Application roles validate and consume infrastructure facts: + +```yaml +# Authentik role: roles/authentik/tasks/main.yml +- name: Validate infrastructure facts are available + assert: + that: + - postgresql_client_group_gid is defined + - valkey_client_group_gid is defined + fail_msg: | + Required infrastructure facts are not available. + Ensure PostgreSQL and Valkey roles have run first. 
+ +- name: Create authentik user with infrastructure groups + user: + name: authentik + groups: [postgres-clients, valkey-clients] + append: true +``` + +```ini +# Container template: roles/authentik/templates/authentik-server.container +[Container] +User={{ authentik_uid }}:{{ authentik_gid }} +PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }} +``` + +### Implementation Details + +#### System-Level Deployment + +```ini +# Quadlet files deployed to /etc/containers/systemd/ (not ~/.config/) +# Pod: /etc/containers/systemd/authentik.pod +[Unit] +Description=Authentik Authentication Pod + +[Pod] +PublishPort=0.0.0.0:9000:9000 +ShmSize=256m + +[Service] +Restart=always + +[Install] +# System target, not default.target (systemd does not allow trailing comments) +WantedBy=multi-user.target +``` + +```ini +# Container: /etc/containers/systemd/authentik-server.container +[Container] +User=966:966 +PodmanArgs=--group-add 962 --group-add 961 + +Volume=/var/run/postgresql:/var/run/postgresql:Z +Volume=/var/run/valkey:/var/run/valkey:Z +``` + +#### Service Management + +```bash +# System scope (not user scope) +systemctl status authentik-pod +systemctl restart authentik-server +journalctl -u authentik-server -f + +# Verify container location +systemctl status authentik-server | grep CGroup +# CGroup: /system.slice/authentik-server.service ✓ +``` + +### Special Case: Valkey Socket Group Fix + +Valkey doesn't natively support socket group configuration (unlike PostgreSQL's `unix_socket_group`). A helper service ensures correct socket permissions: + +```ini +# /etc/systemd/system/valkey-socket-fix.service +[Unit] +Description=Fix Valkey socket group ownership and permissions +BindsTo=valkey.service +After=valkey.service + +[Service] +Type=oneshot +ExecStart=/bin/sh -c 'i=0; while [ !
-S /var/run/valkey/valkey.sock ] && [ $i -lt 100 ]; do sleep 0.1; i=$((i+1)); done' +ExecStart=/bin/chgrp valkey-clients /var/run/valkey/valkey.sock +ExecStart=/bin/chmod 770 /var/run/valkey/valkey.sock +RemainAfterExit=yes + +[Install] +WantedBy=multi-user.target +``` + +Triggered by Valkey service: + +```ini +# /etc/systemd/system/valkey.service (excerpt) +[Unit] +Wants=valkey-socket-fix.service +``` + +### Consequences + +#### Positive + +- **Socket Access Works**: Group-based permissions function correctly +- **Security**: Containers run as dedicated users (not root), no privileged daemon +- **Portability**: Dynamic GID facts work across different hosts +- **Consistency**: Same pattern for all containerized applications +- **Simplicity**: No user namespace complexity, standard systemd service management + +#### Negative + +- **Not "Pure" Rootless**: Containers require root for systemd service deployment +- **Different from Docker**: Less familiar pattern than rootless user services + +#### Neutral + +- **System vs User Scope**: Different commands (`systemctl` vs `systemctl --user`) but equally capable +- **Deployment Location**: `/etc/containers/systemd/` vs `~/.config/` but same Quadlet functionality + +### Validation + +```bash +# Verify service location +systemctl status authentik-server | grep CGroup +# → /system.slice/authentik-server.service ✓ + +# Verify process groups +ps aux | grep authentik | head -1 | awk '{print $2}' | \ + xargs -I {} cat /proc/{}/status | grep Groups +# → Groups: 961 962 966 ✓ + +# Verify socket permissions +ls -l /var/run/postgresql/.s.PGSQL.5432 +# → srwxrwx--- postgres postgres-clients ✓ + +ls -l /var/run/valkey/valkey.sock +# → srwxrwx--- valkey valkey-clients ✓ + +# Verify HTTP endpoint +curl -I http://127.0.0.1:9000/ +# → HTTP/1.1 302 Found ✓ +``` + +### Alternatives Considered + +1. **Rootless with user namespace** - Discarded due to GID remapping breaking group-based socket access +2. 
**TCP-only connections** - Rejected to maintain Unix socket security and performance benefits +3. **Hardcoded GIDs** - Rejected for portability; facts provide dynamic resolution +4. **Directory permissions (777)** - Rejected for security; group-based access more restrictive + +--- + ## Summary These architecture decisions collectively create a robust, secure, and performant infrastructure: @@ -771,6 +1050,7 @@ These architecture decisions collectively create a robust, secure, and performan - **Unix Sockets** eliminate network attack vectors - **Podman + systemd** delivers secure container orchestration - **Forward Authentication** enables centralized security without application changes +- **Rootful Container Pattern** enables group-based socket access with infrastructure fact sharing The combination results in an infrastructure that prioritizes security and performance while maintaining operational simplicity and reliability. diff --git a/docs/authentik-deployment-guide.md b/docs/authentik-deployment-guide.md index 3d561ee..2f1b87e 100644 --- a/docs/authentik-deployment-guide.md +++ b/docs/authentik-deployment-guide.md @@ -8,18 +8,22 @@ This guide covers the complete deployment process for Authentik, a modern authen - **Native PostgreSQL** - High-performance database with Unix socket IPC - **Native Valkey** - Redis-compatible cache with Unix socket IPC -- **Rootless Podman** - Secure container orchestration via systemd/Quadlet +- **Podman Containers** - System-level container orchestration via systemd/Quadlet - **Caddy Reverse Proxy** - TLS termination and forward authentication ## Architecture Summary ``` -┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ -│ Internet │ │ Caddy Proxy │ │ Authentik Pod │ -│ │───▶│ auth.jnss.me │───▶│ │ -│ HTTPS/443 │ │ TLS + Forward │ │ Server + Worker │ -└─────────────────┘ │ Auth │ │ Containers │ - └─────────────────┘ └─────────────────┘ +┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────┐ +│ Internet │ 
│ Caddy Proxy │ │ Authentik (systemd) │ +│ │───▶│ auth.jnss.me │───▶│ /system.slice/ │ +│ HTTPS/443 │ │ TLS + Forward │ │ │ +└─────────────────┘ │ Auth │ │ ┌─────────────────────┐ │ + └─────────────────┘ │ │ Pod + Server/Worker │ │ + │ │ User: 966:966 │ │ + │ │ Groups: 961,962 │ │ + │ └─────────────────────┘ │ + └─────────────────────────┘ │ ┌─────────────────────────────────┼─────────────────┐ │ Host Infrastructure │ │ @@ -28,6 +32,7 @@ This guide covers the complete deployment process for Authentik, a modern authen │ │ PostgreSQL │ │ Valkey │ │ │ │ (Native) │ │ (Redis-compatible) │ │ │ │ Unix Socket │ │ Unix Socket │ │ + │ │ Group: 962 │ │ Group: 961 │ │ │ └─────────────┘ └─────────────────────────┘ │ └─────────────────────────────────────────────────────┘ ``` @@ -56,8 +61,8 @@ Expected output: All services should show `active` - Socket location: `/var/run/valkey/valkey.sock` 3. **Podman Container Runtime** - - Rootless container support configured - - systemd user session support enabled + - Container runtime installed + - systemd integration (Quadlet) configured 4. 
**Caddy Web Server** - TLS/SSL termination configured @@ -202,14 +207,18 @@ ssh root@your-vps "systemctl --user -M authentik@ status" After deployment completion: ```bash -# Check systemd user services for authentik user -ssh root@your-vps "systemctl --user -M authentik@ list-units 'authentik*'" +# Check systemd services (system scope) +ssh root@your-vps "systemctl list-units 'authentik*'" + +# Verify service location +ssh root@your-vps "systemctl status authentik-server | grep CGroup" +# Expected: /system.slice/authentik-server.service # Verify containers are running -ssh root@your-vps "sudo -u authentik podman ps" +ssh root@your-vps "podman ps" # Check pod status -ssh root@your-vps "sudo -u authentik podman pod ps" +ssh root@your-vps "podman pod ps" ``` ### Step 5: Health Check Verification @@ -329,7 +338,11 @@ Error: failed to connect to database: permission denied ```bash # Check authentik user group membership ssh root@your-vps "groups authentik" -# Should show: authentik postgres valkey +# Should show: authentik postgres-clients valkey-clients + +# Verify container process groups +ssh root@your-vps "ps aux | grep authentik-server | head -1 | awk '{print \$2}' | xargs -I {} cat /proc/{}/status | grep Groups" +# Should show: Groups: 961 962 966 (valkey-clients postgres-clients authentik) # Verify socket permissions ssh root@your-vps "ls -la /var/run/postgresql/ /var/run/valkey/" @@ -351,7 +364,7 @@ Error: bind: address already in use (port 9000) ssh root@your-vps "netstat -tulpn | grep 9000" # Stop conflicting services -ssh root@your-vps "systemctl --user -M authentik@ stop authentik-pod" +ssh root@your-vps "systemctl stop authentik-pod" # Restart with correct configuration ansible-playbook site.yml --tags authentik,containers --ask-vault-pass @@ -398,27 +411,29 @@ ansible-playbook site.yml --tags authentik,cache --ask-vault-pass ```bash # Check container logs -ssh root@your-vps "sudo -u authentik podman logs authentik-server" -ssh root@your-vps "sudo -u 
authentik podman logs authentik-worker" +ssh root@your-vps "podman logs authentik-server" +ssh root@your-vps "podman logs authentik-worker" # Inspect container configuration -ssh root@your-vps "sudo -u authentik podman inspect authentik-server" +ssh root@your-vps "podman inspect authentik-server" # Check container user/group mapping -ssh root@your-vps "sudo -u authentik podman exec authentik-server id" +ssh root@your-vps "podman exec authentik-server id" +# Expected: uid=966(authentik) gid=966(authentik) groups=966(authentik),961(valkey-clients),962(postgres-clients) ``` #### Service Status Verification ```bash # Check all authentik systemd services -ssh root@your-vps "systemctl --user -M authentik@ status authentik-pod authentik-server authentik-worker" +ssh root@your-vps "systemctl status authentik-pod authentik-server authentik-worker" # View service dependencies -ssh root@your-vps "systemctl --user -M authentik@ list-dependencies authentik-pod" +ssh root@your-vps "systemctl list-dependencies authentik-pod" -# Check user session status -ssh root@your-vps "loginctl show-user authentik" +# Verify services are in system.slice +ssh root@your-vps "systemctl status authentik-server | grep CGroup" +# Expected: /system.slice/authentik-server.service ``` #### Network Connectivity Testing @@ -440,12 +455,12 @@ curl -v https://auth.jnss.me/ ```bash # Authentik application logs -ssh root@your-vps "sudo -u authentik cat /opt/authentik/logs/server.log" -ssh root@your-vps "sudo -u authentik cat /opt/authentik/logs/worker.log" +ssh root@your-vps "cat /opt/authentik/logs/server.log" +ssh root@your-vps "cat /opt/authentik/logs/worker.log" # systemd service logs -ssh root@your-vps "journalctl --user -M authentik@ -u authentik-server -f" -ssh root@your-vps "journalctl --user -M authentik@ -u authentik-worker -f" +ssh root@your-vps "journalctl -u authentik-server -f" +ssh root@your-vps "journalctl -u authentik-worker -f" # Caddy logs for reverse proxy issues ssh root@your-vps 
"journalctl -u caddy -f" @@ -475,10 +490,10 @@ INFO authentik.core.cache: Connected to cache via unix socket ```bash # Monitor container resource usage -ssh root@your-vps "sudo -u authentik podman stats" +ssh root@your-vps "podman stats" # Monitor service memory usage -ssh root@your-vps "systemctl --user -M authentik@ status authentik-server | grep Memory" +ssh root@your-vps "systemctl status authentik-server | grep Memory" # Monitor database connections ssh root@your-vps "sudo -u postgres psql -h /var/run/postgresql -c 'SELECT * FROM pg_stat_activity;'" @@ -585,10 +600,10 @@ ansible-playbook site.yml --tags authentik,image-pull --ask-vault-pass ```bash # Emergency service restart -ssh root@your-vps "systemctl --user -M authentik@ restart authentik-pod" +ssh root@your-vps "systemctl restart authentik-pod" # Fallback: Direct container management -ssh root@your-vps "sudo -u authentik podman pod restart authentik" +ssh root@your-vps "podman pod restart authentik" # Last resort: Full service rebuild ansible-playbook site.yml --tags authentik --ask-vault-pass --limit arch-vps diff --git a/docs/service-integration-guide.md b/docs/service-integration-guide.md index 268e449..32924e7 100644 --- a/docs/service-integration-guide.md +++ b/docs/service-integration-guide.md @@ -6,20 +6,24 @@ This guide explains how to add new containerized services to rick-infra with Pos Rick-infra provides a standardized approach for containerized services to access infrastructure services through Unix sockets, maintaining security while providing optimal performance. +**Architecture**: Services are deployed as **system-level (rootful) containers** running as dedicated users with group-based access to infrastructure sockets. Infrastructure roles (PostgreSQL, Valkey) export client group GIDs as Ansible facts, which application roles consume for dynamic container configuration. 
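Since socket access hinges entirely on the container process's supplementary groups, the quickest sanity check mirrors the troubleshooting steps later in this guide: read the `Groups:` line from `/proc/<pid>/status` (shown here for the current shell, `$$`, so the sketch is self-contained):

```shell
# The 'Groups:' line lists a process's supplementary GIDs; for a correctly
# configured container process it should include the client-group GIDs
# (961 and 962 in this guide's examples).
grep '^Groups:' "/proc/$$/status"
```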
+ +**Note**: A previous rootless approach was evaluated but discarded due to user namespace UID/GID remapping breaking group-based socket permissions. See [ADR-005](architecture-decisions.md#adr-005-rootful-containers-with-infrastructure-fact-pattern) for details. + ## Architecture Pattern ``` ┌─────────────────────────────────────────────────────────────┐ -│ Application Service (Podman Container) │ +│ systemd System Service (/system.slice/) │ │ │ │ ┌─────────────────┐ │ │ │ Your Container │ │ -│ │ UID: service │ (host user namespace) │ -│ │ Groups: service,│ │ -│ │ postgres, │ (supplementary groups preserved) │ -│ │ valkey │ │ +│ │ User: UID:GID │ (dedicated system user) │ +│ │ Groups: GID, │ │ +│ │ 961,962 │ (postgres-clients, valkey-clients) │ │ └─────────────────┘ │ │ │ │ +│ │ PodmanArgs=--group-add 962 --group-add 961 │ │ └─────────────────────┐ │ └─────────────────────────────────│───────────────────────────┘ │ @@ -28,38 +32,42 @@ Rick-infra provides a standardized approach for containerized services to access │ │ │ PostgreSQL Unix Socket │ │ /var/run/postgresql/ │ + │ Owner: postgres:postgres- │ + │ clients (GID 962) │ │ │ │ Valkey Unix Socket │ - │ /var/run/valkey/ │ + │ /var/run/valkey/ │ + │ Owner: valkey:valkey-clients │ + │ (GID 961) │ └──────────────────────────────┘ ``` ## Prerequisites Your service must be deployed as: -1. **Systemd user service** (via Quadlet) -2. **Dedicated system user** -3. **Podman container** (rootless) +1. **System-level systemd service** (via Quadlet) +2. **Dedicated system user** with infrastructure group membership +3. 
**Podman container** (rootful, running as dedicated user)

## Step 1: User Setup

Create a dedicated system user for your service and add it to infrastructure groups:

```yaml
+- name: Create service group
+  group:
+    name: myservice
+    system: true
+
 - name: Create service user
   user:
     name: myservice
+    group: myservice
+    groups: [postgres-clients, valkey-clients]
     system: true
-    shell: /bin/false
+    shell: /bin/bash
     home: /opt/myservice
     create_home: true
-
-- name: Add service user to infrastructure groups
-  user:
-    name: myservice
-    groups:
-      - postgres  # For PostgreSQL access
-      - valkey    # For Valkey/Redis access
     append: true
```

@@ -72,33 +80,37 @@ Create a dedicated system user for your service and add it to infrastructure gro
 Description=My Service Pod

 [Pod]
-PublishPort=127.0.0.1:8080:8080
-PodmanArgs=--userns=host
+PublishPort=0.0.0.0:8080:8080
+ShmSize=256m

 [Service]
 Restart=always
 TimeoutStartSec=900

 [Install]
-WantedBy=default.target
+WantedBy=multi-user.target
 ```

 **Key Points**:
-- `--userns=host` preserves host user namespace
-- Standard port publishing for network access
+- No user namespace arguments needed (rootful containers)
+- `WantedBy=multi-user.target` for system-level services
+- `ShmSize` for shared memory if needed by application

 ### Container Configuration (`myservice.container`)

 ```ini
 [Unit]
 Description=My Service Container
+After=myservice-pod.service
+Requires=myservice-pod.service

 [Container]
+ContainerName=myservice
 Image=my-service:latest
 Pod=myservice.pod
 EnvironmentFile=/opt/myservice/.env
 User={{ service_uid }}:{{ service_gid }}
-Annotation=run.oci.keep_original_groups=1
+PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}

 # Volume mounts for sockets
 Volume=/var/run/postgresql:/var/run/postgresql:Z
@@ -112,15 +124,19 @@ Exec=my-service

 [Service]
 Restart=always
+TimeoutStartSec=300

 [Install]
-WantedBy=default.target
+WantedBy=multi-user.target
 ```

 **Key Points**:
-- `Annotation=run.oci.keep_original_groups=1` preserves supplementary groups
+- `PodmanArgs=--group-add` uses dynamic GID facts from infrastructure roles
 - Mount socket directories with `:Z` for SELinux relabeling
 - Use host UID/GID for the service user
+- `WantedBy=multi-user.target` for system-level services
+
+**Note**: The `postgresql_client_group_gid` and `valkey_client_group_gid` facts are exported by infrastructure roles and consumed in container templates.

 ## Step 3: Service Configuration

@@ -158,7 +174,25 @@
 REDIS_HOST=unix:///var/run/valkey/valkey.sock
 REDIS_DB=2
 ```

-## Step 4: Database Setup
+## Step 4: Infrastructure Fact Validation
+
+Before deploying containers, validate that infrastructure facts are available:
+
+```yaml
+- name: Validate infrastructure facts are available
+  assert:
+    that:
+      - postgresql_client_group_gid is defined
+      - valkey_client_group_gid is defined
+    fail_msg: |
+      Required infrastructure facts are not available.
+      Ensure PostgreSQL and Valkey roles have run and exported client group GIDs.
+  tags: [validation]
+```
+
+**Why this matters**: Container templates use these facts for `--group-add` arguments. If facts are missing, containers will deploy with incorrect group membership and socket access will fail.
+
+## Step 5: Database Setup

 Add database setup tasks to your role:

@@ -189,7 +223,7 @@ Add database setup tasks to your role:
   become_user: postgres
 ```

-## Step 5: Service Role Template
+## Step 6: Service Role Template

 Create an Ansible role using this pattern:

@@ -237,19 +271,22 @@
 If you get permission denied errors:

 1. **Check group membership**:
    ```bash
    groups myservice
-   # Should show: myservice postgres valkey
+   # Should show: myservice postgres-clients valkey-clients
    ```

-2. **Verify container annotations**:
+2. **Verify container process groups**:
    ```bash
-   podman inspect myservice --format='{{.Config.Annotations}}'
-   # Should include: run.oci.keep_original_groups=1
+   ps aux | grep myservice | head -1 | awk '{print $2}' | \
+     xargs -I {} cat /proc/{}/status | grep Groups
+   # Should show GIDs matching infrastructure client groups
    ```

 3. **Check socket permissions**:
    ```bash
    ls -la /var/run/postgresql/
+   # drwxrwx--- postgres postgres-clients
    ls -la /var/run/valkey/
+   # drwxrwx--- valkey valkey-clients
    ```

 ### Connection Issues

@@ -283,23 +320,28 @@ If you get permission denied errors:
 1. **Security**:
    - Use dedicated system users for each service
-   - Limit group memberships to required infrastructure
+   - Add users to infrastructure client groups (`postgres-clients`, `valkey-clients`)
    - Use vault variables for secrets
+   - Deploy Quadlet files to `/etc/containers/systemd/` (system-level)

 2. **Configuration**:
    - Use single URL format for Redis connections
-   - Mount socket directories with appropriate SELinux labels
-   - Include `run.oci.keep_original_groups=1` annotation
+   - Mount socket directories with appropriate SELinux labels (`:Z`)
+   - Use dynamic GID facts from infrastructure roles in `PodmanArgs=--group-add`
+   - Set `WantedBy=multi-user.target` in Quadlet files

 3. **Deployment**:
+   - Ensure infrastructure roles run first to export GID facts
+   - Validate facts are defined before container deployment
    - Test socket access before container deployment
    - Use proper dependency ordering in playbooks
    - Include database and cache setup tasks

 4. **Monitoring**:
-   - Monitor socket file permissions
+   - Monitor socket file permissions (should be 770 with client group)
    - Check service logs for connection errors
-   - Verify group memberships after user changes
+   - Verify container process has correct supplementary groups
+   - Verify services are in `/system.slice/`

 ## Authentication Integration with Authentik
diff --git a/roles/authentik/defaults/main.yml b/roles/authentik/defaults/main.yml
index 2c4c166..095a05f 100644
--- a/roles/authentik/defaults/main.yml
+++ b/roles/authentik/defaults/main.yml
@@ -82,24 +82,9 @@ authentik_default_admin_password: "{{ vault_authentik_admin_password }}"
 authentik_container_server_name: "authentik-server"
 authentik_container_worker_name: "authentik-worker"

-# Quadlet service directories (USER SCOPE)
-authentik_quadlet_dir: "{{ authentik_user_quadlet_dir }}"
-authentik_user_quadlet_dir: "{{ authentik_home }}/.config/containers/systemd"
-
 # User session variables (set dynamically during deployment)
 authentik_uid: ""

-# =================================================================
-# User Namespace Configuration
-# =================================================================
-
-# Subuid/subgid ranges for authentik user containers
-# Range: 200000-265535 (65536 IDs)
-authentik_subuid_start: 200000
-authentik_subuid_size: 65536
-authentik_subgid_start: 200000
-authentik_subgid_size: 65536
-
 # =================================================================
 # Caddy Integration
 # =================================================================
@@ -115,7 +100,9 @@ caddy_user: "caddy"
 # PostgreSQL socket configuration (managed by postgresql role)
 postgresql_unix_socket_directories: "/var/run/postgresql"
+postgresql_client_group: "postgres-clients"

 # Valkey socket configuration (managed by valkey role)
 valkey_unix_socket_path: "/var/run/valkey/valkey.sock"
 valkey_password: "{{ vault_valkey_password }}"
+valkey_client_group: "valkey-clients"
diff --git a/roles/authentik/handlers/main.yml b/roles/authentik/handlers/main.yml
index a424c1e..acc8e33 100644
--- a/roles/authentik/handlers/main.yml
+++ b/roles/authentik/handlers/main.yml
@@ -1,14 +1,9 @@
 ---
-# Authentik Service Handlers (User Scope)
+# Authentik Service Handlers (System Scope)

-- name: reload systemd user
+- name: reload systemd
   systemd:
     daemon_reload: true
-    scope: user
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"

 - name: reload caddy
   systemd:
@@ -19,44 +14,24 @@
   systemd:
     name: "authentik-pod"
     state: restarted
-    scope: user
     daemon_reload: true
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"

 - name: restart authentik server
   systemd:
     name: "{{ authentik_container_server_name }}"
     state: restarted
-    scope: user
     daemon_reload: true
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"

 - name: restart authentik worker
   systemd:
     name: "{{ authentik_container_worker_name }}"
     state: restarted
-    scope: user
     daemon_reload: true
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"

 - name: stop authentik services
   systemd:
     name: "{{ item }}"
     state: stopped
-    scope: user
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
   loop:
     - "{{ authentik_container_worker_name }}"
     - "{{ authentik_container_server_name }}"
@@ -66,12 +41,7 @@
   systemd:
     name: "{{ item }}"
     state: started
-    scope: user
     daemon_reload: true
-  become: true
-  become_user: "{{ authentik_user }}"
-  environment:
-    XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
   loop:
     - "authentik-pod"
     - "{{ authentik_container_server_name }}"
diff --git a/roles/authentik/tasks/cache.yml b/roles/authentik/tasks/cache.yml
index c3b4542..37d2d33 100644
--- a/roles/authentik/tasks/cache.yml
+++ b/roles/authentik/tasks/cache.yml
@@ -1,19 +1,12 @@
 ---
 # Cache setup for Authentik - Self-contained socket permissions

-- name: Add authentik user to valkey group for socket access
+- name: Add authentik user to Valkey client group for socket access
   user:
     name: "{{ authentik_user }}"
-    groups: valkey
+    groups: "{{ valkey_client_group }}"
     append: true

-- name: Ensure authentik can access Valkey socket directory
-  file:
-    path: "{{ valkey_unix_socket_path | dirname }}"
-    mode: '0770'
-    group: valkey
-  become: true
-
 - name: Test Valkey socket connectivity
   command: >
     redis-cli -s {{ valkey_unix_socket_path }}
diff --git a/roles/authentik/tasks/database.yml b/roles/authentik/tasks/database.yml
index 99b1ef5..6a1f483 100644
--- a/roles/authentik/tasks/database.yml
+++ b/roles/authentik/tasks/database.yml
@@ -1,19 +1,12 @@
 ---
 # Database setup for Authentik - Self-contained socket permissions

-- name: Add authentik user to postgres group for socket access
+- name: Add authentik user to PostgreSQL client group for socket access
   user:
     name: "{{ authentik_user }}"
-    groups: postgres
+    groups: "{{ postgresql_client_group }}"
     append: true

-- name: Ensure authentik can access PostgreSQL socket directory
-  file:
-    path: "{{ postgresql_unix_socket_directories }}"
-    mode: '0770'
-    group: postgres
-  become: true
-
 - name: Test PostgreSQL socket connectivity
   postgresql_ping:
     login_unix_socket: "{{ postgresql_unix_socket_directories }}"
diff --git a/roles/authentik/tasks/main.yml b/roles/authentik/tasks/main.yml
index 97ca8c5..29a83e3 100644
--- a/roles/authentik/tasks/main.yml
+++ b/roles/authentik/tasks/main.yml
@@ -2,6 +2,16 @@
 # Authentik Authentication Role - Main Tasks
 # Self-contained deployment with Podman and Unix sockets

+- name: Validate infrastructure facts are available
+  assert:
+    that:
+      - postgresql_client_group_gid is defined
+      - valkey_client_group_gid is defined
+    fail_msg: |
+      Required infrastructure facts are not available.
+      Ensure PostgreSQL and Valkey roles have run and exported client group GIDs.
+  tags: [validation]
+
 - name: Setup authentik user and container namespaces
   include_tasks: user.yml
   tags: [user, setup]
@@ -18,8 +28,6 @@
   containers.podman.podman_image:
     name: "{{ authentik_image }}:{{ authentik_version }}"
     state: present
-  become: true
-  become_user: "{{ authentik_user }}"
   tags: [containers, image-pull]

 - name: Create media directory structure
@@ -48,29 +56,23 @@
     - restart authentik worker
   tags: [config]

-- name: Create Quadlet systemd directory (user scope)
+- name: Create Quadlet systemd directory (system scope)
   file:
-    path: "{{ authentik_quadlet_dir }}"
+    path: /etc/containers/systemd
     state: directory
-    owner: "{{ authentik_user }}"
-    group: "{{ authentik_group }}"
     mode: '0755'

-- name: Deploy Quadlet pod and container files (user scope)
+- name: Deploy Quadlet pod and container files (system scope)
   template:
     src: "{{ item.src }}"
-    dest: "{{ authentik_quadlet_dir }}/{{ item.dest }}"
-    owner: "{{ authentik_user }}"
-    group: "{{ authentik_group }}"
+    dest: "/etc/containers/systemd/{{ item.dest }}"
     mode: '0644'
   loop:
     - { src: 'authentik.pod', dest: 'authentik.pod' }
     - { src: 'authentik-server.container', dest: 'authentik-server.container' }
     - { src: 'authentik-worker.container', dest: 'authentik-worker.container' }
-  become: true
-  become_user: "{{ authentik_user }}"
   notify:
-    - reload systemd user
+    - reload systemd
     - restart authentik pod
     - restart authentik server
     - restart authentik worker
@@ -108,22 +110,12 @@
     timeout: 30
   when: valkey_unix_socket_enabled

-- name: Ensure systemd user session is started
-  systemd:
-    name: "user@{{ authentik_uid }}.service"
-    state: started
-    scope: system
-  register: user_session_start
-
-- name: Enable and start Authentik pod (user scope)
+- name: Enable and start Authentik pod (system scope)
   systemd:
     name: "authentik-pod"
     enabled: "{{ authentik_service_enabled }}"
     state: "{{ authentik_service_state }}"
-    scope: user
     daemon_reload: true
-  become: true
-  become_user: "{{ authentik_user }}"
   tags: [containers, service]

 - name: Wait for Authentik to be ready
diff --git a/roles/authentik/tasks/user.yml b/roles/authentik/tasks/user.yml
index 62cb0b3..8b0df05 100644
--- a/roles/authentik/tasks/user.yml
+++ b/roles/authentik/tasks/user.yml
@@ -10,25 +10,13 @@
   user:
     name: "{{ authentik_user }}"
     group: "{{ authentik_group }}"
+    groups: "{{ [postgresql_client_group, valkey_client_group] }}"
     system: true
     shell: /bin/bash
     home: "{{ authentik_home }}"
     create_home: true
     comment: "Authentik authentication service"
-
-- name: Set up subuid for authentik user
-  lineinfile:
-    path: /etc/subuid
-    line: "{{ authentik_user }}:{{ authentik_subuid_start }}:{{ authentik_subuid_size }}"
-    create: true
-    mode: '0644'
-
-- name: Set up subgid for authentik user
-  lineinfile:
-    path: /etc/subgid
-    line: "{{ authentik_user }}:{{ authentik_subgid_start }}:{{ authentik_subgid_size }}"
-    create: true
-    mode: '0644'
+    append: true

 - name: Create authentik directories
   file:
@@ -39,27 +27,10 @@
     mode: '0755'
   loop:
     - "{{ authentik_home }}"
-    - "{{ authentik_home }}/.config"
-    - "{{ authentik_home }}/.config/systemd"
-    - "{{ authentik_home }}/.config/systemd/user"
-    - "{{ authentik_home }}/.config/containers"
-    - "{{ authentik_home }}/.config/containers/systemd"
     - "{{ authentik_home }}/data"
     - "{{ authentik_home }}/media"
     - "{{ authentik_home }}/logs"

-- name: Enable lingering for authentik user
-  command: loginctl enable-linger {{ authentik_user }}
-  args:
-    creates: "/var/lib/systemd/linger/{{ authentik_user }}"
-
-- name: Initialize user systemd for authentik
-  systemd:
-    daemon_reload: true
-    scope: user
-  become: true
-  become_user: "{{ authentik_user }}"
-
 - name: Get authentik user UID and GID for container configuration
   shell: |
     echo "uid=$(id -u {{ authentik_user }})"
diff --git a/roles/authentik/templates/authentik-server.container b/roles/authentik/templates/authentik-server.container
index 1cdbef5..8d73f93 100644
--- a/roles/authentik/templates/authentik-server.container
+++ b/roles/authentik/templates/authentik-server.container
@@ -9,12 +9,7 @@
 Image={{ authentik_image }}:{{ authentik_version }}
 Pod=authentik.pod
 EnvironmentFile={{ authentik_home }}/.env
 User={{ authentik_uid }}:{{ authentik_gid }}
-Annotation=run.oci.keep_original_groups=1
-
-# Security configuration for shared memory and IPC
-Volume=/dev/shm:/dev/shm:rw
-SecurityLabelDisable=true
-AddCapability=IPC_OWNER
+PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}

 # Logging configuration
 LogDriver=k8s-file
@@ -34,4 +29,4 @@
 Restart=always
 TimeoutStartSec=300

 [Install]
-WantedBy=default.target
+WantedBy=multi-user.target
diff --git a/roles/authentik/templates/authentik-worker.container b/roles/authentik/templates/authentik-worker.container
index 87d2d3c..20d62c5 100644
--- a/roles/authentik/templates/authentik-worker.container
+++ b/roles/authentik/templates/authentik-worker.container
@@ -9,12 +9,7 @@
 Image={{ authentik_image }}:{{ authentik_version }}
 Pod=authentik.pod
 EnvironmentFile={{ authentik_home }}/.env
 User={{ authentik_uid }}:{{ authentik_gid }}
-Annotation=run.oci.keep_original_groups=1
-
-# Security configuration for shared memory and IPC
-Volume=/dev/shm:/dev/shm:rw
-SecurityLabelDisable=true
-AddCapability=IPC_OWNER
+PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}

 # Logging configuration
 LogDriver=k8s-file
@@ -34,4 +29,4 @@
 Restart=always
 TimeoutStartSec=300

 [Install]
-WantedBy=default.target
+WantedBy=multi-user.target
diff --git a/roles/authentik/templates/authentik.pod b/roles/authentik/templates/authentik.pod
index b3934dd..a4d7e79 100644
--- a/roles/authentik/templates/authentik.pod
+++ b/roles/authentik/templates/authentik.pod
@@ -4,11 +4,10 @@
 Description=Authentik Authentication Pod

 [Pod]
 PublishPort=0.0.0.0:{{ authentik_http_port }}:{{ authentik_http_port }}
 ShmSize=256m
-PodmanArgs=

 [Service]
 Restart=always
 TimeoutStartSec=900

 [Install]
-WantedBy=default.target
+WantedBy=multi-user.target
diff --git a/roles/postgresql/defaults/main.yml b/roles/postgresql/defaults/main.yml
index eaa1333..6e26579 100644
--- a/roles/postgresql/defaults/main.yml
+++ b/roles/postgresql/defaults/main.yml
@@ -22,6 +22,10 @@ postgresql_unix_socket_enabled: true
 postgresql_unix_socket_directories: "/var/run/postgresql"
 postgresql_unix_socket_permissions: "0770"

+# Group-Based Access Control
+postgresql_client_group: "postgres-clients"
+postgresql_client_group_create: true
+
 # Authentication
 postgresql_auth_method: "scram-sha-256"
diff --git a/roles/postgresql/tasks/main.yml b/roles/postgresql/tasks/main.yml
index eeab7c2..4741ee6 100644
--- a/roles/postgresql/tasks/main.yml
+++ b/roles/postgresql/tasks/main.yml
@@ -11,6 +11,19 @@
     name: python-psycopg2
     state: present

+- name: Create PostgreSQL client access group
+  group:
+    name: "{{ postgresql_client_group }}"
+    system: true
+  when: postgresql_client_group_create
+
+- name: Ensure postgres user is in client group
+  user:
+    name: postgres
+    groups: "{{ postgresql_client_group }}"
+    append: true
+  when: postgresql_client_group_create
+
 - name: Check if PostgreSQL data directory exists and is initialized
   stat:
     path: "/var/lib/postgres/data/PG_VERSION"
@@ -72,10 +85,21 @@
     path: "{{ postgresql_unix_socket_directories }}"
     state: directory
     owner: postgres
-    group: postgres
+    group: "{{ postgresql_client_group }}"
     mode: '0770'
   when: postgresql_unix_socket_enabled

+- name: Get PostgreSQL client group GID for containerized applications
+  shell: "getent group {{ postgresql_client_group }} | cut -d: -f3"
+  register: postgresql_client_group_lookup
+  changed_when: false
+  when: postgresql_client_group_create
+
+- name: Set PostgreSQL client group GID as fact
+  set_fact:
+    postgresql_client_group_gid: "{{ postgresql_client_group_lookup.stdout }}"
+  when: postgresql_client_group_create and postgresql_client_group_lookup.stdout is defined
+
 - name: Enable and start PostgreSQL service
   systemd:
     name: postgresql
diff --git a/roles/postgresql/templates/postgresql.conf.j2 b/roles/postgresql/templates/postgresql.conf.j2
index 0181d58..3fb1f16 100644
--- a/roles/postgresql/templates/postgresql.conf.j2
+++ b/roles/postgresql/templates/postgresql.conf.j2
@@ -7,14 +7,11 @@
 # Unix Socket Configuration
 unix_socket_directories = '{{ postgresql_unix_socket_directories }}'
 unix_socket_permissions = {{ postgresql_unix_socket_permissions }}
+unix_socket_group = '{{ postgresql_client_group }}'
 {% endif %}

 listen_addresses = '{{ postgresql_listen_addresses }}'
 port = {{ postgresql_port }}

-# Unix socket configuration
-unix_socket_directories = '{{ postgresql_unix_socket_directories | default("/run/postgresql") }}'
-unix_socket_permissions = {{ postgresql_unix_socket_permissions | default("0777") }}
-
 # Basic Performance (only override if needed)
 max_connections = {{ postgresql_max_connections }}
 shared_buffers = {{ postgresql_shared_buffers }}
diff --git a/roles/valkey/defaults/main.yml b/roles/valkey/defaults/main.yml
index b9e9f32..0428d2a 100644
--- a/roles/valkey/defaults/main.yml
+++ b/roles/valkey/defaults/main.yml
@@ -13,20 +13,20 @@
 valkey_service_enabled: true
 valkey_service_state: "started"

-# Network Security (Unix socket with localhost TCP for compatibility)
-valkey_bind: "127.0.0.1"        # Listen on localhost for apps that don't support Unix sockets
-valkey_port: 6379               # Keep TCP port for compatibility
-valkey_protected_mode: true     # Enable protection for TCP
-
-# Unix socket configuration (also enabled for better performance)
-valkey_unixsocket: "/run/valkey/valkey.sock"
-valkey_unixsocketperm: 777      # Allows container access
+# Network Security (Unix socket only - no TCP)
+valkey_bind: ""                 # Disable TCP, socket-only mode
+valkey_port: 0                  # Disable TCP port
+valkey_protected_mode: false    # Not needed for socket-only mode

 # Unix Socket Configuration
 valkey_unix_socket_enabled: true
 valkey_unix_socket_path: "/var/run/valkey/valkey.sock"
 valkey_unix_socket_perm: "770"

+# Group-Based Access Control
+valkey_client_group: "valkey-clients"
+valkey_client_group_create: true
+
 # Authentication
 valkey_password: "{{ vault_valkey_password }}"
diff --git a/roles/valkey/tasks/main.yml b/roles/valkey/tasks/main.yml
index 0978ead..f0eba18 100644
--- a/roles/valkey/tasks/main.yml
+++ b/roles/valkey/tasks/main.yml
@@ -9,6 +9,19 @@
 # Note: Arch Linux's redis package (which provides Valkey) creates the 'valkey' user automatically
 # We don't need to create users - just ensure data directory permissions

+- name: Create Valkey client access group
+  group:
+    name: "{{ valkey_client_group }}"
+    system: true
+  when: valkey_client_group_create
+
+- name: Ensure valkey user is in client group
+  user:
+    name: valkey
+    groups: "{{ valkey_client_group }}"
+    append: true
+  when: valkey_client_group_create
+
 - name: Create Valkey configuration directory
   file:
     path: /etc/valkey
@@ -33,17 +46,8 @@
     path: "{{ valkey_unix_socket_path | dirname }}"
     state: directory
     owner: valkey
-    group: valkey
-    mode: '0775'
-  when: valkey_unix_socket_enabled
-
-- name: Ensure socket directory is accessible
-  file:
-    path: "{{ valkey_unix_socket_path | dirname }}"
-    owner: valkey
-    group: valkey
-    mode: '0775'
-    recurse: yes
+    group: "{{ valkey_client_group }}"
+    mode: '0770'
   when: valkey_unix_socket_enabled

 - name: Deploy Valkey configuration file
@@ -56,6 +60,43 @@
     backup: yes
   notify: restart valkey

+- name: Deploy Valkey systemd service file (with socket group management)
+  template:
+    src: valkey.service.j2
+    dest: /etc/systemd/system/valkey.service
+    mode: '0644'
+  notify:
+    - reload systemd
+    - restart valkey
+  when: valkey_client_group_create
+
+- name: Deploy Valkey socket group fix service
+  template:
+    src: valkey-socket-fix.service.j2
+    dest: /etc/systemd/system/valkey-socket-fix.service
+    mode: '0644'
+  notify:
+    - reload systemd
+  when: valkey_client_group_create and valkey_unix_socket_enabled
+
+- name: Enable Valkey socket group fix service
+  systemd:
+    name: valkey-socket-fix
+    enabled: true
+    daemon_reload: true
+  when: valkey_client_group_create and valkey_unix_socket_enabled
+
+- name: Get Valkey client group GID for containerized applications
+  shell: "getent group {{ valkey_client_group }} | cut -d: -f3"
+  register: valkey_client_group_lookup
+  changed_when: false
+  when: valkey_client_group_create
+
+- name: Set Valkey client group GID as fact
+  set_fact:
+    valkey_client_group_gid: "{{ valkey_client_group_lookup.stdout }}"
+  when: valkey_client_group_create and valkey_client_group_lookup.stdout is defined
+
 - name: Enable and start Valkey service
   systemd:
     name: valkey
@@ -64,13 +105,6 @@
     daemon_reload: true
   register: valkey_service_result

-- name: Wait for Valkey to be ready (TCP)
-  wait_for:
-    port: "{{ valkey_port }}"
-    host: "{{ valkey_bind }}"
-    timeout: 30
-  when: valkey_service_state == "started" and not valkey_unix_socket_enabled
-
 - name: Wait for Valkey socket file to exist
   wait_for:
     path: "{{ valkey_unix_socket_path }}"
@@ -102,13 +136,6 @@
     (valkey_socket_ping_noauth.stdout != "PONG") and
     ("NOAUTH" in (valkey_socket_ping_noauth.stdout + valkey_socket_ping_noauth.stderr) or valkey_socket_ping_noauth.rc != 0)

-- name: Test Valkey connectivity (TCP)
-  command: redis-cli -h {{ valkey_bind }} -p {{ valkey_port }} -a {{ valkey_password }} ping
-  register: valkey_ping_result_tcp
-  changed_when: false
-  failed_when: valkey_ping_result_tcp.stdout != "PONG"
-  when: valkey_service_state == "started" and not valkey_unix_socket_enabled
-
 - name: Test Valkey connectivity (Unix Socket)
   command: redis-cli -s {{ valkey_unix_socket_path }} -a {{ valkey_password }} ping
   register: valkey_ping_result_socket
diff --git a/roles/valkey/templates/valkey-socket-fix.service.j2 b/roles/valkey/templates/valkey-socket-fix.service.j2
new file mode 100644
index 0000000..3560b13
--- /dev/null
+++ b/roles/valkey/templates/valkey-socket-fix.service.j2
@@ -0,0 +1,15 @@
+[Unit]
+Description=Fix Valkey socket group ownership and permissions
+BindsTo=valkey.service
+After=valkey.service
+
+[Service]
+Type=oneshot
+# Wait for socket to exist (max 10 seconds)
+ExecStart=/bin/sh -c 'i=0; while [ ! -S {{ valkey_unix_socket_path }} ] && [ $i -lt 100 ]; do sleep 0.1; i=$((i+1)); done'
+ExecStart=/bin/chgrp {{ valkey_client_group }} {{ valkey_unix_socket_path }}
+ExecStart=/bin/chmod 770 {{ valkey_unix_socket_path }}
+RemainAfterExit=yes
+
+[Install]
+WantedBy=multi-user.target
diff --git a/roles/valkey/templates/valkey.conf.j2 b/roles/valkey/templates/valkey.conf.j2
index 15b8308..afc22aa 100644
--- a/roles/valkey/templates/valkey.conf.j2
+++ b/roles/valkey/templates/valkey.conf.j2
@@ -8,22 +8,19 @@
 # Network Configuration
 # =================================================================

-# Bind to localhost only for security (like PostgreSQL)
+# Socket-only mode - TCP disabled for security
+{% if valkey_bind %}
 bind {{ valkey_bind }}
-
-# Valkey port
+{% endif %}
 port {{ valkey_port }}

 {% if valkey_unix_socket_enabled %}
 # Unix Socket Configuration
 unixsocket {{ valkey_unix_socket_path }}
 unixsocketperm {{ valkey_unix_socket_perm }}
-
-# Enable both TCP and Unix socket (for compatibility during transition)
-# To disable TCP completely, comment out the port line above
 {% endif %}

-# Protected mode - requires authentication
+# Protected mode
 protected-mode {{ 'yes' if valkey_protected_mode else 'no' }}

 # Connection timeout
diff --git a/roles/valkey/templates/valkey.service.j2 b/roles/valkey/templates/valkey.service.j2
index 70b052b..14ae375 100644
--- a/roles/valkey/templates/valkey.service.j2
+++ b/roles/valkey/templates/valkey.service.j2
@@ -8,6 +8,9 @@
 Description=Valkey (Redis-compatible) Key-Value Store
 Documentation=https://valkey.io/
 After=network.target
 Wants=network-online.target
+{% if valkey_unix_socket_enabled and valkey_client_group_create %}
+Wants=valkey-socket-fix.service
+{% endif %}

 [Service]
 Type=notify
diff --git a/site.yml b/site.yml
index 346fc93..d310b49 100644
--- a/site.yml
+++ b/site.yml
@@ -21,7 +21,7 @@
     # Application services
     # - role: sigvild-gallery
     #   tags: ['sigvild', 'gallery', 'wedding']
-    - role: gitea
-      tags: ['gitea', 'git', 'development']
+    # - role: gitea
+    #   tags: ['gitea', 'git', 'development']

     - role: authentik
       tags: ['authentik']