Migrate to rootful container architecture with infrastructure fact pattern

Major architectural change from rootless user services to system-level (rootful)
containers to enable group-based Unix socket access for containerized applications.

Infrastructure Changes:
- PostgreSQL: Export postgres-clients group GID as Ansible fact
- Valkey: Export valkey-clients group GID as Ansible fact
- Valkey: Add socket-fix service to maintain correct socket group ownership
- Both: Set socket directories to 770 with client group ownership
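
The producer half of the fact pattern is small. A rough sketch using the
getent module (the tasks in this commit use getent via shell instead; see the
PostgreSQL and Valkey diffs below):

    - name: Look up the client group entry
      ansible.builtin.getent:
        database: group
        key: "{{ postgresql_client_group }}"

    - name: Publish the GID as a fact for application roles
      ansible.builtin.set_fact:
        # getent_group maps each group name to [password, gid, members],
        # so index 1 is the numeric GID.
        postgresql_client_group_gid: "{{ getent_group[postgresql_client_group][1] }}"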

Authentik Role Refactoring:
- Remove rootless container configuration (subuid/subgid, lingering, user systemd)
- Deploy Quadlet files to /etc/containers/systemd/ (system-level)
- Use dynamic GID facts in container PodmanArgs (--group-add)
- Simplify user creation to system user with infrastructure group membership
- Update handlers for system scope service management
- Remove unnecessary container security options (no user namespace isolation)
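
Condensed, the user setup now reduces to one task that grants the
infrastructure client groups directly (illustrative; the full task list is in
the user.yml diff below):

    - name: Create authentik system user with infrastructure group membership
      ansible.builtin.user:
        name: "{{ authentik_user }}"
        group: "{{ authentik_group }}"
        groups:
          - "{{ postgresql_client_group }}"
          - "{{ valkey_client_group }}"
        append: true
        system: true
        home: "{{ authentik_home }}"
        create_home: true
      # No subuid/subgid ranges, no loginctl enable-linger, no user-scope
      # daemon_reload -- none of that is needed at system scope.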

Container Template Changes:
- Pod: Remove --userns args, change WantedBy to multi-user.target
- Containers: Replace Annotation with PodmanArgs using dynamic GIDs
- Remove /dev/shm mounts and SecurityLabelDisable (not needed for rootful)
- Change WantedBy to multi-user.target for system services
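
Before starting anything, the rendered units can be checked with the Quadlet
generator's dry-run mode. A hedged sketch (the generator path below is the
usual location but may differ by distribution):

    - name: Dry-run the Quadlet generator to inspect rendered units
      ansible.builtin.command: >-
        /usr/lib/systemd/system-generators/podman-system-generator --dryrun
      register: quadlet_dryrun
      changed_when: false
      # Prints the systemd units Quadlet would generate from
      # /etc/containers/systemd/ without installing or starting them.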

Documentation Updates:
- Add ADR-005: Rootful Containers with Infrastructure Fact Pattern
- Update ADR-003: Podman + systemd for system-level deployment
- Update authentik-deployment-guide.md for system scope commands
- Update service-integration-guide.md with rootful pattern examples
- Document discarded rootless approach and rationale

Why Rootful Succeeds:
- Direct UID/GID mapping preserves supplementary groups
- Container process groups match host socket group ownership
- No user namespace remapping breaking permissions
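
One way to confirm this on a live deployment is to list the groups the
containerized process actually carries. A sketch, assuming the Quadlet unit
sets ContainerName=authentik-server (otherwise Quadlet names the container
systemd-authentik-server) and that the image ships the id utility:

    - name: Show effective groups inside the server container
      ansible.builtin.command: podman exec {{ authentik_container_server_name }} id
      register: container_id
      changed_when: false
      # The output should list the postgres-clients and valkey-clients GIDs
      # exported by the infrastructure roles.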

Why Rootless Failed (Discarded):
- User namespace UID/GID remapping broke group-based socket access
- Supplementary groups were remapped into the subgid range and no longer matched host socket group ownership
- Even with --userns=host and keep_original_groups, permissions failed

Pattern Established:
- Infrastructure roles create client groups and export GID facts
- Application roles validate facts and consume in container templates
- Rootful containers run as dedicated users with --group-add for socket access
- System-level deployment provides standard systemd service management
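
For a new application role, the consumer half is a guard plus ordinary
template variables. A minimal sketch (the myapp names are placeholders):

    - name: Fail fast if infrastructure facts are missing
      ansible.builtin.assert:
        that:
          - postgresql_client_group_gid is defined
          - valkey_client_group_gid is defined
        fail_msg: Run the PostgreSQL and Valkey roles first so the GID facts exist.

    - name: Render the Quadlet container unit with the exported GIDs
      ansible.builtin.template:
        src: myapp.container.j2          # hypothetical template
        dest: /etc/containers/systemd/myapp.container
        mode: "0644"
      # The template carries a line such as:
      #   PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}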

Deployment Validated:
- Services in /system.slice/ ✓
- Process groups: 961 (valkey-clients), 962 (postgres-clients), 966 (authentik) ✓
- Socket permissions: 770 with client groups ✓
- HTTP endpoint responding ✓
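
The same checks can be scripted as post-deployment tasks. A sketch (the
/-/health/live/ path is authentik's liveness endpoint as documented upstream;
adjust if the health API differs for the deployed version):

    - name: Verify the pod service landed in the system slice
      ansible.builtin.command: systemctl show -p ControlGroup authentik-pod.service
      register: pod_cgroup
      changed_when: false
      failed_when: "'/system.slice/' not in pod_cgroup.stdout"

    - name: Verify socket directory permissions match the expected 0770
      ansible.builtin.stat:
        path: "{{ postgresql_unix_socket_directories }}"
      register: pg_socket_dir
      failed_when: pg_socket_dir.stat.mode != '0770'

    - name: Verify the HTTP endpoint responds
      ansible.builtin.uri:
        url: "http://127.0.0.1:{{ authentik_http_port }}/-/health/live/"
        status_code: 200
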
2025-12-14 16:56:50 +01:00
parent 9e570ac2a3
commit 3506e55016
21 changed files with 587 additions and 288 deletions

View File

@@ -82,24 +82,9 @@ authentik_default_admin_password: "{{ vault_authentik_admin_password }}"
authentik_container_server_name: "authentik-server"
authentik_container_worker_name: "authentik-worker"
# Quadlet service directories (USER SCOPE)
authentik_quadlet_dir: "{{ authentik_user_quadlet_dir }}"
authentik_user_quadlet_dir: "{{ authentik_home }}/.config/containers/systemd"
# User session variables (set dynamically during deployment)
authentik_uid: ""
# =================================================================
# User Namespace Configuration
# =================================================================
# Subuid/subgid ranges for authentik user containers
# Range: 200000-265535 (65536 IDs)
authentik_subuid_start: 200000
authentik_subuid_size: 65536
authentik_subgid_start: 200000
authentik_subgid_size: 65536
# =================================================================
# Caddy Integration
# =================================================================
@@ -115,7 +100,9 @@ caddy_user: "caddy"
# PostgreSQL socket configuration (managed by postgresql role)
postgresql_unix_socket_directories: "/var/run/postgresql"
postgresql_client_group: "postgres-clients"
# Valkey socket configuration (managed by valkey role)
valkey_unix_socket_path: "/var/run/valkey/valkey.sock"
valkey_password: "{{ vault_valkey_password }}"
valkey_client_group: "valkey-clients"

View File

@@ -1,14 +1,9 @@
---
# Authentik Service Handlers (User Scope)
# Authentik Service Handlers (System Scope)
- name: reload systemd user
- name: reload systemd
systemd:
daemon_reload: true
scope: user
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
- name: reload caddy
systemd:
@@ -19,44 +14,24 @@
systemd:
name: "authentik-pod"
state: restarted
scope: user
daemon_reload: true
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
- name: restart authentik server
systemd:
name: "{{ authentik_container_server_name }}"
state: restarted
scope: user
daemon_reload: true
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
- name: restart authentik worker
systemd:
name: "{{ authentik_container_worker_name }}"
state: restarted
scope: user
daemon_reload: true
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
- name: stop authentik services
systemd:
name: "{{ item }}"
state: stopped
scope: user
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
loop:
- "{{ authentik_container_worker_name }}"
- "{{ authentik_container_server_name }}"
@@ -66,12 +41,7 @@
systemd:
name: "{{ item }}"
state: started
scope: user
daemon_reload: true
become: true
become_user: "{{ authentik_user }}"
environment:
XDG_RUNTIME_DIR: "/run/user/{{ authentik_uid }}"
loop:
- "authentik-pod"
- "{{ authentik_container_server_name }}"

View File

@@ -1,19 +1,12 @@
---
# Cache setup for Authentik - Self-contained socket permissions
- name: Add authentik user to valkey group for socket access
- name: Add authentik user to Valkey client group for socket access
user:
name: "{{ authentik_user }}"
groups: valkey
groups: "{{ valkey_client_group }}"
append: true
- name: Ensure authentik can access Valkey socket directory
file:
path: "{{ valkey_unix_socket_path | dirname }}"
mode: '0770'
group: valkey
become: true
- name: Test Valkey socket connectivity
command: >
redis-cli -s {{ valkey_unix_socket_path }}

View File

@@ -1,19 +1,12 @@
---
# Database setup for Authentik - Self-contained socket permissions
- name: Add authentik user to postgres group for socket access
- name: Add authentik user to PostgreSQL client group for socket access
user:
name: "{{ authentik_user }}"
groups: postgres
groups: "{{ postgresql_client_group }}"
append: true
- name: Ensure authentik can access PostgreSQL socket directory
file:
path: "{{ postgresql_unix_socket_directories }}"
mode: '0770'
group: postgres
become: true
- name: Test PostgreSQL socket connectivity
postgresql_ping:
login_unix_socket: "{{ postgresql_unix_socket_directories }}"

View File

@@ -2,6 +2,16 @@
# Authentik Authentication Role - Main Tasks
# Self-contained deployment with Podman and Unix sockets
- name: Validate infrastructure facts are available
assert:
that:
- postgresql_client_group_gid is defined
- valkey_client_group_gid is defined
fail_msg: |
Required infrastructure facts are not available.
Ensure PostgreSQL and Valkey roles have run and exported client group GIDs.
tags: [validation]
- name: Setup authentik user and container namespaces
include_tasks: user.yml
tags: [user, setup]
@@ -18,8 +28,6 @@
containers.podman.podman_image:
name: "{{ authentik_image }}:{{ authentik_version }}"
state: present
become: true
become_user: "{{ authentik_user }}"
tags: [containers, image-pull]
- name: Create media directory structure
@@ -48,29 +56,23 @@
- restart authentik worker
tags: [config]
- name: Create Quadlet systemd directory (user scope)
- name: Create Quadlet systemd directory (system scope)
file:
path: "{{ authentik_quadlet_dir }}"
path: /etc/containers/systemd
state: directory
owner: "{{ authentik_user }}"
group: "{{ authentik_group }}"
mode: '0755'
- name: Deploy Quadlet pod and container files (user scope)
- name: Deploy Quadlet pod and container files (system scope)
template:
src: "{{ item.src }}"
dest: "{{ authentik_quadlet_dir }}/{{ item.dest }}"
owner: "{{ authentik_user }}"
group: "{{ authentik_group }}"
dest: "/etc/containers/systemd/{{ item.dest }}"
mode: '0644'
loop:
- { src: 'authentik.pod', dest: 'authentik.pod' }
- { src: 'authentik-server.container', dest: 'authentik-server.container' }
- { src: 'authentik-worker.container', dest: 'authentik-worker.container' }
become: true
become_user: "{{ authentik_user }}"
notify:
- reload systemd user
- reload systemd
- restart authentik pod
- restart authentik server
- restart authentik worker
@@ -108,22 +110,12 @@
timeout: 30
when: valkey_unix_socket_enabled
- name: Ensure systemd user session is started
systemd:
name: "user@{{ authentik_uid }}.service"
state: started
scope: system
register: user_session_start
- name: Enable and start Authentik pod (user scope)
- name: Enable and start Authentik pod (system scope)
systemd:
name: "authentik-pod"
enabled: "{{ authentik_service_enabled }}"
state: "{{ authentik_service_state }}"
scope: user
daemon_reload: true
become: true
become_user: "{{ authentik_user }}"
tags: [containers, service]
- name: Wait for Authentik to be ready

View File

@@ -10,25 +10,13 @@
user:
name: "{{ authentik_user }}"
group: "{{ authentik_group }}"
groups: "{{ [postgresql_client_group, valkey_client_group] }}"
system: true
shell: /bin/bash
home: "{{ authentik_home }}"
create_home: true
comment: "Authentik authentication service"
- name: Set up subuid for authentik user
lineinfile:
path: /etc/subuid
line: "{{ authentik_user }}:{{ authentik_subuid_start }}:{{ authentik_subuid_size }}"
create: true
mode: '0644'
- name: Set up subgid for authentik user
lineinfile:
path: /etc/subgid
line: "{{ authentik_user }}:{{ authentik_subgid_start }}:{{ authentik_subgid_size }}"
create: true
mode: '0644'
append: true
- name: Create authentik directories
file:
@@ -39,27 +27,10 @@
mode: '0755'
loop:
- "{{ authentik_home }}"
- "{{ authentik_home }}/.config"
- "{{ authentik_home }}/.config/systemd"
- "{{ authentik_home }}/.config/systemd/user"
- "{{ authentik_home }}/.config/containers"
- "{{ authentik_home }}/.config/containers/systemd"
- "{{ authentik_home }}/data"
- "{{ authentik_home }}/media"
- "{{ authentik_home }}/logs"
- name: Enable lingering for authentik user
command: loginctl enable-linger {{ authentik_user }}
args:
creates: "/var/lib/systemd/linger/{{ authentik_user }}"
- name: Initialize user systemd for authentik
systemd:
daemon_reload: true
scope: user
become: true
become_user: "{{ authentik_user }}"
- name: Get authentik user UID and GID for container configuration
shell: |
echo "uid=$(id -u {{ authentik_user }})"

View File

@@ -9,12 +9,7 @@ Image={{ authentik_image }}:{{ authentik_version }}
Pod=authentik.pod
EnvironmentFile={{ authentik_home }}/.env
User={{ authentik_uid }}:{{ authentik_gid }}
Annotation=run.oci.keep_original_groups=1
# Security configuration for shared memory and IPC
Volume=/dev/shm:/dev/shm:rw
SecurityLabelDisable=true
AddCapability=IPC_OWNER
PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}
# Logging configuration
LogDriver=k8s-file
@@ -34,4 +29,4 @@ Restart=always
TimeoutStartSec=300
[Install]
WantedBy=default.target
WantedBy=multi-user.target

View File

@@ -9,12 +9,7 @@ Image={{ authentik_image }}:{{ authentik_version }}
Pod=authentik.pod
EnvironmentFile={{ authentik_home }}/.env
User={{ authentik_uid }}:{{ authentik_gid }}
Annotation=run.oci.keep_original_groups=1
# Security configuration for shared memory and IPC
Volume=/dev/shm:/dev/shm:rw
SecurityLabelDisable=true
AddCapability=IPC_OWNER
PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}
# Logging configuration
LogDriver=k8s-file
@@ -34,4 +29,4 @@ Restart=always
TimeoutStartSec=300
[Install]
WantedBy=default.target
WantedBy=multi-user.target

View File

@@ -4,11 +4,10 @@ Description=Authentik Authentication Pod
[Pod]
PublishPort=0.0.0.0:{{ authentik_http_port }}:{{ authentik_http_port }}
ShmSize=256m
PodmanArgs=
[Service]
Restart=always
TimeoutStartSec=900
[Install]
WantedBy=default.target
WantedBy=multi-user.target

View File

@@ -22,6 +22,10 @@ postgresql_unix_socket_enabled: true
postgresql_unix_socket_directories: "/var/run/postgresql"
postgresql_unix_socket_permissions: "0770"
# Group-Based Access Control
postgresql_client_group: "postgres-clients"
postgresql_client_group_create: true
# Authentication
postgresql_auth_method: "scram-sha-256"

View File

@@ -11,6 +11,19 @@
name: python-psycopg2
state: present
- name: Create PostgreSQL client access group
group:
name: "{{ postgresql_client_group }}"
system: true
when: postgresql_client_group_create
- name: Ensure postgres user is in client group
user:
name: postgres
groups: "{{ postgresql_client_group }}"
append: true
when: postgresql_client_group_create
- name: Check if PostgreSQL data directory exists and is initialized
stat:
path: "/var/lib/postgres/data/PG_VERSION"
@@ -72,10 +85,21 @@
path: "{{ postgresql_unix_socket_directories }}"
state: directory
owner: postgres
group: postgres
group: "{{ postgresql_client_group }}"
mode: '0770'
when: postgresql_unix_socket_enabled
- name: Get PostgreSQL client group GID for containerized applications
shell: "getent group {{ postgresql_client_group }} | cut -d: -f3"
register: postgresql_client_group_lookup
changed_when: false
when: postgresql_client_group_create
- name: Set PostgreSQL client group GID as fact
set_fact:
postgresql_client_group_gid: "{{ postgresql_client_group_lookup.stdout }}"
when: postgresql_client_group_create and postgresql_client_group_lookup.stdout is defined
- name: Enable and start PostgreSQL service
systemd:
name: postgresql

View File

@@ -7,14 +7,11 @@
# Unix Socket Configuration
unix_socket_directories = '{{ postgresql_unix_socket_directories }}'
unix_socket_permissions = {{ postgresql_unix_socket_permissions }}
unix_socket_group = '{{ postgresql_client_group }}'
{% endif %}
listen_addresses = '{{ postgresql_listen_addresses }}'
port = {{ postgresql_port }}
# Unix socket configuration
unix_socket_directories = '{{ postgresql_unix_socket_directories | default("/run/postgresql") }}'
unix_socket_permissions = {{ postgresql_unix_socket_permissions | default("0777") }}
# Basic Performance (only override if needed)
max_connections = {{ postgresql_max_connections }}
shared_buffers = {{ postgresql_shared_buffers }}

View File

@@ -13,20 +13,20 @@
valkey_service_enabled: true
valkey_service_state: "started"
# Network Security (Unix socket with localhost TCP for compatibility)
valkey_bind: "127.0.0.1" # Listen on localhost for apps that don't support Unix sockets
valkey_port: 6379 # Keep TCP port for compatibility
valkey_protected_mode: true # Enable protection for TCP
# Unix socket configuration (also enabled for better performance)
valkey_unixsocket: "/run/valkey/valkey.sock"
valkey_unixsocketperm: 777 # Allows container access
# Network Security (Unix socket only - no TCP)
valkey_bind: "" # Disable TCP, socket-only mode
valkey_port: 0 # Disable TCP port
valkey_protected_mode: false # Not needed for socket-only mode
# Unix Socket Configuration
valkey_unix_socket_enabled: true
valkey_unix_socket_path: "/var/run/valkey/valkey.sock"
valkey_unix_socket_perm: "770"
# Group-Based Access Control
valkey_client_group: "valkey-clients"
valkey_client_group_create: true
# Authentication
valkey_password: "{{ vault_valkey_password }}"

View File

@@ -9,6 +9,19 @@
# Note: Arch Linux's redis package (which provides Valkey) creates the 'valkey' user automatically
# We don't need to create users - just ensure data directory permissions
- name: Create Valkey client access group
group:
name: "{{ valkey_client_group }}"
system: true
when: valkey_client_group_create
- name: Ensure valkey user is in client group
user:
name: valkey
groups: "{{ valkey_client_group }}"
append: true
when: valkey_client_group_create
- name: Create Valkey configuration directory
file:
path: /etc/valkey
@@ -33,17 +46,8 @@
path: "{{ valkey_unix_socket_path | dirname }}"
state: directory
owner: valkey
group: valkey
mode: '0775'
when: valkey_unix_socket_enabled
- name: Ensure socket directory is accessible
file:
path: "{{ valkey_unix_socket_path | dirname }}"
owner: valkey
group: valkey
mode: '0775'
recurse: yes
group: "{{ valkey_client_group }}"
mode: '0770'
when: valkey_unix_socket_enabled
- name: Deploy Valkey configuration file
@@ -56,6 +60,43 @@
backup: yes
notify: restart valkey
- name: Deploy Valkey systemd service file (with socket group management)
template:
src: valkey.service.j2
dest: /etc/systemd/system/valkey.service
mode: '0644'
notify:
- reload systemd
- restart valkey
when: valkey_client_group_create
- name: Deploy Valkey socket group fix service
template:
src: valkey-socket-fix.service.j2
dest: /etc/systemd/system/valkey-socket-fix.service
mode: '0644'
notify:
- reload systemd
when: valkey_client_group_create and valkey_unix_socket_enabled
- name: Enable Valkey socket group fix service
systemd:
name: valkey-socket-fix
enabled: true
daemon_reload: true
when: valkey_client_group_create and valkey_unix_socket_enabled
- name: Get Valkey client group GID for containerized applications
shell: "getent group {{ valkey_client_group }} | cut -d: -f3"
register: valkey_client_group_lookup
changed_when: false
when: valkey_client_group_create
- name: Set Valkey client group GID as fact
set_fact:
valkey_client_group_gid: "{{ valkey_client_group_lookup.stdout }}"
when: valkey_client_group_create and valkey_client_group_lookup.stdout is defined
- name: Enable and start Valkey service
systemd:
name: valkey
@@ -64,13 +105,6 @@
daemon_reload: true
register: valkey_service_result
- name: Wait for Valkey to be ready (TCP)
wait_for:
port: "{{ valkey_port }}"
host: "{{ valkey_bind }}"
timeout: 30
when: valkey_service_state == "started" and not valkey_unix_socket_enabled
- name: Wait for Valkey socket file to exist
wait_for:
path: "{{ valkey_unix_socket_path }}"
@@ -102,13 +136,6 @@
(valkey_socket_ping_noauth.stdout != "PONG") and
("NOAUTH" in (valkey_socket_ping_noauth.stdout + valkey_socket_ping_noauth.stderr) or valkey_socket_ping_noauth.rc != 0)
- name: Test Valkey connectivity (TCP)
command: redis-cli -h {{ valkey_bind }} -p {{ valkey_port }} -a {{ valkey_password }} ping
register: valkey_ping_result_tcp
changed_when: false
failed_when: valkey_ping_result_tcp.stdout != "PONG"
when: valkey_service_state == "started" and not valkey_unix_socket_enabled
- name: Test Valkey connectivity (Unix Socket)
command: redis-cli -s {{ valkey_unix_socket_path }} -a {{ valkey_password }} ping
register: valkey_ping_result_socket

View File

@@ -0,0 +1,15 @@
[Unit]
Description=Fix Valkey socket group ownership and permissions
BindsTo=valkey.service
After=valkey.service
[Service]
Type=oneshot
# Wait for socket to exist (max 10 seconds)
ExecStart=/bin/sh -c 'i=0; while [ ! -S {{ valkey_unix_socket_path }} ] && [ $i -lt 100 ]; do sleep 0.1; i=$((i+1)); done'
ExecStart=/bin/chgrp {{ valkey_client_group }} {{ valkey_unix_socket_path }}
ExecStart=/bin/chmod 770 {{ valkey_unix_socket_path }}
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target

View File

@@ -8,22 +8,19 @@
# Network Configuration
# =================================================================
# Bind to localhost only for security (like PostgreSQL)
# Socket-only mode - TCP disabled for security
{% if valkey_bind %}
bind {{ valkey_bind }}
# Valkey port
{% endif %}
port {{ valkey_port }}
{% if valkey_unix_socket_enabled %}
# Unix Socket Configuration
unixsocket {{ valkey_unix_socket_path }}
unixsocketperm {{ valkey_unix_socket_perm }}
# Enable both TCP and Unix socket (for compatibility during transition)
# To disable TCP completely, comment out the port line above
{% endif %}
# Protected mode - requires authentication
# Protected mode
protected-mode {{ 'yes' if valkey_protected_mode else 'no' }}
# Connection timeout

View File

@@ -8,6 +8,9 @@ Description=Valkey (Redis-compatible) Key-Value Store
Documentation=https://valkey.io/
After=network.target
Wants=network-online.target
{% if valkey_unix_socket_enabled and valkey_client_group_create %}
Wants=valkey-socket-fix.service
{% endif %}
[Service]
Type=notify