Migrate to rootful container architecture with infrastructure fact pattern

Major architectural change from rootless user services to system-level (rootful) containers to enable group-based Unix socket access for containerized applications.

Infrastructure Changes:
- PostgreSQL: Export postgres-clients group GID as Ansible fact
- Valkey: Export valkey-clients group GID as Ansible fact
- Valkey: Add socket-fix service to maintain correct socket group ownership
- Both: Set socket directories to 770 with client group ownership

Authentik Role Refactoring:
- Remove rootless container configuration (subuid/subgid, lingering, user systemd)
- Deploy Quadlet files to /etc/containers/systemd/ (system-level)
- Use dynamic GID facts in container PodmanArgs (--group-add)
- Simplify user creation to system user with infrastructure group membership
- Update handlers for system scope service management
- Remove unnecessary container security options (no user namespace isolation)

Container Template Changes:
- Pod: Remove --userns args, change WantedBy to multi-user.target
- Containers: Replace Annotation with PodmanArgs using dynamic GIDs
- Remove /dev/shm mounts and SecurityLabelDisable (not needed for rootful)
- Change WantedBy to multi-user.target for system services

Documentation Updates:
- Add ADR-005: Rootful Containers with Infrastructure Fact Pattern
- Update ADR-003: Podman + systemd for system-level deployment
- Update authentik-deployment-guide.md for system scope commands
- Update service-integration-guide.md with rootful pattern examples
- Document discarded rootless approach and rationale

Why Rootful Succeeds:
- Direct UID/GID mapping preserves supplementary groups
- Container process groups match host socket group ownership
- No user namespace remapping breaking permissions

Why Rootless Failed (Discarded):
- User namespace UID/GID remapping broke group-based socket access
- Supplementary groups remapped into subgid range didn't match socket ownership
- Even with --userns=host and keep_original_groups, permissions failed

Pattern Established:
- Infrastructure roles create client groups and export GID facts
- Application roles validate facts and consume in container templates
- Rootful containers run as dedicated users with --group-add for socket access
- System-level deployment provides standard systemd service management

Deployment Validated:
- Services in /system.slice/ ✓
- Process groups: 961 (valkey-clients), 962 (postgres-clients), 966 (authentik) ✓
- Socket permissions: 770 with client groups ✓
- HTTP endpoint responding ✓
@@ -8,6 +8,7 @@ This document records the significant architectural decisions made in the rick-i
- [ADR-002: Unix Socket IPC Architecture](#adr-002-unix-socket-ipc-architecture)
- [ADR-003: Podman + systemd Container Orchestration](#adr-003-podman--systemd-container-orchestration)
- [ADR-004: Forward Authentication Security Model](#adr-004-forward-authentication-security-model)
- [ADR-005: Rootful Containers with Infrastructure Fact Pattern](#adr-005-rootful-containers-with-infrastructure-fact-pattern)

---

@@ -270,8 +271,9 @@ podman exec authentik-server id

**Status**: ✅ Accepted
**Date**: December 2025
**Updated**: December 2025 (System-level deployment pattern)
**Deciders**: Infrastructure Team
**Technical Story**: Container orchestration solution for rootless, secure application deployment with systemd integration.
**Technical Story**: Container orchestration solution for secure application deployment with systemd integration.

### Context

@@ -284,40 +286,42 @@ Container orchestration options for a single-node infrastructure:

### Decision

We will use **Podman with systemd integration (Quadlet)** for container orchestration.
We will use **Podman with systemd integration (Quadlet)** for container orchestration, deployed as system-level services (rootful containers running as dedicated users).

### Rationale

#### Security Advantages

- **Rootless Architecture**: No privileged daemon required
- **No Daemon Required**: No privileged daemon attack surface
```bash
# Docker: Requires root daemon
sudo systemctl status docker

# Podman: Rootless operation
systemctl --user status podman
# Podman: Daemonless operation
podman ps # No daemon needed
```
- **No Daemon Attack Surface**: No long-running privileged process
- **User Namespace Isolation**: Each user's containers are isolated
- **Dedicated Service Users**: Containers run as dedicated system users (not root)
- **Group-Based Access Control**: Unix group membership controls infrastructure access
- **SELinux Integration**: Better SELinux support than Docker

#### systemd Integration Benefits

- **Native Service Management**: Containers as systemd services
- **Native Service Management**: Containers as system-level systemd services
```ini
# Quadlet file: ~/.config/containers/systemd/authentik.pod
# Quadlet file: /etc/containers/systemd/authentik.pod
[Unit]
Description=Authentik Authentication Pod

[Pod]
PublishPort=127.0.0.1:9000:9000
PublishPort=0.0.0.0:9000:9000
ShmSize=256m

[Service]
Restart=always
TimeoutStartSec=900

[Install]
WantedBy=default.target
WantedBy=multi-user.target
```
- **Dependency Management**: systemd handles service dependencies
- **Resource Control**: systemd resource limits and monitoring
@@ -327,11 +331,11 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr

- **Familiar Tooling**: Standard systemd commands
```bash
systemctl --user status authentik-pod
systemctl --user restart authentik-server
journalctl --user -u authentik-server -f
systemctl status authentik-pod
systemctl restart authentik-server
journalctl -u authentik-server -f
```
- **Boot Integration**: Services start automatically with user sessions
- **Boot Integration**: Services start automatically at system boot
- **Resource Monitoring**: systemd resource tracking
- **Configuration Management**: Declarative Quadlet files

@@ -345,7 +349,7 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr

```
┌─────────────────────────────────────────────────────────────┐
│ systemd User Session (authentik) │
│ systemd System Services (/system.slice/) │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────┐ │
│ │ authentik-pod │ │ authentik-server│ │authentik-worker│ │
@@ -355,15 +359,23 @@ We will use **Podman with systemd integration (Quadlet)** for container orchestr
│ └────────────────────┼────────────────────┘ │
│ │ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Podman Pod (rootless) │ │
│ │ Podman Pod (rootful, dedicated user) │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────────────────────┐ │ │
│ │ │ Server Container│ │ Worker Container │ │ │
│ │ │ UID: 963 (host) │ │ UID: 963 (host) │ │ │
│ │ │ Groups: postgres│ │ Groups: postgres,valkey │ │ │
│ │ │ valkey │ │ │ │ │
│ │ │ User: 966:966 │ │ User: 966:966 │ │ │
│ │ │ Groups: 961,962 │ │ Groups: 961,962 │ │ │
│ │ │ (valkey,postgres)│ │ (valkey,postgres) │ │ │
│ │ └─────────────────┘ └─────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
│ Group-based access to infrastructure
▼
┌─────────────────────────────────────────────────────────────┐
│ Infrastructure Services │
│ PostgreSQL: /var/run/postgresql (postgres:postgres-clients)│
│ Valkey: /var/run/valkey (valkey:valkey-clients) │
└─────────────────────────────────────────────────────────────┘
```

@@ -396,19 +408,22 @@ Requires=authentik-pod.service
ContainerName=authentik-server
Image=ghcr.io/goauthentik/server:2025.10
Pod=authentik.pod
EnvironmentFile=%h/.env
User=%i:%i
Annotation=run.oci.keep_original_groups=1
EnvironmentFile=/opt/authentik/.env
User=966:966
PodmanArgs=--group-add 962 --group-add 961

# Volume mounts for sockets
# Volume mounts for sockets and data
Volume=/opt/authentik/media:/media
Volume=/opt/authentik/data:/data
Volume=/var/run/postgresql:/var/run/postgresql:Z
Volume=/var/run/valkey:/var/run/valkey:Z

[Service]
Restart=always
TimeoutStartSec=300

[Install]
WantedBy=default.target
WantedBy=multi-user.target
```

### User Management Strategy
@@ -418,20 +433,17 @@ WantedBy=default.target
- name: Create service user
  user:
    name: authentik
    group: authentik
    groups: [postgres-clients, valkey-clients]
    system: true
    shell: /bin/bash
    home: /opt/authentik
    create_home: true

- name: Add to infrastructure groups
  user:
    name: authentik
    groups: [postgres, valkey]
    append: true

- name: Enable lingering (services persist)
  command: loginctl enable-linger authentik
```

**Note**: Infrastructure roles (PostgreSQL, Valkey) export client group GIDs as Ansible facts (`postgresql_client_group_gid`, `valkey_client_group_gid`) which are consumed by application container templates for dynamic `--group-add` arguments.

### Consequences

#### Positive
@@ -457,15 +469,20 @@ WantedBy=default.target
### Technical Implementation

```bash
# Container management (as service user)
systemctl --user status authentik-pod
systemctl --user restart authentik-server
# Container management (system scope)
systemctl status authentik-pod
systemctl restart authentik-server
podman ps
podman logs authentik-server

# Resource monitoring
systemctl --user show authentik-server --property=MemoryCurrent
journalctl --user -u authentik-server -f
systemctl show authentik-server --property=MemoryCurrent
journalctl -u authentik-server -f

# Verify container groups (look up the container's main PID directly,
# so the check cannot match the grep process itself)
grep Groups "/proc/$(podman inspect --format '{{.State.Pid}}' authentik-server)/status"
# Output: Groups: 961 962 966
```

### Alternatives Considered
@@ -763,6 +780,268 @@ app.jnss.me {

---

## ADR-005: Rootful Containers with Infrastructure Fact Pattern

**Status**: ✅ Accepted
**Date**: December 2025
**Deciders**: Infrastructure Team
**Technical Story**: Enable containerized applications to access native infrastructure services (PostgreSQL, Valkey) via Unix sockets with group-based permissions.

### Context

Containerized applications need to access infrastructure services (PostgreSQL, Valkey) through Unix sockets with filesystem-based permission controls. The permission model requires:

1. **Socket directories** owned by service groups (`postgres-clients`, `valkey-clients`)
2. **Application users** added to these groups for access
3. **Container processes** that preserve this group membership so they can open the sockets

Two approaches were evaluated:

1. **Rootless containers (user namespace)**: Containers run in a user namespace with UID/GID remapping
2. **Rootful containers (system services)**: Containers run as dedicated system users without user namespace isolation

### Decision

We will use **rootful containers deployed as system-level systemd services** with an **Infrastructure Fact Pattern** where infrastructure roles export client group GIDs as Ansible facts for application consumption.

### Rationale

#### Why Rootful Succeeds

**Direct UID/GID Mapping**:
```bash
# Host: authentik user UID 966, groups: 966 (authentik), 961 (valkey-clients), 962 (postgres-clients)
# Container User=966:966 with PodmanArgs=--group-add 961 --group-add 962

# Inside container:
id
# uid=966(authentik) gid=966(authentik) groups=966(authentik),961(valkey-clients),962(postgres-clients)

# Socket access works:
ls -l /var/run/postgresql/.s.PGSQL.5432
# srwxrwx--- 1 postgres postgres-clients 0 ... /var/run/postgresql/.s.PGSQL.5432
```

**Group membership preserved**: Container process has GIDs 961 and 962, matching socket group ownership.

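The group check described above can be scripted. The following is a minimal sketch and not part of the repository: `has_gid` is a hypothetical helper that takes a `Groups:` line in the format found in `/proc/<pid>/status` and reports whether a given GID is present. The sample line reuses the GIDs quoted in this ADR.

```bash
#!/bin/sh
# has_gid GROUPS_LINE GID (hypothetical helper, not from the repository)
# Succeeds when GID appears in a "Groups:" line taken from /proc/<pid>/status.
has_gid() {
    groups_line=$1
    wanted=$2
    # Strip the "Groups:" label, then compare each remaining field.
    for gid in ${groups_line#Groups:}; do
        [ "$gid" = "$wanted" ] && return 0
    done
    return 1
}

# Sample line using the GIDs from this ADR
# (961 = valkey-clients, 962 = postgres-clients, 966 = authentik).
line="Groups: 961 962 966"
for required in 961 962; do
    if has_gid "$line" "$required"; then
        echo "GID $required present"
    else
        echo "GID $required missing"
    fi
done
```

The comparison is done per field rather than with a substring match, so a GID such as 96 is not mistaken for 961.
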
#### Why Rootless Failed (Discarded Approach)

**User Namespace UID/GID Remapping**:
```bash
# Host: authentik user UID 100000, subuid range 200000-265535
# Container User=%i:%i with --userns=host --group-add=keep-groups

# User namespace remaps:
# Host UID 100000 → Container UID 100000 (root in namespace)
# Host GID 961 → Container GID 200961 (remapped into subgid range)
# Host GID 962 → Container GID 200962 (remapped into subgid range)

# Socket ownership on host:
# srwxrwx--- 1 postgres postgres-clients (GID 962)

# Container process groups: 200961, 200962 (remapped)
# Socket expects: GID 962 (not remapped)
# Result: Permission denied ❌
```

**Root cause**: User namespace supplementary group remapping breaks group-based socket access even with `--userns=host`, `--group-add=keep-groups`, and `Annotation=run.oci.keep_original_groups=1`.

### Infrastructure Fact Pattern

#### Infrastructure Roles Export GIDs

Infrastructure services create client groups and export their GIDs as Ansible facts:

```yaml
# PostgreSQL role: roles/postgresql/tasks/main.yml
- name: Create PostgreSQL client access group
  group:
    name: postgres-clients
    system: true

- name: Get PostgreSQL client group GID
  shell: "getent group postgres-clients | cut -d: -f3"
  register: postgresql_client_group_lookup
  changed_when: false

- name: Set PostgreSQL client group GID as fact
  set_fact:
    postgresql_client_group_gid: "{{ postgresql_client_group_lookup.stdout }}"
```

```yaml
# Valkey role: roles/valkey/tasks/main.yml
- name: Create Valkey client access group
  group:
    name: valkey-clients
    system: true

- name: Get Valkey client group GID
  shell: "getent group valkey-clients | cut -d: -f3"
  register: valkey_client_group_lookup
  changed_when: false

- name: Set Valkey client group GID as fact
  set_fact:
    valkey_client_group_gid: "{{ valkey_client_group_lookup.stdout }}"
```

#### Application Roles Consume Facts

Application roles validate and consume infrastructure facts:

```yaml
# Authentik role: roles/authentik/tasks/main.yml
- name: Validate infrastructure facts are available
  assert:
    that:
      - postgresql_client_group_gid is defined
      - valkey_client_group_gid is defined
    fail_msg: |
      Required infrastructure facts are not available.
      Ensure PostgreSQL and Valkey roles have run first.

- name: Create authentik user with infrastructure groups
  user:
    name: authentik
    groups: [postgres-clients, valkey-clients]
    append: true
```

```ini
# Container template: roles/authentik/templates/authentik-server.container
[Container]
User={{ authentik_uid }}:{{ authentik_gid }}
PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}
```

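One gap worth noting in the lookup above: because `getent ... | cut` is a pipeline, a missing group yields an empty string with a zero exit status, and an empty fact still satisfies `is defined`. The sketch below shows, outside Ansible, how the same `cut -d: -f3` extraction could be paired with a numeric check before the `--group-add` arguments are built. `gid_from_entry` and `group_add_args` are hypothetical helper names, and the group entries are sample data in `getent group` format.

```bash
#!/bin/sh
# gid_from_entry ENTRY (hypothetical helper)
# Prints the GID field of a "name:passwd:gid:members" line, as returned by
# `getent group`. Fails when the field is empty or not purely numeric.
gid_from_entry() {
    gid=$(printf '%s\n' "$1" | cut -d: -f3)
    case $gid in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$gid"
}

# group_add_args ENTRY... (hypothetical helper)
# Builds the "--group-add N" argument string used in PodmanArgs,
# refusing to continue if any entry lacks a valid GID.
group_add_args() {
    args=
    for entry in "$@"; do
        gid=$(gid_from_entry "$entry") || return 1
        args="${args:+$args }--group-add $gid"
    done
    printf '%s\n' "$args"
}

# Sample entries using the GIDs quoted in this ADR.
group_add_args "postgres-clients:x:962:authentik" "valkey-clients:x:961:authentik"
# prints: --group-add 962 --group-add 961
```

In the Ansible roles, an equivalent guard could be an extra `assert` condition on the fact's value; the shell version here only illustrates the check itself.
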
### Implementation Details

#### System-Level Deployment

```ini
# Quadlet files deployed to /etc/containers/systemd/ (not ~/.config/)
# Pod: /etc/containers/systemd/authentik.pod
[Unit]
Description=Authentik Authentication Pod

[Pod]
PublishPort=0.0.0.0:9000:9000
ShmSize=256m

[Service]
Restart=always

[Install]
# System target, not default.target (systemd does not support trailing comments)
WantedBy=multi-user.target
```

```ini
# Container: /etc/containers/systemd/authentik-server.container
[Container]
User=966:966
PodmanArgs=--group-add 962 --group-add 961

Volume=/var/run/postgresql:/var/run/postgresql:Z
Volume=/var/run/valkey:/var/run/valkey:Z
```

#### Service Management

```bash
# System scope (not user scope)
systemctl status authentik-pod
systemctl restart authentik-server
journalctl -u authentik-server -f

# Verify container location
systemctl status authentik-server | grep CGroup
# CGroup: /system.slice/authentik-server.service ✓
```

### Special Case: Valkey Socket Group Fix

Valkey doesn't natively support socket group configuration (unlike PostgreSQL's `unix_socket_group`). A helper service ensures correct socket permissions:

```ini
# /etc/systemd/system/valkey-socket-fix.service
[Unit]
Description=Fix Valkey socket group ownership and permissions
BindsTo=valkey.service
After=valkey.service

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'i=0; while [ ! -S /var/run/valkey/valkey.sock ] && [ $i -lt 100 ]; do sleep 0.1; i=$((i+1)); done'
ExecStart=/bin/chgrp valkey-clients /var/run/valkey/valkey.sock
ExecStart=/bin/chmod 770 /var/run/valkey/valkey.sock
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Triggered by Valkey service:

```ini
# /etc/systemd/system/valkey.service (excerpt)
[Unit]
Wants=valkey-socket-fix.service
```

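To confirm that the helper service left the socket in the expected state, the mode and group can be compared in one step. This is a minimal sketch assuming GNU coreutils `stat` (where `%a` is the octal mode and `%g` the numeric GID); `check_mode_group` is a hypothetical helper, and the demonstration runs against a temporary file so that it does not depend on a live Valkey socket.

```bash
#!/bin/sh
# check_mode_group PATH MODE GID (hypothetical helper, not from the repository)
# Succeeds when PATH has exactly the octal MODE and the numeric group GID.
# Returns 2 when PATH cannot be inspected at all.
check_mode_group() {
    actual=$(stat -c '%a %g' "$1") || return 2
    [ "$actual" = "$2 $3" ]
}

# Demonstration on a temporary file owned by the current user's group.
# On the host, with the GIDs quoted in this ADR, the call would be:
#   check_mode_group /var/run/valkey/valkey.sock 770 961
tmp=$(mktemp)
chmod 770 "$tmp"
if check_mode_group "$tmp" 770 "$(id -g)"; then
    echo "mode and group OK"
else
    echo "mode or group mismatch"
fi
rm -f "$tmp"
```

Comparing the numeric GID rather than the group name keeps the check aligned with the exported GID facts.
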
### Consequences

#### Positive

- **Socket Access Works**: Group-based permissions function correctly
- **Security**: Containers run as dedicated users (not root), no privileged daemon
- **Portability**: Dynamic GID facts work across different hosts
- **Consistency**: Same pattern for all containerized applications
- **Simplicity**: No user namespace complexity, standard systemd service management

#### Negative

- **Not "Pure" Rootless**: Deploying the systemd services requires root, even though the container processes run as dedicated users
- **Different from Docker**: Less familiar pattern than rootless user services

#### Neutral

- **System vs User Scope**: Different commands (`systemctl` vs `systemctl --user`) but equally capable
- **Deployment Location**: `/etc/containers/systemd/` vs `~/.config/` but same Quadlet functionality

### Validation

```bash
# Verify service location
systemctl status authentik-server | grep CGroup
# → /system.slice/authentik-server.service ✓

# Verify process groups (look up the container's main PID directly,
# so the check cannot match the grep process itself)
grep Groups "/proc/$(podman inspect --format '{{.State.Pid}}' authentik-server)/status"
# → Groups: 961 962 966 ✓

# Verify socket permissions
ls -l /var/run/postgresql/.s.PGSQL.5432
# → srwxrwx--- postgres postgres-clients ✓

ls -l /var/run/valkey/valkey.sock
# → srwxrwx--- valkey valkey-clients ✓

# Verify HTTP endpoint
curl -I http://127.0.0.1:9000/
# → HTTP/1.1 302 Found ✓
```

### Alternatives Considered

1. **Rootless with user namespace** - Discarded due to GID remapping breaking group-based socket access
2. **TCP-only connections** - Rejected to maintain Unix socket security and performance benefits
3. **Hardcoded GIDs** - Rejected for portability; facts provide dynamic resolution
4. **Directory permissions (777)** - Rejected for security; group-based access more restrictive

---

## Summary

These architecture decisions collectively create a robust, secure, and performant infrastructure:
@@ -771,6 +1050,7 @@ These architecture decisions collectively create a robust, secure, and performan
- **Unix Sockets** eliminate network attack vectors
- **Podman + systemd** delivers secure container orchestration
- **Forward Authentication** enables centralized security without application changes
- **Rootful Container Pattern** enables group-based socket access with infrastructure fact sharing

The combination results in an infrastructure that prioritizes security and performance while maintaining operational simplicity and reliability.