Migrate to rootful container architecture with infrastructure fact pattern

Major architectural change from rootless user services to system-level (rootful) containers to enable group-based Unix socket access for containerized applications.

Infrastructure Changes:
- PostgreSQL: Export postgres-clients group GID as Ansible fact
- Valkey: Export valkey-clients group GID as Ansible fact
- Valkey: Add socket-fix service to maintain correct socket group ownership
- Both: Set socket directories to 770 with client group ownership

Authentik Role Refactoring:
- Remove rootless container configuration (subuid/subgid, lingering, user systemd)
- Deploy Quadlet files to /etc/containers/systemd/ (system-level)
- Use dynamic GID facts in container PodmanArgs (--group-add)
- Simplify user creation to system user with infrastructure group membership
- Update handlers for system scope service management
- Remove unnecessary container security options (no user namespace isolation)

Container Template Changes:
- Pod: Remove --userns args, change WantedBy to multi-user.target
- Containers: Replace Annotation with PodmanArgs using dynamic GIDs
- Remove /dev/shm mounts and SecurityLabelDisable (not needed for rootful)
- Change WantedBy to multi-user.target for system services

Documentation Updates:
- Add ADR-005: Rootful Containers with Infrastructure Fact Pattern
- Update ADR-003: Podman + systemd for system-level deployment
- Update authentik-deployment-guide.md for system scope commands
- Update service-integration-guide.md with rootful pattern examples
- Document discarded rootless approach and rationale

Why Rootful Succeeds:
- Direct UID/GID mapping preserves supplementary groups
- Container process groups match host socket group ownership
- No user namespace remapping breaking permissions

Why Rootless Failed (Discarded):
- User namespace UID/GID remapping broke group-based socket access
- Supplementary groups remapped into subgid range didn't match socket ownership
- Even with --userns=host and keep_original_groups, permissions failed

Pattern Established:
- Infrastructure roles create client groups and export GID facts
- Application roles validate facts and consume in container templates
- Rootful containers run as dedicated users with --group-add for socket access
- System-level deployment provides standard systemd service management

Deployment Validated:
- Services in /system.slice/ ✓
- Process groups: 961 (valkey-clients), 962 (postgres-clients), 966 (authentik) ✓
- Socket permissions: 770 with client groups ✓
- HTTP endpoint responding ✓
@@ -6,20 +6,24 @@ This guide explains how to add new containerized services to rick-infra with Pos
Rick-infra provides a standardized approach for containerized services to access infrastructure services through Unix sockets, maintaining security while providing optimal performance.

**Architecture**: Services are deployed as **system-level (rootful) containers** running as dedicated users with group-based access to infrastructure sockets. Infrastructure roles (PostgreSQL, Valkey) export client group GIDs as Ansible facts, which application roles consume for dynamic container configuration.

**Note**: A previous rootless approach was evaluated but discarded due to user namespace UID/GID remapping breaking group-based socket permissions. See [ADR-005](architecture-decisions.md#adr-005-rootful-containers-with-infrastructure-fact-pattern) for details.

## Architecture Pattern

```
┌─────────────────────────────────────────────────────────────┐
│ Application Service (Podman Container) │
│ systemd System Service (/system.slice/) │
│ │
│ ┌─────────────────┐ │
│ │ Your Container │ │
│ │ UID: service │ (host user namespace) │
│ │ Groups: service,│ │
│ │ postgres, │ (supplementary groups preserved) │
│ │ valkey │ │
│ │ User: UID:GID │ (dedicated system user) │
│ │ Groups: GID, │ │
│ │ 961,962 │ (postgres-clients, valkey-clients) │
│ └─────────────────┘ │
│ │ │
│ │ PodmanArgs=--group-add 962 --group-add 961 │
│ └─────────────────────┐ │
└─────────────────────────────────│───────────────────────────┘
│
@@ -28,38 +32,42 @@ Rick-infra provides a standardized approach for containerized services to access
│ │
│ PostgreSQL Unix Socket │
│ /var/run/postgresql/ │
│ Owner: postgres:postgres- │
│ clients (GID 962) │
│ │
│ Valkey Unix Socket │
│ /var/run/valkey/ │
│ /var/run/valkey/ │
│ Owner: valkey:valkey-clients │
│ (GID 961) │
└──────────────────────────────┘
```

## Prerequisites

Your service must be deployed as:
1. **Systemd user service** (via Quadlet)
2. **Dedicated system user**
3. **Podman container** (rootless)
1. **System-level systemd service** (via Quadlet)
2. **Dedicated system user** with infrastructure group membership
3. **Podman container** (rootful, running as dedicated user)

## Step 1: User Setup

Create a dedicated system user for your service and add it to infrastructure groups:

```yaml
- name: Create service group
  group:
    name: myservice
    system: true

- name: Create service user
  user:
    name: myservice
    group: myservice
    groups: [postgres-clients, valkey-clients]
    system: true
    shell: /bin/false
    shell: /bin/bash
    home: /opt/myservice
    create_home: true

- name: Add service user to infrastructure groups
  user:
    name: myservice
    groups:
      - postgres # For PostgreSQL access
      - valkey # For Valkey/Redis access
    append: true
```

@@ -72,33 +80,37 @@ Create a dedicated system user for your service and add it to infrastructure gro
Description=My Service Pod

[Pod]
PublishPort=127.0.0.1:8080:8080
PodmanArgs=--userns=host
PublishPort=0.0.0.0:8080:8080
ShmSize=256m

[Service]
Restart=always
TimeoutStartSec=900

[Install]
WantedBy=default.target
WantedBy=multi-user.target
```

**Key Points**:
- `--userns=host` preserves host user namespace
- Standard port publishing for network access
- No user namespace arguments needed (rootful containers)
- `WantedBy=multi-user.target` for system-level services
- `ShmSize` for shared memory if needed by application

### Container Configuration (`myservice.container`)

```ini
[Unit]
Description=My Service Container
After=myservice-pod.service
Requires=myservice-pod.service

[Container]
ContainerName=myservice
Image=my-service:latest
Pod=myservice.pod
EnvironmentFile=/opt/myservice/.env
User={{ service_uid }}:{{ service_gid }}
Annotation=run.oci.keep_original_groups=1
PodmanArgs=--group-add {{ postgresql_client_group_gid }} --group-add {{ valkey_client_group_gid }}

# Volume mounts for sockets
Volume=/var/run/postgresql:/var/run/postgresql:Z
@@ -112,15 +124,19 @@ Exec=my-service

[Service]
Restart=always
TimeoutStartSec=300

[Install]
WantedBy=default.target
WantedBy=multi-user.target
```

**Key Points**:
- `Annotation=run.oci.keep_original_groups=1` preserves supplementary groups
- `PodmanArgs=--group-add` uses dynamic GID facts from infrastructure roles
- Mount socket directories with `:Z` for SELinux relabeling
- Use host UID/GID for the service user
- `WantedBy=multi-user.target` for system-level services

**Note**: The `postgresql_client_group_gid` and `valkey_client_group_gid` facts are exported by infrastructure roles and consumed in container templates.
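
As an illustration of how the GID facts end up on the Podman command line, the following shell sketch assembles `--group-add` arguments from a list of GIDs. The function name `group_add_args` is hypothetical and not part of rick-infra; the GIDs 962 and 961 are the values shown in the architecture diagram and may differ on other hosts.

```bash
#!/bin/sh
# Hypothetical helper: print one "--group-add GID" pair per argument,
# joined by single spaces, mirroring the PodmanArgs line in the template.
group_add_args() {
    out=""
    for gid in "$@"; do
        # Reject anything that is not a plain decimal GID.
        case "$gid" in
            ''|*[!0-9]*) echo "invalid GID: $gid" >&2; return 1 ;;
        esac
        out="${out:+$out }--group-add $gid"
    done
    printf '%s\n' "$out"
}

# Example with the GIDs from the diagram (postgres-clients, valkey-clients).
group_add_args 962 961
```

In the role itself the Jinja2 template performs this substitution; the sketch only makes the resulting argument string explicit.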

## Step 3: Service Configuration

@@ -158,7 +174,25 @@ REDIS_HOST=unix:///var/run/valkey/valkey.sock
REDIS_DB=2
```

## Step 4: Database Setup
## Step 4: Infrastructure Fact Validation

Before deploying containers, validate that infrastructure facts are available:

```yaml
- name: Validate infrastructure facts are available
  assert:
    that:
      - postgresql_client_group_gid is defined
      - valkey_client_group_gid is defined
    fail_msg: |
      Required infrastructure facts are not available.
      Ensure PostgreSQL and Valkey roles have run and exported client group GIDs.
  tags: [validation]
```

**Why this matters**: Container templates use these facts for `--group-add` arguments. If facts are missing, containers will deploy with incorrect group membership and socket access will fail.
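
A comparable check can be made directly on the host. The sketch below looks up a group's GID in a group database file and fails when a required group is absent. `group_gid` and `require_groups` are hypothetical helper names, and reading `/etc/group` directly is an assumption that holds only for locally defined groups.

```bash
#!/bin/sh
# Hypothetical helper: print the GID of group $1 from a group(5)-format
# file ($2, default /etc/group). Returns non-zero if the group is absent.
group_gid() {
    awk -F: -v name="$1" '
        $1 == name { print $3; found = 1; exit }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/group}"
}

# Hypothetical helper: succeed only if every named group resolves to a GID.
# Usage: require_groups FILE GROUP...
require_groups() {
    file="$1"; shift
    for name in "$@"; do
        if ! group_gid "$name" "$file" >/dev/null; then
            echo "missing group: $name" >&2
            return 1
        fi
    done
}
```

For example, `require_groups /etc/group postgres-clients valkey-clients` would be expected to succeed once both infrastructure roles have run.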

## Step 5: Database Setup

Add database setup tasks to your role:

@@ -189,7 +223,7 @@ Add database setup tasks to your role:
  become_user: postgres
```

## Step 5: Service Role Template
## Step 6: Service Role Template

Create an Ansible role using this pattern:

@@ -237,19 +271,22 @@ If you get permission denied errors:
1. **Check group membership**:
   ```bash
   groups myservice
   # Should show: myservice postgres valkey
   # Should show: myservice postgres-clients valkey-clients
   ```

2. **Verify container annotations**:
2. **Verify container process groups**:
   ```bash
   podman inspect myservice --format='{{.Config.Annotations}}'
   # Should include: run.oci.keep_original_groups=1
   ps aux | grep myservice | head -1 | awk '{print $2}' | \
     xargs -I {} cat /proc/{}/status | grep Groups
   # Should show GIDs matching infrastructure client groups
   ```

3. **Check socket permissions**:
   ```bash
   ls -la /var/run/postgresql/
   # drwxrwx--- postgres postgres-clients
   ls -la /var/run/valkey/
   # drwxrwx--- valkey valkey-clients
   ```
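
The process-group and socket-permission checks above can also be scripted so that they return an exit status instead of requiring a visual comparison. This is a minimal sketch: `status_has_group` and `dir_has_mode` are hypothetical helper names, and `stat -c` assumes a GNU or BusyBox userland.

```bash
#!/bin/sh
# Hypothetical helper: succeed if GID $2 appears on the "Groups:" line of
# a /proc/PID/status-format file ($1).
status_has_group() {
    awk -v gid="$2" '
        $1 == "Groups:" {
            for (i = 2; i <= NF; i++) if ($i == gid) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Hypothetical helper: succeed if directory $1 has exactly the octal mode $2
# (for example 770). Relies on `stat -c`, an assumption about the host.
dir_has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}
```

A possible invocation is `status_has_group "/proc/$(podman inspect --format '{{.State.Pid}}' myservice)/status" 962`, followed by `dir_has_mode /var/run/postgresql 770`; the GID 962 is the example value from the diagram, not a fixed constant.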

### Connection Issues
@@ -283,23 +320,28 @@ If you get permission denied errors:

1. **Security**:
   - Use dedicated system users for each service
   - Limit group memberships to required infrastructure
   - Add users to infrastructure client groups (`postgres-clients`, `valkey-clients`)
   - Use vault variables for secrets
   - Deploy Quadlet files to `/etc/containers/systemd/` (system-level)

2. **Configuration**:
   - Use single URL format for Redis connections
   - Mount socket directories with appropriate SELinux labels
   - Include `run.oci.keep_original_groups=1` annotation
   - Mount socket directories with appropriate SELinux labels (`:Z`)
   - Use dynamic GID facts from infrastructure roles in `PodmanArgs=--group-add`
   - Set `WantedBy=multi-user.target` in Quadlet files

3. **Deployment**:
   - Ensure infrastructure roles run first to export GID facts
   - Validate facts are defined before container deployment
   - Test socket access before container deployment
   - Use proper dependency ordering in playbooks
   - Include database and cache setup tasks

4. **Monitoring**:
   - Monitor socket file permissions
   - Monitor socket file permissions (should be 770 with client group)
   - Check service logs for connection errors
   - Verify group memberships after user changes
   - Verify container process has correct supplementary groups
   - Verify services are in `/system.slice/`
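
The `/system.slice/` check in the monitoring list can be automated with a small predicate on the unit's control group path. `in_system_slice` is a hypothetical helper name; the unit name `myservice.service` follows the examples in this guide and is an assumption about how Quadlet names the generated unit.

```bash
#!/bin/sh
# Hypothetical helper: succeed if a cgroup path lies under /system.slice/.
in_system_slice() {
    case "$1" in
        /system.slice/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

One way to feed it is `in_system_slice "$(systemctl show -p ControlGroup --value myservice.service)"`, which should fail for a unit still running under a user slice.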

## Authentication Integration with Authentik