- Enable IP forwarding in security playbook (net.ipv4.ip_forward = 1)
- Add podman network firewall rules to fix container DNS/HTTPS access
- Implement systemd timer for reliable Nextcloud background job execution
- Add database optimization tasks (indices, bigint conversion, mimetypes)
- Configure maintenance window (04:00 UTC) and phone region (NO)
- Add security headers (X-Robots-Tag, X-Permitted-Cross-Domain-Policies)
- Create Nextcloud removal playbook for clean uninstall
- Fix nftables interface matching (podman0 vs podman+)

Root cause: nftables FORWARD chain blocked container egress traffic
Solution: explicit firewall rules for the podman0 bridge interface
Nextcloud Cloud Storage Role
Self-contained Nextcloud deployment using Podman Quadlet with FPM, PostgreSQL database, and Valkey cache via Unix sockets.
Features
- Container: Single Nextcloud FPM container via Podman Quadlet
- Database: Self-managed PostgreSQL database via Unix socket
- Cache: Valkey (Redis-compatible) for file locking and caching
- Web Server: Caddy reverse proxy with FastCGI and automatic HTTPS
- Security: Group-based socket access, separated data/config volumes
- Size: ~320MB FPM image (vs 1.1GB Apache variant)
Architecture
Internet → Caddy (HTTPS:443) → FastCGI → Nextcloud FPM Container (127.0.0.1:9000)
                ↓                                        ↓
   serves static files from                     PostgreSQL (socket)
   /opt/nextcloud/html                          Valkey (socket)
Volume Layout
/opt/nextcloud/
├── html/ # Application code (755 - readable by Caddy for static files)
├── data/ # User files (700 - private to container)
├── config/ # Config with secrets (700 - private to container)
├── custom_apps/ # Installed apps (755 - readable)
└── .env # Environment variables (600)
Security Model:
- Caddy serves static assets (CSS/JS/images) directly from /opt/nextcloud/html
- Caddy cannot access /data or /config (mode 700)
- User files are only served through authenticated PHP requests via FPM
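For illustration, a minimal Ansible sketch of how these directory modes could be enforced (the nextcloud owner/group and the task wording are assumptions, not necessarily how this role's tasks are written):

- name: Create Nextcloud volume directories with the documented modes
  file:
    path: "{{ item.path }}"
    state: directory
    owner: nextcloud               # assumption: host user that owns the volumes
    group: nextcloud
    mode: "{{ item.mode }}"
  loop:
    - { path: /opt/nextcloud/html, mode: "0755" }         # readable by Caddy
    - { path: /opt/nextcloud/data, mode: "0700" }         # private to the container
    - { path: /opt/nextcloud/config, mode: "0700" }       # private to the container
    - { path: /opt/nextcloud/custom_apps, mode: "0755" }  # readable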
Dependencies
- postgresql role (infrastructure)
- valkey role (infrastructure)
- caddy role (web server)
- podman role (container runtime)
Variables
See defaults/main.yml for all configurable variables.
Required Vault Variables
Define these in your host_vars/ with ansible-vault:
vault_nextcloud_db_password: "secure-database-password"
vault_nextcloud_admin_password: "secure-admin-password"
vault_valkey_password: "secure-valkey-password"
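If the role maps these onto non-vault variables (a common pattern; the variable names below are assumptions), the host_vars wiring might look like:

# host_vars/<host>.yml (hypothetical variable names)
nextcloud_db_password: "{{ vault_nextcloud_db_password }}"
nextcloud_admin_password: "{{ vault_nextcloud_admin_password }}"
valkey_password: "{{ vault_valkey_password }}"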
Key Variables
# Domain
nextcloud_domain: "cloud.jnss.me"
# Admin user
nextcloud_admin_user: "admin"
# Database
nextcloud_db_name: "nextcloud"
nextcloud_db_user: "nextcloud"
# Cache (use different DB number per service)
nextcloud_valkey_db: 2 # Authentik uses 1
# PHP limits
nextcloud_php_memory_limit: "512M"
nextcloud_php_upload_limit: "512M"
Deployment Strategy
This role uses a two-phase deployment approach to work correctly with the Nextcloud container's initialization process:
Phase 1: Container Initialization (automatic)
- Create empty directories for volumes
- Deploy environment configuration (.env; a sketch follows after this list)
- Start Nextcloud container
- Container entrypoint detects first-time setup (no version.php)
- Container copies Nextcloud files to /var/www/html/
- Container runs occ maintenance:install with PostgreSQL
- Installation creates config.php with database credentials
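As a hedged sketch of the environment file from step two, here is what the deployed .env could contain, using the official image's documented POSTGRES_*, NEXTCLOUD_* and PHP_* variables (the socket path and the inline copy task are assumptions; the role itself renders a nextcloud.env.j2 template):

- name: Deploy Nextcloud environment file (sketch)
  copy:
    dest: /opt/nextcloud/.env
    mode: "0600"
    content: |
      # Database over the PostgreSQL Unix socket directory
      POSTGRES_HOST=/var/run/postgresql
      POSTGRES_DB={{ nextcloud_db_name }}
      POSTGRES_USER={{ nextcloud_db_user }}
      POSTGRES_PASSWORD={{ vault_nextcloud_db_password }}
      # First-run admin account, consumed by the automatic occ maintenance:install
      NEXTCLOUD_ADMIN_USER={{ nextcloud_admin_user }}
      NEXTCLOUD_ADMIN_PASSWORD={{ vault_nextcloud_admin_password }}
      NEXTCLOUD_TRUSTED_DOMAINS={{ nextcloud_trusted_domains }}
      # PHP limits
      PHP_MEMORY_LIMIT={{ nextcloud_php_memory_limit }}
      PHP_UPLOAD_LIMIT={{ nextcloud_php_upload_limit }}
      # REDIS_HOST is deliberately not set (see the caching section below)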
Phase 2: Custom Configuration (automatic)
- Ansible waits for occ status to report installed: true
- Ansible deploys custom redis.config.php (overwrites default)
- Container restart applies custom configuration (a task sketch follows below)
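A minimal sketch of what Phase 2 can look like as Ansible tasks (the task names, handler, template name, and retry values are assumptions; only the occ command and the config path come from this role's description):

- name: Wait for the automatic installation to finish
  command: podman exec --user www-data nextcloud php occ status
  register: occ_status
  retries: 30
  delay: 10
  until: >-
    occ_status.rc == 0 and
    "installed: true" in occ_status.stdout
  changed_when: false

- name: Deploy custom Redis/Valkey cache configuration
  template:
    src: redis.config.php.j2        # assumed template name
    dest: /opt/nextcloud/config/redis.config.php
  notify: Restart nextcloud         # assumed handler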
Why this order?
The Nextcloud container's entrypoint uses version.php as a marker to determine if installation is needed. If you deploy any files into /opt/nextcloud/config/ before the container starts, the initialization process fails:
- Container copies files including version.php
- Entrypoint sees version.php exists → assumes already installed
- Skips running occ maintenance:install
- Result: empty config.php, 503 errors
By deploying custom configs after installation completes, we:
- ✅ Allow the container's auto-installation to run properly
- ✅ Override specific configs (like Redis) after the fact
- ✅ Maintain idempotency (subsequent runs just update configs)
See the official Nextcloud Docker documentation for more details on the auto-configuration process.
Usage
Include in Playbook
- role: nextcloud
tags: ['nextcloud', 'cloud', 'storage']
Deploy
# Deploy Nextcloud role
ansible-playbook -i inventory/hosts.yml site.yml --tags nextcloud --ask-vault-pass
# Deploy only infrastructure dependencies
ansible-playbook -i inventory/hosts.yml site.yml --tags postgresql,valkey,caddy
Verification
After deployment:
- Access Nextcloud: https://cloud.jnss.me
- Check service status:
  ssh root@arch-vps
  systemctl status nextcloud
  podman ps | grep nextcloud
- View logs:
  # Container logs
  journalctl -u nextcloud -f
  podman logs nextcloud
  # Caddy logs
  tail -f /var/log/caddy/nextcloud.log
- Verify socket access:
  # Check group memberships
  id nextcloud   # Should show: postgres-clients, valkey-clients
  # Check socket permissions
  ls -la /var/run/postgresql/.s.PGSQL.5432
  ls -la /var/run/valkey/valkey.sock
Maintenance
OCC Command (Nextcloud CLI)
Run Nextcloud's OCC command-line tool:
# General syntax
podman exec --user www-data nextcloud php occ <command>
# Examples
podman exec --user www-data nextcloud php occ status
podman exec --user www-data nextcloud php occ app:list
podman exec --user www-data nextcloud php occ maintenance:mode --on
podman exec --user www-data nextcloud php occ files:scan --all
Update Nextcloud
When the container starts with a newer image, its entrypoint runs Nextcloud's upgrade routine automatically; restart the unit to apply it:
systemctl restart nextcloud
Or pull specific version:
# In host_vars or defaults
nextcloud_version: "32-fpm" # Pin to major version
# Or
nextcloud_version: "32.0.3-fpm" # Pin to exact version
Backup Strategy
Key directories to backup:
- User data: /opt/nextcloud/data
- Configuration: /opt/nextcloud/config
- Database: PostgreSQL nextcloud database
- Custom apps: /opt/nextcloud/custom_apps (optional)
Example backup script:
#!/bin/bash
# Enable maintenance mode
podman exec --user www-data nextcloud php occ maintenance:mode --on
# Backup data and config
tar -czf nextcloud-data-$(date +%Y%m%d).tar.gz /opt/nextcloud/data /opt/nextcloud/config
# Backup database
sudo -u postgres pg_dump nextcloud > nextcloud-db-$(date +%Y%m%d).sql
# Disable maintenance mode
podman exec --user www-data nextcloud php occ maintenance:mode --off
Performance Tuning
Adjust PHP limits in host_vars:
nextcloud_php_memory_limit: "1G" # For large files
nextcloud_php_upload_limit: "10G" # For large uploads
Redis/Valkey Caching Architecture
This role uses a split caching strategy for optimal performance and stability:
PHP Sessions: File-based (default PHP session handler)
- Location: /var/www/html/data/sessions/
- Why: Redis session locking can cause cascading failures under high concurrency
- Performance: Excellent for single-server deployments
Nextcloud Application Cache: Redis/Valkey
- memcache.local: APCu (in-memory PHP data cache)
- memcache.distributed: Redis (shared cache)
- memcache.locking: Redis (transactional file locking)
- Configuration: via the custom redis.config.php template (see the sketch below)
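For illustration, and complementing the Phase 2 task sketch above, here is what the rendered drop-in could contain, expressed as an Ansible copy task (the ownership and the Valkey socket path reuse values shown elsewhere in this README; the exact file layout is an assumption):

- name: Deploy redis.config.php cache drop-in (sketch)
  copy:
    dest: /opt/nextcloud/config/redis.config.php
    owner: "33"    # assumption: www-data (UID 33) inside the container
    group: "33"
    mode: "0640"
    content: |
      <?php
      $CONFIG = array(
        'memcache.local'       => '\OC\Memcache\APCu',
        'memcache.distributed' => '\OC\Memcache\Redis',
        'memcache.locking'     => '\OC\Memcache\Redis',
        'redis' => array(
          'host'     => '/var/run/valkey/valkey.sock',
          'port'     => 0,
          'password' => '{{ vault_valkey_password }}',
        ),
      );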
Why not Redis sessions?
The official Nextcloud Docker image enables Redis session handling when REDIS_HOST is set. However, this can cause severe performance issues:
- Session lock contention: Multiple parallel requests (browser loading CSS/JS/images) compete for the same session lock
- Infinite retries: the default lock_retries = -1 means workers block forever
- Timeout orphaning: when the reverse proxy times out, FPM workers keep running and hold locks
- Worker exhaustion: Limited FPM workers (default 5) all become blocked
- Cascading failure: New requests queue, timeouts accumulate, locks orphan
This role disables Redis sessions by not setting REDIS_HOST in the environment, while still providing Redis caching via a custom redis.config.php that is deployed independently.
If you need Redis sessions (e.g., multi-server setup with session sharing), you must:
- Enable REDIS_HOST in nextcloud.env.j2
- Add a custom PHP ini file with proper lock parameters (see the sketch below):
  - redis.session.lock_expire = 30 (locks expire after 30 seconds)
  - redis.session.lock_retries = 100 (max 100 retries, not infinite)
  - redis.session.lock_wait_time = 50000 (50 ms between retries)
- Mount the ini file with a zz- prefix so it loads after the entrypoint's redis-session.ini
- Increase FPM workers significantly (15-20+)
- Monitor for orphaned session locks
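If you do go that route, a hedged sketch of the override as an Ansible task (the host path is hypothetical; the ini directives are the phpredis session lock settings listed above):

- name: Deploy Redis session lock overrides (only if Redis sessions are enabled)
  copy:
    dest: /opt/nextcloud/php/zz-redis-session-locks.ini   # hypothetical host path
    mode: "0644"
    content: |
      ; Bound the session lock behaviour instead of retrying forever
      redis.session.lock_expire = 30
      redis.session.lock_retries = 100
      redis.session.lock_wait_time = 50000

The file would then be mounted into the container's PHP conf.d directory (for the official image, /usr/local/etc/php/conf.d/) so the zz- prefix sorts after the entrypoint's redis-session.ini.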
Troubleshooting
Container won't start
# Check container logs
journalctl -u nextcloud -n 50
podman logs nextcloud
# Check systemd unit
systemctl status nextcloud
Permission errors
# Verify user groups
id nextcloud
# Should be in: postgres-clients, valkey-clients
# If not, re-run user.yml tasks:
ansible-playbook -i inventory/hosts.yml site.yml --tags nextcloud,user
Database connection errors
# Test PostgreSQL socket
sudo -u nextcloud psql -h /var/run/postgresql -U nextcloud -d nextcloud
# Check socket exists and permissions
ls -la /var/run/postgresql/.s.PGSQL.5432
Caddy FastCGI errors
# Check Caddy can read app files
sudo -u caddy ls -la /opt/nextcloud/html
# Verify FPM is listening
ss -tlnp | grep 9000
# Test TCP connectivity to FPM (expect the connection to succeed but no valid
# HTTP response, since FPM speaks FastCGI rather than HTTP)
curl -v http://127.0.0.1:9000
"Trusted domain" errors
Add domains to nextcloud_trusted_domains:
nextcloud_trusted_domains: "cloud.jnss.me localhost 69.62.119.31"
Or add via OCC:
podman exec --user www-data nextcloud php occ config:system:set trusted_domains 1 --value=cloud.jnss.me
Integration with Authentik SSO
To integrate Nextcloud with Authentik for SSO, see the Authentik documentation for OAuth2/OIDC provider setup.
Security Notes
- User data (/opt/nextcloud/data) is mode 700 - only the container can access it
- Config (/opt/nextcloud/config) is mode 700 - contains database passwords
- Application files (/opt/nextcloud/html) are mode 755 - Caddy can read them for static files
- All traffic is HTTPS via Caddy with automatic Let's Encrypt certificates
- Database and cache connections use Unix sockets (no TCP exposure)
- Container runs as root initially, then switches to www-data (UID 33) for PHP-FPM
Socket Access Pattern
Nextcloud uses a different access pattern than other rick-infra services due to how the official Nextcloud container works:
How it works:
- Container starts as root (UID 0)
- Entrypoint runs as root to write PHP configuration files
- Entrypoint switches to www-data (UID 33) for PHP-FPM process
- www-data accesses PostgreSQL and Valkey via Unix sockets
Why 777 socket permissions are needed:
- The Nextcloud container cannot use --group-add effectively because:
  - --group-add only adds groups to the initial user (root)
  - When the container switches from root to www-data, supplementary groups are lost
- www-data (UID 33, GID 33) ends up with no access to group-restricted sockets
- Infrastructure sockets use mode 777 to allow access by any UID
- Security is maintained via password authentication (PostgreSQL: scram-sha-256, Valkey: requirepass)
- Sockets are local-only (not network-exposed)
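The socket-based default could therefore look roughly like this in host_vars (the variable names mirror the TCP alternative below and are assumptions about the infrastructure roles; check their READMEs for the real ones):

# host_vars (hypothetical variable names)
postgresql_unix_socket_permissions: "0777"   # world-accessible socket, auth via scram-sha-256
valkey_unix_socket_enabled: true
valkey_unix_socket_permissions: "0777"       # world-accessible socket, auth via requirepass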
Alternative (TCP): If you prefer group-based socket access (770), you can configure PostgreSQL and Valkey to use TCP instead:
# In host_vars
postgresql_listen_addresses: "127.0.0.1"
postgresql_unix_socket_permissions: "0770" # Restrict to group
valkey_bind: "127.0.0.1"
valkey_port: 6379
valkey_unix_socket_enabled: false
# In Nextcloud env
POSTGRES_HOST=127.0.0.1
POSTGRES_PORT=5432
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
This provides the same security level (password-authenticated, localhost-only) but uses TCP instead of Unix sockets. The trade-off is slightly lower performance compared to Unix sockets.
See infrastructure role documentation (PostgreSQL and Valkey READMEs) for more details on this architectural decision.