Add Nextcloud cloud storage role with split Redis caching strategy

## New Features

- **Nextcloud Role**: Complete cloud storage deployment using Podman Quadlet
  - FPM variant with Caddy reverse proxy and FastCGI
  - PostgreSQL database via Unix socket
  - Valkey/Redis for app-level caching and file locking
  - Automatic HTTPS with Let's Encrypt via Caddy
  - Dual-root pattern: Caddy serves static assets, FPM handles PHP

- **Split Caching Strategy**: Redis caching WITHOUT Redis sessions
  - Custom redis.config.php template for app-level caching only
  - File-based PHP sessions for stability (avoids session lock issues)
  - Prevents cascading failures from session lock contention
  - Documented in role README with detailed rationale

## Infrastructure Updates

- **Socket Permissions**: Update PostgreSQL and Valkey to mode 777
  - Required for containers that switch users (root → www-data)
  - Nextcloud container loses supplementary groups on user switch
  - Security maintained via password authentication (scram-sha-256, requirepass)
  - Documented socket permission architecture in docs/
- **PostgreSQL**: Export client group GID as fact for dependent roles
- **Valkey**: Export client group GID as fact, update socket fix service

## Documentation

- New: docs/socket-permissions-architecture.md
  - Explains 777 vs 770 socket permission trade-offs
  - Documents why group-based access doesn't work for user-switching containers
  - Provides TCP alternative for stricter security requirements
- Updated: All role READMEs with socket permission notes
- New: Nextcloud README with comprehensive deployment, troubleshooting, and Redis architecture documentation

## Configuration

- host_vars: Add Nextcloud vault variables and configuration
- site.yml: Include Nextcloud role in main playbook

## Technical Details

**Why disable Redis sessions?**

The official Nextcloud container enables Redis session handling via REDIS_HOST env var, which causes severe performance issues:

1. Session lock contention under high concurrency (browser parallel asset requests)
2. Infinite lock retries (default lock_retries=-1) blocking FPM workers
3. Timeout orphaning: reverse proxy kills connection, worker keeps lock
4. Worker pool exhaustion: all 5 default workers blocked on same session lock
5. Cascading failure: new requests queue, more timeouts, more orphaned locks

Solution: Use file-based sessions (reliable, fast for single-server) while keeping Redis for distributed cache and transactional file locking via custom config file. This provides optimal performance without the complexity of Redis session debugging.

Tested: Fresh deployment on arch-vps (69.62.119.31)
Domain: https://cloud.jnss.me/
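
The socket permission trade-off described under Infrastructure Updates can be illustrated with a small model of the POSIX permission check. This is only a sketch: the helper `can_connect` and all UID/GID values are hypothetical, it ignores root's permission bypass and ACLs, and it assumes (as on Linux) that connecting to a Unix socket requires write permission on the socket file.

```python
import stat


def can_connect(mode, owner_uid, owner_gid, uid, gids):
    """Model the write-permission check applied when connecting to a socket file.

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Hypothetical IDs, chosen for illustration only.
POSTGRES_UID = 70
CLIENT_GID = 2001      # stands in for the exported client group GID
WWW_DATA_UID = 33
WWW_DATA_GID = 33

# Process that still carries the client group as a supplementary group.
with_group = can_connect(
    0o770, POSTGRES_UID, CLIENT_GID, WWW_DATA_UID, {WWW_DATA_GID, CLIENT_GID}
)

# After the container switches to www-data, the supplementary group is gone.
without_group_770 = can_connect(
    0o770, POSTGRES_UID, CLIENT_GID, WWW_DATA_UID, {WWW_DATA_GID}
)
without_group_777 = can_connect(
    0o777, POSTGRES_UID, CLIENT_GID, WWW_DATA_UID, {WWW_DATA_GID}
)

print(with_group, without_group_770, without_group_777)  # True False True
```

Under this model, mode 770 only works while the client group is still attached to the process, which is why the commit falls back to 777 and relies on password authentication instead.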
16
roles/nextcloud/templates/redis-session-override.ini.j2
Normal file
@@ -0,0 +1,16 @@
; Redis Session Lock Override for Nextcloud
; Prevents orphaned session locks from causing infinite hangs
;
; Default Nextcloud container settings:
; redis.session.lock_expire = 0 (locks NEVER expire - causes infinite hangs)
; redis.session.lock_retries = -1 (infinite retries - causes worker exhaustion)
; redis.session.lock_wait_time = 10000 (microseconds, i.e. 10 ms between retries)
;
; These settings ensure locks auto-expire and failed requests don't block workers forever:
; - Locks expire after 30 seconds (prevents orphaned locks)
; - Max 100 retries = 5 seconds total wait time (prevents infinite loops)
; - 50ms wait between retries (reasonable balance)

redis.session.lock_expire = 30
redis.session.lock_retries = 100
redis.session.lock_wait_time = 50000
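
The retry budget stated in the comments above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming `redis.session.lock_wait_time` is given in microseconds (consistent with 50000 being described as 50ms) and that a negative `lock_retries` means "retry forever"; the helper name `max_lock_wait_seconds` is hypothetical.

```python
def max_lock_wait_seconds(lock_retries, lock_wait_time_us):
    """Upper bound on the time one request spends waiting for a session lock."""
    if lock_retries < 0:
        # Assumption: negative retries means the worker never gives up.
        return float("inf")
    return lock_retries * lock_wait_time_us / 1_000_000


# Container defaults as quoted in the template comments.
container_default = max_lock_wait_seconds(-1, 10_000)

# Values set by this override file.
override = max_lock_wait_seconds(100, 50_000)

print(container_default, override)  # inf 5.0
```

With the override, a worker gives up on a contested session after at most about 5 seconds, and `lock_expire = 30` bounds how long an orphaned lock can linger.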