rick-infra/roles/valkey/README.md
Joakim 4f8da38ca6 Add Nextcloud cloud storage role with split Redis caching strategy
## New Features

- **Nextcloud Role**: Complete cloud storage deployment using Podman Quadlet
  - FPM variant with Caddy reverse proxy and FastCGI
  - PostgreSQL database via Unix socket
  - Valkey/Redis for app-level caching and file locking
  - Automatic HTTPS with Let's Encrypt via Caddy
  - Dual-root pattern: Caddy serves static assets, FPM handles PHP

- **Split Caching Strategy**: Redis caching WITHOUT Redis sessions
  - Custom redis.config.php template for app-level caching only
  - File-based PHP sessions for stability (avoids session lock issues)
  - Prevents cascading failures from session lock contention
  - Documented in role README with detailed rationale

## Infrastructure Updates

- **Socket Permissions**: Update PostgreSQL and Valkey to mode 777
  - Required for containers that switch users (root → www-data)
  - Nextcloud container loses supplementary groups on user switch
  - Security maintained via password authentication (scram-sha-256, requirepass)
  - Documented socket permission architecture in docs/

- **PostgreSQL**: Export client group GID as fact for dependent roles
- **Valkey**: Export client group GID as fact, update socket fix service

## Documentation

- New: docs/socket-permissions-architecture.md
  - Explains 777 vs 770 socket permission trade-offs
  - Documents why group-based access doesn't work for user-switching containers
  - Provides TCP alternative for stricter security requirements

- Updated: All role READMEs with socket permission notes
- New: Nextcloud README with comprehensive deployment, troubleshooting, and Redis architecture documentation

## Configuration

- host_vars: Add Nextcloud vault variables and configuration
- site.yml: Include Nextcloud role in main playbook

## Technical Details

**Why disable Redis sessions?**

The official Nextcloud container enables Redis session handling via REDIS_HOST env var,
which causes severe performance issues:

1. Session lock contention under high concurrency (browser parallel asset requests)
2. Infinite lock retries (default lock_retries=-1) blocking FPM workers
3. Timeout orphaning: reverse proxy kills connection, worker keeps lock
4. Worker pool exhaustion: all 5 default workers blocked on same session lock
5. Cascading failure: new requests queue, more timeouts, more orphaned locks

Solution: Use file-based sessions (reliable, fast for single-server) while keeping
Redis for distributed cache and transactional file locking via custom config file.

This provides optimal performance without the complexity of Redis session debugging.

Tested: Fresh deployment on arch-vps (69.62.119.31)
Domain: https://cloud.jnss.me/
2025-12-14 22:07:08 +01:00


# Valkey Infrastructure Role
This role provides Valkey as shared infrastructure for the rick-infra project, following the same patterns established by the PostgreSQL role.
## Overview
**Valkey** is a high-performance data structure store used as a database, cache, and message broker. It is an open source fork of Redis, branched from Redis OSS 7.2, that keeps the Redis protocol, command set, and client compatibility while continuing development independently.
Valkey is deployed as a host-level service that multiple applications can use for caching, sessions, and data storage. Each application configures its own Valkey database number and connection parameters.
## Why Valkey?
- **Redis-compatible**: Drop-in replacement for Redis OSS 7.2 and earlier, using the same commands and protocol
- **Open source**: BSD-licensed and developed under the Linux Foundation
- **Performance**: Actively optimized upstream, including multi-threaded I/O improvements in Valkey 8
- **Arch Linux default**: Arch Linux ships Valkey (package `valkey`) in its official repositories in place of Redis
- **Future-proof**: Active development and community support
## Features
- **Security-focused**: Localhost-only binding, password authentication, disabled dangerous commands
- **Systemd integration**: Native systemd service management with security hardening
- **Multi-application support**: 16 databases available for different services
- **Performance optimized**: Conservative memory limits and persistence settings
- **Infrastructure pattern**: Matches PostgreSQL role architecture
- **Redis compatibility**: Applications can use standard Redis clients and commands
## Database Allocation
Applications should use different Valkey database numbers:
- **Database 0**: Reserved for system/testing use
- **Database 1**: Authentik (sessions, cache)
- **Database 2**: Nextcloud (file locking, cache; PHP sessions stay file-based)
- **Database 3+**: Available for additional services
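
The allocation above can be captured in a small lookup helper so that scripts do not hard-code database numbers. This is only a sketch: `valkey_db_for` is a hypothetical function and is not part of the role.

```bash
# Hypothetical helper: map an application name to its Valkey database number,
# mirroring the allocation list in this README.
valkey_db_for() {
  case "$1" in
    system|testing) echo 0 ;;
    authentik)      echo 1 ;;
    nextcloud)      echo 2 ;;
    *)
      echo "no database allocated for '$1'" >&2
      return 1
      ;;
  esac
}

# Example: open a CLI session against Nextcloud's database
# valkey-cli -h 127.0.0.1 -p 6379 -n "$(valkey_db_for nextcloud)"
```

Failing loudly on unknown names keeps a new service from silently sharing database 0 with test data.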
## Configuration
### Required Variables
```yaml
vault_valkey_password: "your-secure-valkey-password"
```
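Because the socket and TCP listener both rely on this password, it is worth generating it rather than choosing it by hand. A minimal sketch, assuming a Linux host with `/dev/urandom`; `generate_valkey_password` is a hypothetical name, not something the role provides.

```bash
# Print a random 64-character hex string (32 bytes read from /dev/urandom).
# Hex output avoids characters that would need quoting in YAML or valkey.conf.
generate_valkey_password() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Example: encrypt the generated value for use as a vault variable
# ansible-vault encrypt_string "$(generate_valkey_password)" --name vault_valkey_password
```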
### Optional Overrides
```yaml
# Service management
valkey_service_enabled: true
valkey_service_state: "started"
# Network configuration
valkey_bind: "127.0.0.1"
valkey_port: 6379
# Memory management
valkey_maxmemory: "256mb"
valkey_maxmemory_policy: "allkeys-lru"
# Security hardening
valkey_systemd_security: true
```
## Application Integration
Applications can connect to Valkey using either Valkey-specific or Redis-compatible patterns:
### Valkey Environment Variables (Recommended)
```yaml
VALKEY_HOST: "{{ valkey_bind }}" # Must match the address Valkey listens on (127.0.0.1 by default)
VALKEY_PORT: "6379"
VALKEY_PASSWORD: "{{ vault_valkey_password }}"
VALKEY_DB: "1" # Unique database number per application
```
### Redis-Compatible Environment Variables (Also Supported)
```yaml
REDIS_HOST: "{{ valkey_bind }}" # Must match the address Valkey listens on (127.0.0.1 by default)
REDIS_PORT: "6379"
REDIS_PASSWORD: "{{ vault_valkey_password }}"
REDIS_DB: "1" # Unique database number per application
```
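A wrapper script that has to work with either naming scheme can prefer the `VALKEY_*` variables and fall back to `REDIS_*`. The sketch below assumes that precedence and the defaults shown in its comments; `valkey_cli_args` is a hypothetical helper.

```bash
# Resolve connection settings, preferring VALKEY_* and falling back to REDIS_*.
# The final defaults (127.0.0.1, 6379, database 0) are assumptions of this sketch.
valkey_cli_args() {
  local host="${VALKEY_HOST:-${REDIS_HOST:-127.0.0.1}}"
  local port="${VALKEY_PORT:-${REDIS_PORT:-6379}}"
  local db="${VALKEY_DB:-${REDIS_DB:-0}}"
  # "--" stops printf from treating the leading "-h" as an option.
  printf -- '-h %s -p %s -n %s\n' "$host" "$port" "$db"
}

# Example (word splitting of the helper output is intentional):
# valkey-cli $(valkey_cli_args) -a "$VALKEY_PASSWORD" ping
```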
### Connection Example
```bash
# Replace "password" with the value of vault_valkey_password
# Using redis-cli (Redis-compatible)
redis-cli -h 127.0.0.1 -p 6379 -a password -n 1
# Using valkey-cli (native Valkey client)
valkey-cli -h 127.0.0.1 -p 6379 -a password -n 1
```
## Redis Compatibility
Valkey is compatible with Redis OSS 7.2 and earlier:
- **Same commands**: Commands available in Redis OSS 7.2 behave the same way
- **Same protocols**: RESP (Redis Serialization Protocol) supported
- **Same client libraries**: Existing Redis client libraries connect without modification
- **Same configuration format**: `valkey.conf` uses the same directive syntax as `redis.conf`
- **Same data types**: All Redis OSS 7.2 data types supported
## Security
- **Network isolation**: Binds only to localhost (or Unix socket only)
- **Authentication**: Password protection required
- **Command restrictions**: Dangerous commands disabled
- **Systemd hardening**: Full security restrictions applied
- **File permissions**: Restrictive access to configuration and data
### Unix Socket Permissions
**Current Configuration**: Socket permissions are set to `777` (accessible to every local user)

**Rationale**:
- Allows containers running as any UID to access the socket
- Needed for containers that start as root and switch to unprivileged users (e.g., Nextcloud's www-data), which lose their supplementary groups on the switch
- Security is maintained via password authentication (requirepass)
- Sockets are local-only (not network-exposed)

**Security Considerations**:
- ⚠️ Any local process can connect to the socket
- ✅ A valid password is still required to authenticate
- ✅ Access is limited to processes on the same host (not the network)
- ✅ The password is stored in the Ansible vault

**Alternative Approach (TCP)**:

If you prefer more restrictive socket permissions, you can use TCP instead:
```yaml
# In host_vars
valkey_bind: "127.0.0.1" # Use TCP instead of socket
valkey_port: 6379
valkey_unix_socket_enabled: false # Disable Unix socket
valkey_unix_socket_perm: "770" # Restrict socket to group (if enabled)
# In application configs
# Use: host=127.0.0.1 port=6379
# Instead of: host=/var/run/valkey/valkey.sock
```
This provides the same security level (password-authenticated, localhost-only) but uses TCP instead of Unix sockets.
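
To confirm which mode the socket actually carries after a deployment, a check along these lines may help. It is a sketch that assumes GNU `stat` (as on Arch Linux) and the socket path mentioned above; `path_mode` and `check_valkey_socket` are hypothetical helpers.

```bash
# Print the octal permission bits of a path (GNU stat syntax).
path_mode() {
  stat -c '%a' "$1"
}

# Report whether the Valkey socket exists and carries the expected mode.
# Defaults: the socket path referenced in this README and mode 777.
check_valkey_socket() {
  local sock="${1:-/var/run/valkey/valkey.sock}"
  local want="${2:-777}"
  local have
  if [ ! -S "$sock" ]; then
    echo "missing: $sock is not a Unix socket" >&2
    return 1
  fi
  have="$(path_mode "$sock")"
  if [ "$have" != "$want" ]; then
    echo "mismatch: $sock has mode $have, expected $want" >&2
    return 1
  fi
  echo "ok: $sock has mode $have"
}

# Example: verify the socket, then connect through it
# check_valkey_socket && valkey-cli -s /var/run/valkey/valkey.sock -a password ping
```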
## Dependencies
This is an infrastructure role with no dependencies. Applications that need Valkey should declare this role as a dependency:
```yaml
# roles/your-app/meta/main.yml
dependencies:
- role: valkey
```
## Service Management
```bash
# Service status
sudo systemctl status valkey
# View logs
sudo journalctl -u valkey -f
# Test connectivity
redis-cli -h 127.0.0.1 -p 6379 -a password ping
```
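The connectivity test above can be wrapped in a small health check for use in scripts or after a restart. This is a sketch, not part of the role: `valkey_healthy`, `wait_for_valkey`, and the `VALKEY_CLI` and `VALKEY_PASSWORD` variables are names assumed here.

```bash
# Succeeds only when the server answers PING with PONG.
# VALKEY_CLI lets callers substitute redis-cli or another client binary.
valkey_healthy() {
  local cli="${VALKEY_CLI:-valkey-cli}"
  local reply
  reply="$("$cli" -h 127.0.0.1 -p 6379 -a "${VALKEY_PASSWORD:-}" --no-auth-warning ping 2>/dev/null)" || return 1
  [ "$reply" = "PONG" ]
}

# Retry the health check, e.g. right after `systemctl restart valkey`.
# Arguments: number of attempts (default 5), delay in seconds (default 1).
wait_for_valkey() {
  local attempts="${1:-5}"
  local delay="${2:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    valkey_healthy && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

Note that a password passed with `-a` is visible in the process list while the command runs, so this form suits interactive checks better than long-running jobs.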
## Monitoring
Valkey status is reported during deployment and can be monitored through:
- **systemctl**: Service health and status
- **journald**: Centralized logging
- **Redis CLI**: Direct connectivity testing using standard Redis tools
- **Application logs**: Connection status from applications
## File Locations
- **Configuration**: `/etc/valkey/valkey.conf`
- **Data directory**: `/var/lib/valkey`
- **Systemd override**: `/etc/systemd/system/valkey.service.d/override.conf`
- **Logs**: `journalctl -u valkey`
## Migration from Redis
If migrating from Redis:
1. **Data compatibility**: Valkey can read data files written by Redis OSS 7.2 and earlier; files from newer Redis releases may use a format Valkey does not load
2. **Configuration**: Most Redis configurations work without changes
3. **Applications**: No application changes required due to protocol compatibility
4. **Monitoring**: Same Redis monitoring tools work with Valkey
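
Before copying a Redis dump into `/var/lib/valkey`, it can be useful to check which RDB format version it uses. RDB files begin with the ASCII magic `REDIS` followed by a four-digit version. The sketch below reads that header; the default limit of 11 is an assumption (the version understood to be written by Redis OSS 7.2) and should be checked against the Valkey release notes. Both function names are hypothetical.

```bash
# Print the four-digit RDB version field from a dump file's 9-byte header,
# or fail if the file does not start with the REDIS magic.
rdb_version() {
  local header
  header="$(head -c 9 "$1" | tr -d '\0')"
  case "$header" in
    REDIS[0-9][0-9][0-9][0-9]) echo "${header#REDIS}" ;;
    *)
      echo "not an RDB file with a REDIS header: $1" >&2
      return 1
      ;;
  esac
}

# Succeed if the dump's version is at or below the given limit.
# The default limit (11) is an assumption; pass your own as the second argument.
rdb_loadable() {
  local version
  version="$(rdb_version "$1")" || return 1
  [ "$version" -le "${2:-11}" ]
}

# Example:
# rdb_loadable /var/lib/redis/dump.rdb && echo "dump looks loadable"
```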
## Notes
This role follows the rick-infra infrastructure pattern where foundational services (Valkey, PostgreSQL) are provided as host-level services, and applications configure their own usage patterns rather than managing separate instances.

**Arch Linux Integration**: The role works with Arch Linux's package system, which ships Valkey (package `valkey`) in its official repositories in place of Redis.