# Nextcloud Cloud Storage Role

Self-contained Nextcloud deployment using Podman Quadlet with FPM, PostgreSQL database, and Valkey cache via Unix sockets.

## Features

- **Container**: Single Nextcloud FPM container via Podman Quadlet
- **Database**: Self-managed PostgreSQL database via Unix socket
- **Cache**: Valkey (Redis-compatible) for file locking and caching
- **Web Server**: Caddy reverse proxy with FastCGI and automatic HTTPS
- **Security**: Group-based socket access, separated data/config volumes
- **Size**: ~320MB FPM image (vs 1.1GB Apache variant)

## Architecture

```
Internet → Caddy (HTTPS:443) → FastCGI → Nextcloud FPM Container (127.0.0.1:9000)
              ↓                                    ↓
    Serves static files                   PostgreSQL (socket)
    from /opt/nextcloud/html              Valkey (socket)
```

### Volume Layout

```
/opt/nextcloud/
├── html/          # Application code (755 - readable by Caddy for static files)
├── data/          # User files (700 - private to container)
├── config/        # Config with secrets (700 - private to container)
├── custom_apps/   # Installed apps (755 - readable)
└── .env           # Environment variables (600)
```

**Security Model**:
- Caddy serves static assets (CSS/JS/images) directly from `/opt/nextcloud/html`
- Caddy cannot access `/data` or `/config` (mode 700)
- User files are only served through authenticated PHP requests via FPM

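The permissions in the layout above can be spot-checked on the host. The following is a minimal sketch, not part of the role: `check_mode` is a hypothetical helper name, and it assumes GNU `stat` (the `-c '%a'` format flag).

```bash
#!/bin/sh
# check_mode PATH EXPECTED_MODE
# Succeeds when PATH has exactly the expected octal permission bits.
# Hypothetical helper; relies on GNU stat's -c '%a' format.
check_mode() (
    actual=$(stat -c '%a' "$1") || exit 2
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 is $actual"
    else
        echo "mismatch: $1 is $actual, expected $2"
        exit 1
    fi
)

# Example (run as root on the host):
#   check_mode /opt/nextcloud/html 755
#   check_mode /opt/nextcloud/data 700
#   check_mode /opt/nextcloud/config 700
```

A mismatch on `data/` or `config/` is the case worth acting on, since those are the directories Caddy must not be able to read.
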
## Dependencies

- `postgresql` role (infrastructure)
- `valkey` role (infrastructure)
- `caddy` role (web server)
- `podman` role (container runtime)

## Variables

See `defaults/main.yml` for all configurable variables.

### Required Vault Variables

Define these in your `host_vars/` with `ansible-vault`:

```yaml
vault_nextcloud_db_password: "secure-database-password"
vault_nextcloud_admin_password: "secure-admin-password"
vault_valkey_password: "secure-valkey-password"
```

### Key Variables

```yaml
# Domain
nextcloud_domain: "cloud.jnss.me"

# Admin user
nextcloud_admin_user: "admin"

# Database
nextcloud_db_name: "nextcloud"
nextcloud_db_user: "nextcloud"

# Cache (use different DB number per service)
nextcloud_valkey_db: 2  # Authentik uses 1

# PHP limits
nextcloud_php_memory_limit: "512M"
nextcloud_php_upload_limit: "512M"
```

## Deployment Strategy

This role uses a **two-phase deployment** approach to work correctly with the Nextcloud container's initialization process:

### Phase 1: Container Initialization (automatic)
1. Create empty directories for volumes
2. Deploy environment configuration (`.env`)
3. Start Nextcloud container
4. Container entrypoint detects first-time setup (no `version.php`)
5. Container copies Nextcloud files to `/var/www/html/`
6. Container runs `occ maintenance:install` with PostgreSQL
7. Installation creates `config.php` with database credentials

### Phase 2: Custom Configuration (automatic)
8. Ansible waits for `occ status` to report `installed: true`
9. Ansible deploys custom `redis.config.php` (overwrites default)
10. Container restart applies custom configuration

**Why this order?**

The Nextcloud container's entrypoint uses `version.php` as a marker to determine if installation is needed. If you deploy any files into `/opt/nextcloud/config/` before the container starts, the initialization process fails:

- Container copies files including `version.php`
- Entrypoint sees `version.php` exists → assumes already installed
- Skips running `occ maintenance:install`
- Result: Empty `config.php`, 503 errors

By deploying custom configs **after** installation completes, we:
- ✅ Allow the container's auto-installation to run properly
- ✅ Override specific configs (like Redis) after the fact
- ✅ Maintain idempotency (subsequent runs just update configs)

See the official [Nextcloud Docker documentation](https://github.com/nextcloud/docker#auto-configuration-via-environment-variables) for more details on the auto-configuration process.

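The wait at the start of Phase 2 can be sketched as a small polling loop. This is an illustration only, not the role's actual Ansible task: the function names, the 30-attempt limit, and the 10-second delay are assumptions, and the check simply matches the `installed: true` text in the `occ status` output.

```bash
#!/bin/sh
# Reads `occ status` output on stdin; succeeds if it reports installed: true.
status_reports_installed() {
    grep -q 'installed: true'
}

# wait_for_install [ATTEMPTS] [DELAY_SECONDS]
# Polls the container until installation has finished or attempts run out.
# The defaults (30 attempts, 10 s apart) are illustrative assumptions.
wait_for_install() (
    attempts=${1:-30}
    delay=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if podman exec --user www-data nextcloud php occ status 2>/dev/null \
            | status_reports_installed; then
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Nextcloud did not report installed: true after $attempts attempts" >&2
    exit 1
)
```

Only once `wait_for_install` succeeds is it safe to deploy `redis.config.php` and restart the container, which is the ordering Phase 2 depends on.
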
## Usage

### Include in Playbook

```yaml
- role: nextcloud
  tags: ['nextcloud', 'cloud', 'storage']
```

### Deploy

```bash
# Deploy Nextcloud role
ansible-playbook -i inventory/hosts.yml site.yml --tags nextcloud --ask-vault-pass

# Deploy only infrastructure dependencies
ansible-playbook -i inventory/hosts.yml site.yml --tags postgresql,valkey,caddy
```

## Verification

After deployment:

1. **Access Nextcloud**:
   ```bash
   # Expect an HTTP response from Caddy (or open the URL in a browser)
   curl -I https://cloud.jnss.me
   ```

2. **Check service status**:
   ```bash
   ssh root@arch-vps
   systemctl status nextcloud
   podman ps | grep nextcloud
   ```

3. **View logs**:
   ```bash
   # Container logs
   journalctl -u nextcloud -f
   podman logs nextcloud

   # Caddy logs
   tail -f /var/log/caddy/nextcloud.log
   ```

4. **Verify socket access**:
   ```bash
   # Check group memberships
   id nextcloud
   # Should show: postgres-clients, valkey-clients

   # Check socket permissions
   ls -la /var/run/postgresql/.s.PGSQL.5432
   ls -la /var/run/valkey/valkey.sock
   ```

## Maintenance

### OCC Command (Nextcloud CLI)

Run Nextcloud's OCC command-line tool:

```bash
# General syntax
podman exec --user www-data nextcloud php occ <command>

# Examples
podman exec --user www-data nextcloud php occ status
podman exec --user www-data nextcloud php occ app:list
podman exec --user www-data nextcloud php occ maintenance:mode --on
podman exec --user www-data nextcloud php occ files:scan --all
```

### Update Nextcloud

On restart, the container's entrypoint upgrades the installation if the image contains a newer Nextcloud version:

```bash
systemctl restart nextcloud
```

Or pin a specific version:

```yaml
# In host_vars or defaults
nextcloud_version: "32-fpm"  # Pin to major version
# Or
nextcloud_version: "32.0.3-fpm"  # Pin to exact version
```

### Backup Strategy

Key directories to back up:

1. **User data**: `/opt/nextcloud/data`
2. **Configuration**: `/opt/nextcloud/config`
3. **Database**: PostgreSQL `nextcloud` database
4. **Custom apps**: `/opt/nextcloud/custom_apps` (optional)

Example backup script:

```bash
#!/bin/bash
set -euo pipefail

# Always leave maintenance mode on exit, even if a backup step fails
trap 'podman exec --user www-data nextcloud php occ maintenance:mode --off' EXIT

# Enable maintenance mode
podman exec --user www-data nextcloud php occ maintenance:mode --on

# Backup data and config
tar -czf "nextcloud-data-$(date +%Y%m%d).tar.gz" /opt/nextcloud/data /opt/nextcloud/config

# Backup database
sudo -u postgres pg_dump nextcloud > "nextcloud-db-$(date +%Y%m%d).sql"
```

### Performance Tuning

Adjust PHP limits in `host_vars`:

```yaml
nextcloud_php_memory_limit: "1G"    # For large files
nextcloud_php_upload_limit: "10G"   # For large uploads
```

### Redis/Valkey Caching Architecture

This role uses a **split caching strategy** for optimal performance and stability:

**PHP Sessions**: File-based (default PHP session handler)
- Location: `/var/www/html/data/sessions/`
- Why: Redis session locking can cause cascading failures under high concurrency
- Performance: Excellent for single-server deployments

**Nextcloud Application Cache**: Redis/Valkey
- `memcache.local`: APCu (in-process local data cache)
- `memcache.distributed`: Redis (shared cache, file locking)
- `memcache.locking`: Redis (transactional file locking)
- Configuration: Via custom `redis.config.php` template

**Why not Redis sessions?**

The official Nextcloud Docker image enables Redis session handling when `REDIS_HOST` is set. However, this can cause severe performance issues:

1. **Session lock contention**: Multiple parallel requests (browser loading CSS/JS/images) compete for the same session lock
2. **Infinite retries**: Default `lock_retries = -1` means workers block forever
3. **Timeout orphaning**: When the reverse proxy times out, FPM workers keep running and hold locks
4. **Worker exhaustion**: Limited FPM workers (default 5) all become blocked
5. **Cascading failure**: New requests queue, timeouts accumulate, locks orphan

This role disables Redis sessions by **not setting** `REDIS_HOST` in the environment, while still providing Redis caching via a custom `redis.config.php` that is deployed independently.

**If you need Redis sessions** (e.g., multi-server setup with session sharing), you must:
1. Enable `REDIS_HOST` in `nextcloud.env.j2`
2. Add a custom PHP ini file with proper lock parameters:
   - `redis.session.lock_expire = 30` (locks expire after 30 seconds)
   - `redis.session.lock_retries = 100` (max 100 retries, not infinite)
   - `redis.session.lock_wait_time = 50000` (50ms between retries)
3. Mount the ini file with a `zz-` prefix so it loads after the entrypoint's `redis-session.ini`
4. Increase FPM workers significantly (15-20+)
5. Monitor for orphaned session locks

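The suggested lock parameters bound how long a request can wait for a session lock: at most `lock_retries × lock_wait_time`. The sketch below works that out. `max_lock_wait_ms` is a hypothetical helper name, and it assumes `lock_wait_time` is expressed in microseconds, consistent with the 50000 = 50ms figure above.

```bash
#!/bin/sh
# max_lock_wait_ms RETRIES WAIT_TIME_US
# Upper bound, in milliseconds, on the time one request spends retrying
# for a session lock: number of retries times the wait between retries.
max_lock_wait_ms() {
    echo $(( $1 * $2 / 1000 ))
}

max_lock_wait_ms 100 50000   # prints 5000 (5 seconds)
```

With these values a request waits at most about 5 seconds for the lock, well inside the 30-second `lock_expire`, rather than blocking indefinitely as it would with `lock_retries = -1`.
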
## Troubleshooting

### Container won't start

```bash
# Check container logs
journalctl -u nextcloud -n 50
podman logs nextcloud

# Check systemd unit
systemctl status nextcloud
```

### Permission errors

```bash
# Verify user groups
id nextcloud

# Should be in: postgres-clients, valkey-clients
# If not, re-run user.yml tasks:
ansible-playbook -i inventory/hosts.yml site.yml --tags nextcloud,user
```

### Database connection errors

```bash
# Test PostgreSQL socket
sudo -u nextcloud psql -h /var/run/postgresql -U nextcloud -d nextcloud

# Check socket exists and permissions
ls -la /var/run/postgresql/.s.PGSQL.5432
```

### Caddy FastCGI errors

```bash
# Check Caddy can read app files
sudo -u caddy ls -la /opt/nextcloud/html

# Verify FPM is listening
ss -tlnp | grep 9000

# Test that the port accepts connections
# (FPM speaks FastCGI, not HTTP, so expect an empty or invalid reply, not a page)
curl -v http://127.0.0.1:9000
```

### "Trusted domain" errors

Add domains to `nextcloud_trusted_domains`:

```yaml
nextcloud_trusted_domains: "cloud.jnss.me localhost 69.62.119.31"
```

Or add via OCC:

```bash
podman exec --user www-data nextcloud php occ config:system:set trusted_domains 1 --value=cloud.jnss.me
```

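When several domains need to be registered via OCC, each one goes into its own index of the `trusted_domains` array. The sketch below only prints the commands for review rather than running them. `trusted_domain_commands` is a hypothetical helper, and numbering from index 0 is an assumption: it would overwrite whatever is already stored at those positions.

```bash
#!/bin/sh
# trusted_domain_commands "DOMAIN1 DOMAIN2 ..."
# Prints one `occ config:system:set` command per space-separated domain,
# numbering the trusted_domains indexes from 0 (assumed starting point).
trusted_domain_commands() (
    index=0
    for domain in $1; do
        printf 'podman exec --user www-data nextcloud php occ config:system:set trusted_domains %s --value=%s\n' \
            "$index" "$domain"
        index=$((index + 1))
    done
)

# Example:
#   trusted_domain_commands "cloud.jnss.me localhost 69.62.119.31"
```

Printing first keeps the step reviewable; once the output looks right, the lines can be run by hand or piped to `sh`.
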
## Integration with Authentik SSO

To integrate Nextcloud with Authentik for SSO, see the Authentik documentation for OAuth2/OIDC provider setup.

## Security Notes

- User data (`/opt/nextcloud/data`) is mode 700 - only container can access
- Config (`/opt/nextcloud/config`) is mode 700 - contains database passwords
- Application files (`/opt/nextcloud/html`) are mode 755 - Caddy can read for static files
- All traffic is HTTPS via Caddy with automatic Let's Encrypt certificates
- Database and cache connections use Unix sockets (no TCP exposure)
- Container runs as root initially, then switches to www-data (UID 33) for PHP-FPM

### Socket Access Pattern

Nextcloud uses a different access pattern than other rick-infra services due to how the official Nextcloud container works:

**How it works:**
1. Container starts as root (UID 0)
2. Entrypoint runs as root to write PHP configuration files
3. Entrypoint switches to www-data (UID 33) for PHP-FPM process
4. www-data accesses PostgreSQL and Valkey via Unix sockets

**Why 777 socket permissions are needed:**
- The Nextcloud container cannot use `--group-add` effectively because:
  - `--group-add` only adds groups to the **initial user** (root)
  - When the container switches from root to www-data, supplementary groups are lost
  - www-data (UID 33, GID 33) ends up with no access to group-restricted sockets
- Infrastructure sockets use mode 777 to allow access by any UID
- Security is maintained via password authentication (PostgreSQL: scram-sha-256, Valkey: requirepass)
- Sockets are local-only (not network-exposed)

**Alternative (TCP)**:
If you prefer to keep the sockets group-restricted (770), Nextcloud can connect to PostgreSQL and Valkey over localhost TCP instead:

```yaml
# In host_vars
postgresql_listen_addresses: "127.0.0.1"
postgresql_unix_socket_permissions: "0770"  # Restrict to group

valkey_bind: "127.0.0.1"
valkey_port: 6379
valkey_unix_socket_enabled: false

# In Nextcloud env (KEY=value lines, not YAML)
POSTGRES_HOST=127.0.0.1
POSTGRES_PORT=5432
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
```

This provides the same security level (password-authenticated, localhost-only) but uses TCP instead of Unix sockets. The trade-off is slightly lower performance compared to Unix sockets. Note that setting `REDIS_HOST` also enables Redis session handling in the official image, so the lock parameters described under Redis/Valkey Caching Architecture apply.

See infrastructure role documentation (PostgreSQL and Valkey READMEs) for more details on this architectural decision.

## References

- [Nextcloud Official Docker Image](https://hub.docker.com/_/nextcloud)
- [Nextcloud Documentation](https://docs.nextcloud.com/)
- [Caddy FastCGI Documentation](https://caddyserver.com/docs/caddyfile/directives/php_fastcgi)