Joakim 4f8da38ca6 Add Nextcloud cloud storage role with split Redis caching strategy
## New Features

- **Nextcloud Role**: Complete cloud storage deployment using Podman Quadlet
  - FPM variant with Caddy reverse proxy and FastCGI
  - PostgreSQL database via Unix socket
  - Valkey/Redis for app-level caching and file locking
  - Automatic HTTPS with Let's Encrypt via Caddy
  - Dual-root pattern: Caddy serves static assets, FPM handles PHP

- **Split Caching Strategy**: Redis caching WITHOUT Redis sessions
  - Custom redis.config.php template for app-level caching only
  - File-based PHP sessions for stability (avoids session lock issues)
  - Prevents cascading failures from session lock contention
  - Documented in role README with detailed rationale

## Infrastructure Updates

- **Socket Permissions**: Update PostgreSQL and Valkey sockets to mode 777
  - Required for containers that switch users (root → www-data)
  - Nextcloud container loses supplementary groups on user switch
  - Security maintained via password authentication (scram-sha-256, requirepass)
  - Documented socket permission architecture in docs/

- **PostgreSQL**: Export client group GID as fact for dependent roles
- **Valkey**: Export client group GID as fact, update socket fix service

## Documentation

- New: docs/socket-permissions-architecture.md
  - Explains 777 vs 770 socket permission trade-offs
  - Documents why group-based access doesn't work for user-switching containers
  - Provides TCP alternative for stricter security requirements

- Updated: All role READMEs with socket permission notes
- New: Nextcloud README with comprehensive deployment, troubleshooting, and Redis architecture documentation

## Configuration

- host_vars: Add Nextcloud vault variables and configuration
- site.yml: Include Nextcloud role in main playbook

## Technical Details

**Why disable Redis sessions?**

The official Nextcloud container enables Redis session handling whenever the REDIS_HOST env var is set,
which can cause severe performance issues:

1. Session lock contention under high concurrency (browser parallel asset requests)
2. Infinite lock retries (default lock_retries=-1) blocking FPM workers
3. Timeout orphaning: reverse proxy kills connection, worker keeps lock
4. Worker pool exhaustion: all 5 default workers blocked on same session lock
5. Cascading failure: new requests queue, more timeouts, more orphaned locks

Solution: Use file-based sessions (reliable, fast for single-server) while keeping
Redis for distributed cache and transactional file locking via custom config file.

This keeps Redis for caching and file locking while avoiding the session lock failure modes listed above and the effort of debugging them.
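
One way to confirm the split on a running deployment is to look at PHP's effective session handler. The sketch below is illustrative only and is not part of the role: it assumes `php -i` prints directives in the CLI format `name => local value => master value`, and the helper names `session_handler` and `uses_file_sessions` are hypothetical.

```shell
# Read `php -i` output on stdin and print the local value of session.save_handler.
session_handler() {
  awk -F' => ' '$1 == "session.save_handler" { print $2; exit }'
}

# Succeed only when sessions are file-based, i.e. Redis is not the session store.
uses_file_sessions() {
  [ "$(session_handler)" = "files" ]
}
```

A possible invocation is `podman exec nextcloud php -i | uses_file_sessions && echo "file sessions"`, where the container name `nextcloud` is an assumption. The CLI and FPM can load different ini files, so treat the result as a hint rather than proof.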

Tested: Fresh deployment on arch-vps (69.62.119.31)
Domain: https://cloud.jnss.me/
2025-12-14 22:07:08 +01:00

rick-infra

Infrastructure as Code for secure, high-performance web services with native databases, Unix socket IPC, and centralized authentication.

Architecture Overview

Rick-infra implements a security-first infrastructure stack featuring:

  • 🔒 Native Infrastructure: PostgreSQL, Valkey, Caddy managed by systemd for optimal performance
  • 🚀 Container Applications: Rootless Podman with systemd integration for secure application deployment
  • 🔐 Centralized Authentication: Authentik SSO with forward auth integration
  • 🔌 Unix Socket IPC: Zero network exposure for database and cache communication
  • 🛡️ Defense in Depth: Multi-layer security from network to application level
┌─────────────────────────────────────────────────────────────┐
│ rick-infra Security-First Architecture                     │
│                                                             │
│ ┌─────────────────┐  ┌─────────────────┐  ┌───────────────┐ │
│ │  Applications   │  │ Authentication  │  │ Reverse Proxy │ │
│ │ (Podman/systemd)│  │  (Authentik)    │  │ (Caddy/HTTPS) │ │
│ └─────────────────┘  └─────────────────┘  └───────────────┘ │
│           │                    │                    │       │
│           └────────────────────┼────────────────────┘       │
│                    ┌───────────┼───────────┐                │
│ ┌─────────────────┐│  ┌────────▼────────┐  │┌──────────────┐│
│ │   PostgreSQL    ││  │     Valkey      │  ││    Podman    ││
│ │ (Native/systemd)││  │ (Native/systemd)│  ││ (Containers) ││
│ │  Unix Sockets   ││  │  Unix Sockets   │  ││   Rootless   ││
│ └─────────────────┘│  └─────────────────┘  │└──────────────┘│
│                    └───────────────────────┘                │
└─────────────────────────────────────────────────────────────┘

Quick Start

Prerequisites

  • VPS: Fresh Arch Linux VPS with root access
  • DNS: Domain pointed to VPS IP address
  • SSH: Key-based authentication configured

Deploy Complete Stack

# 1. Clone repository
git clone https://github.com/your-username/rick-infra.git
cd rick-infra

# 2. Configure inventory
cp inventory/hosts.yml.example inventory/hosts.yml
# Edit inventory/hosts.yml with your VPS details

# 3. Set up vault variables
ansible-vault create host_vars/arch-vps/vault.yml
# Add required secrets (see deployment guide)

# 4. Deploy complete infrastructure
ansible-playbook -i inventory/hosts.yml site.yml --ask-vault-pass

Total deployment time: 8-14 minutes for complete stack

Verify Deployment

# Check services
curl -I https://auth.jnss.me/     # Authentik SSO
curl -I https://git.jnss.me/      # Gitea (if enabled)

# Check infrastructure
ansible arch-vps -m command -a "systemctl status postgresql valkey caddy"
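
The manual checks above can be wrapped in a small helper that fails when any endpoint is unreachable. This is a rough sketch, not something shipped in the repository: the function names are hypothetical, and treating any 2xx or 3xx status as healthy is an assumption (SSO front ends often answer with a redirect).

```shell
# Treat any 2xx or 3xx HTTP status as "reachable".
status_ok() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print "<status> <url>" for one URL; fail if the status is not 2xx/3xx.
check_url() {
  local url="$1" status
  # curl reports 000 when no HTTP response was received at all.
  status=$(curl -s -o /dev/null -I --max-time 10 -w '%{http_code}' "$url") || status=000
  printf '%s %s\n' "$status" "$url"
  status_ok "$status"
}

# Check every URL passed as an argument; return non-zero if any check fails.
check_all() {
  local rc=0 url
  for url in "$@"; do
    check_url "$url" || rc=1
  done
  return "$rc"
}
```

After sourcing the functions, `check_all https://auth.jnss.me/ https://git.jnss.me/` prints one line per endpoint and returns non-zero if either is down.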

Key Features

🔒 Security First

  • Native Database Services: No container attack vectors for critical infrastructure
  • Unix Socket IPC: Zero network exposure for database/cache communication
  • Rootless Containers: All applications run unprivileged
  • Centralized Authentication: SSO with MFA support via Authentik
  • Defense in Depth: Network, container, database, and application security layers

High Performance

  • Native Database Performance: No container overhead for PostgreSQL/Valkey
  • Unix Socket Communication: 20-40% faster than TCP for local IPC
  • Optimized Container Runtime: Podman with minimal overhead
  • CDN-Ready: Automatic HTTPS with Cloudflare integration

🛠️ Operational Excellence

  • Infrastructure as Code: Complete Ansible automation
  • systemd Integration: Native service management and monitoring
  • Comprehensive Monitoring: Centralized logging and metrics
  • Automated Backups: Database and configuration backup procedures

Documentation

📖 Getting Started

🏗️ Architecture & Decisions

🔧 Development & Integration

📚 Service Documentation

Core Services

Infrastructure Services (Native systemd)

  • PostgreSQL - High-performance database with Unix socket support
  • Valkey - Redis-compatible cache with Unix socket support
  • Caddy - Automatic HTTPS reverse proxy with Cloudflare DNS
  • Podman - Rootless container runtime with systemd integration

Authentication Services

  • Authentik - Modern SSO server with OAuth2/OIDC/SAML support
  • Forward Auth - Transparent service protection via Caddy integration
  • Multi-Factor Authentication - TOTP, WebAuthn, SMS support

Application Services (Containerized)

  • Gitea - Self-hosted Git service with SSO integration
  • Gallery - Media gallery with authentication
  • Custom Services - Template for additional service integration

Architecture Benefits

Why Native Databases?

Performance: Native PostgreSQL delivers 15-25% better performance than containerized alternatives

Security: Unix sockets eliminate network attack surface for database access

Operations: Standard system tools and procedures for backup, monitoring, maintenance

Reliability: systemd service management with proven restart and recovery mechanisms

Why Unix Socket IPC?

Security: No network exposure - access controlled by filesystem permissions

Performance: Lower latency and higher throughput than TCP communication

Simplicity: No network configuration, port management, or firewall rules
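
Because access is governed by filesystem permissions, it can help to inspect a socket's mode directly. The sketch below is an illustration under stated assumptions: it relies on GNU `stat` (as shipped on Arch Linux), the function names are hypothetical, and it only looks at the "other" permission digit, since connecting to a Unix socket on Linux requires write permission on the socket file.

```shell
# Print the octal permission bits of a path, e.g. "770" (GNU stat syntax).
path_mode() {
  stat -c '%a' "$1"
}

# Succeed if the "other" digit of an octal mode includes write access,
# meaning any local user may connect to a socket with that mode.
world_connectable() {
  local mode="$1" prefix other
  prefix="${mode%?}"          # everything except the last digit
  other="${mode#"$prefix"}"   # the last octal digit
  [ $(( ${other:-0} & 2 )) -ne 0 ]
}

# Describe who can connect to the socket at the given path.
describe_socket() {
  local path="$1" mode
  [ -S "$path" ] || { echo "$path: not a socket" >&2; return 1; }
  mode=$(path_mode "$path")
  if world_connectable "$mode"; then
    echo "$path: mode $mode, any local user can connect"
  else
    echo "$path: mode $mode, restricted to owner and group"
  fi
}
```

For example, `describe_socket /run/postgresql/.s.PGSQL.5432` reports the PostgreSQL socket's mode; that path is the usual PostgreSQL default and is assumed here, not taken from the roles. A mode of 777 makes password authentication the only remaining access control, which is the trade-off described in the commit notes.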

Why Rootless Containers?

Security: No privileged daemon, reduced attack surface

Isolation: Process isolation without compromising host security

Integration: Native systemd service management for containers

Contributing

Development Workflow

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/new-service
  3. Follow role template: Use existing roles as templates
  4. Test deployment: Verify on development environment
  5. Update documentation: Add service documentation
  6. Submit pull request: Include deployment testing results

Adding New Services

See the Service Integration Guide for complete instructions on adding new services to rick-infra.

Security Considerations

All contributions must follow the security-first principles:

  • Services must integrate with Authentik authentication
  • Database access must use Unix sockets only
  • Containers must run rootless with minimal privileges
  • All secrets must use Ansible Vault
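
The vault requirement can be checked mechanically before submitting a pull request, because files encrypted with Ansible Vault begin with a `$ANSIBLE_VAULT;` header line. The following is a minimal sketch, not an existing project script: the function names are hypothetical, and the `host_vars/<host>/vault.yml` layout is assumed from the Quick Start example.

```shell
# Succeed if the file's first line is an Ansible Vault header.
is_vault_encrypted() {
  local first_line=""
  # read fails on a missing trailing newline, so also accept a non-empty partial line.
  IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
  case "$first_line" in
    '$ANSIBLE_VAULT;'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report every <dir>/<host>/vault.yml that is not encrypted (dir defaults to host_vars).
check_vault_files() {
  local dir="${1:-host_vars}" rc=0 file
  for file in "$dir"/*/vault.yml; do
    [ -f "$file" ] || continue   # skip the literal pattern when nothing matches
    is_vault_encrypted "$file" || { echo "unencrypted: $file" >&2; rc=1; }
  done
  return "$rc"
}
```

Running `check_vault_files` from the repository root returns non-zero and lists the offending files if any vault file was committed in plain text.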

License

MIT License - see LICENSE file for details.

Support

For issues, questions, or contributions:

  • Issues: GitHub Issues
  • Documentation: All guides linked above
  • Security: Follow responsible disclosure for security issues

rick-infra - Infrastructure as Code that prioritizes security, performance, and operational excellence.
