Add metrics monitoring stack with VictoriaMetrics, Grafana, and node_exporter

Implement complete monitoring infrastructure following rick-infra principles:

Components:
- VictoriaMetrics: Prometheus-compatible TSDB (7x less RAM usage)
- Grafana: Visualization dashboard with Authentik OAuth/OIDC integration
- node_exporter: System metrics collection (CPU, memory, disk, network)

Architecture:
- All services run as native systemd binaries (no containers)
- localhost-only binding for security
- Grafana uses native OAuth integration with Authentik (not forward_auth)
- Full systemd security hardening enabled
- Proxied via Caddy at metrics.jnss.me with HTTPS

Role Features:
- Unified metrics role (single role for complete stack)
- Automatic role mapping via Authentik groups:
  - authentik Admins OR grafana-admins -> Admin access
  - grafana-editors -> Editor access
  - All others -> Viewer access
- VictoriaMetrics auto-provisioned as default Grafana datasource
- 12-month metrics retention by default
- Comprehensive documentation included

Security:
- OAuth/OIDC SSO via Authentik
- All metrics services bind to 127.0.0.1 only
- systemd hardening (NoNewPrivileges, ProtectSystem, etc.)
- Grafana accessible only via Caddy HTTPS proxy

Documentation:
- roles/metrics/README.md: Complete role documentation
- docs/metrics-deployment-guide.md: Step-by-step deployment guide

Configuration:
- Updated rick-infra.yml to include metrics deployment
- Grafana port set to 3001 (Gitea uses 3000)
- Ready for multi-host expansion (designed for future node_exporter deployment to production hosts)
This commit is contained in:
2025-12-28 19:18:30 +01:00
parent eca3b4a726
commit 1f3f111d88
19 changed files with 1345 additions and 4 deletions

View File

@@ -0,0 +1,66 @@
---
- name: Create VictoriaMetrics system user
ansible.builtin.user:
name: "{{ victoriametrics_user }}"
system: true
create_home: false
shell: /usr/sbin/nologin
state: present
- name: Create VictoriaMetrics directories
ansible.builtin.file:
path: "{{ item }}"
state: directory
owner: "{{ victoriametrics_user }}"
group: "{{ victoriametrics_group }}"
mode: '0755'
loop:
- "{{ victoriametrics_data_dir }}"
- "{{ victoriametrics_scrape_config_dir }}"
- name: Download VictoriaMetrics binary
ansible.builtin.get_url:
url: "https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v{{ victoriametrics_version }}/victoria-metrics-linux-amd64-v{{ victoriametrics_version }}.tar.gz"
dest: "/tmp/victoria-metrics-v{{ victoriametrics_version }}.tar.gz"
mode: '0644'
register: victoriametrics_download
- name: Extract VictoriaMetrics binary
ansible.builtin.unarchive:
src: "/tmp/victoria-metrics-v{{ victoriametrics_version }}.tar.gz"
dest: /usr/local/bin
remote_src: true
creates: /usr/local/bin/victoria-metrics-prod
when: victoriametrics_download.changed
- name: Set VictoriaMetrics binary permissions
ansible.builtin.file:
path: /usr/local/bin/victoria-metrics-prod
owner: root
group: root
mode: '0755'
- name: Deploy VictoriaMetrics scrape configuration
ansible.builtin.template:
src: scrape.yml.j2
dest: "{{ victoriametrics_scrape_config_file }}"
owner: "{{ victoriametrics_user }}"
group: "{{ victoriametrics_group }}"
mode: '0644'
notify: restart victoriametrics
- name: Deploy VictoriaMetrics systemd service
ansible.builtin.template:
src: victoriametrics.service.j2
dest: /etc/systemd/system/victoriametrics.service
owner: root
group: root
mode: '0644'
notify: restart victoriametrics
- name: Enable and start VictoriaMetrics service
ansible.builtin.systemd:
name: victoriametrics
enabled: "{{ victoriametrics_service_enabled }}"
state: "{{ victoriametrics_service_state }}"
daemon_reload: true