Add metrics monitoring stack with VictoriaMetrics, Grafana, and node_exporter

Implement complete monitoring infrastructure following rick-infra principles:

Components:
- VictoriaMetrics: Prometheus-compatible TSDB (7x less RAM usage)
- Grafana: Visualization dashboard with Authentik OAuth/OIDC integration
- node_exporter: System metrics collection (CPU, memory, disk, network)

Architecture:
- All services run as native systemd binaries (no containers)
- localhost-only binding for security
- Grafana uses native OAuth integration with Authentik (not forward_auth)
- Full systemd security hardening enabled
- Proxied via Caddy at metrics.jnss.me with HTTPS

Role Features:
- Unified metrics role (single role for complete stack)
- Automatic role mapping via Authentik groups:
  - authentik Admins OR grafana-admins -> Admin access
  - grafana-editors -> Editor access
  - All others -> Viewer access
- VictoriaMetrics auto-provisioned as default Grafana datasource
- 12-month metrics retention by default
- Comprehensive documentation included

Security:
- OAuth/OIDC SSO via Authentik
- All metrics services bind to 127.0.0.1 only
- systemd hardening (NoNewPrivileges, ProtectSystem, etc.)
- Grafana accessible only via Caddy HTTPS proxy

Documentation:
- roles/metrics/README.md: Complete role documentation
- docs/metrics-deployment-guide.md: Step-by-step deployment guide

Configuration:
- Updated rick-infra.yml to include metrics deployment
- Grafana port set to 3001 (Gitea uses 3000)
- Ready for multi-host expansion (designed for future node_exporter deployment to production hosts)

325
roles/metrics/README.md
Normal file
@@ -0,0 +1,325 @@
# Metrics Role

Complete monitoring stack for rick-infra providing system metrics collection, storage, and visualization with SSO integration.

## Components

### VictoriaMetrics
- **Purpose**: Time-series database for metrics storage
- **Type**: Native systemd service
- **Listen**: `127.0.0.1:8428` (localhost only)
- **Features**:
  - Prometheus-compatible API and PromQL
  - Up to 7x less RAM than Prometheus (vendor-reported figure)
  - Single binary deployment
  - 12-month data retention by default

### Grafana
- **Purpose**: Metrics visualization and dashboarding
- **Type**: Native systemd service
- **Listen**: `127.0.0.1:3420` (localhost only, proxied via Caddy; set by `grafana_listen_port`)
- **Domain**: `metrics.jnss.me`
- **Features**:
  - OAuth/OIDC integration with Authentik
  - Role-based access control via Authentik groups
  - VictoriaMetrics as default data source

### node_exporter
- **Purpose**: System metrics collection
- **Type**: Native systemd service
- **Listen**: `127.0.0.1:9100` (localhost only)
- **Metrics**: CPU, memory, disk, network, systemd units

## Architecture

```
┌─────────────────────────────────────────────────────┐
│ metrics.jnss.me (Grafana Dashboard)                 │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Caddy (HTTPS)                                   │ │
│ │        ↓                                        │ │
│ │ Grafana (OAuth → Authentik)                     │ │
│ │        ↓                                        │ │
│ │ VictoriaMetrics (Prometheus-compatible)         │ │
│ │        ↑                                        │ │
│ │ node_exporter (System Metrics)                  │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
```

## Deployment

### Prerequisites

1. **Caddy role deployed** - Required for HTTPS proxy
2. **Authentik deployed** - Required for OAuth/SSO
3. **Vault variables configured**:
```yaml
# In host_vars/arch-vps/vault.yml
vault_grafana_admin_password: "secure-admin-password"
vault_grafana_secret_key: "random-secret-key-32-chars"
vault_grafana_oauth_client_id: "grafana"
vault_grafana_oauth_client_secret: "oauth-client-secret-from-authentik"
```
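
The vault values above are placeholders. As a minimal sketch, assuming a Linux host with coreutils, a helper like the one below can produce random hex strings for `vault_grafana_secret_key` or the admin password. `gen_secret` is a hypothetical name and is not part of the role.

```bash
# Hypothetical helper (not part of the role): print N random bytes as hex.
# 16 bytes yield a 32-character string, the length the placeholder suggests.
gen_secret() {
  local nbytes="${1:-16}"
  # od prints the bytes as space-separated hex pairs; tr joins them into one string.
  head -c "$nbytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

Running `gen_secret` prints one value, which can then be pasted into the vault file opened with `ansible-vault edit host_vars/arch-vps/vault.yml`.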

### Authentik Configuration

Before deployment, create an OAuth2/OIDC provider in Authentik:

1. **Create Provider**:
   - Name: `Grafana`
   - Type: `OAuth2/OpenID Provider`
   - Client ID: `grafana`
   - Client Secret: Generate and save to vault
   - Redirect URIs: `https://metrics.jnss.me/login/generic_oauth`
   - Signing Key: Auto-generated

2. **Create Application**:
   - Name: `Grafana`
   - Slug: `grafana`
   - Provider: Select the Grafana provider created above

3. **Create Groups** (optional, for role mapping):
   - `grafana-admins` - Full admin access
   - `grafana-editors` - Can create/edit dashboards
   - Users without these groups get Viewer access

### Deploy

```bash
# Deploy complete metrics stack
ansible-playbook rick-infra.yml --tags metrics

# Deploy individual components
ansible-playbook rick-infra.yml --tags victoriametrics
ansible-playbook rick-infra.yml --tags grafana
ansible-playbook rick-infra.yml --tags node_exporter
```

### Verify Deployment

```bash
# Check service status
ansible homelab -a "systemctl status victoriametrics grafana node_exporter"

# Check metrics collection
curl http://127.0.0.1:9100/metrics         # node_exporter metrics
curl http://127.0.0.1:8428/metrics         # VictoriaMetrics metrics
curl http://127.0.0.1:8428/api/v1/targets  # Scrape targets

# Access Grafana
curl -I https://metrics.jnss.me/  # Should redirect to the Grafana login page, which offers the Authentik sign-in
```
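
To turn the targets check into a pass/fail signal, the sketch below counts healthy scrape targets. It assumes `/api/v1/targets` returns Prometheus-style JSON in which each active target carries a `"health":"up"` field. `count_up_targets` is a hypothetical helper, and the plain-text match is deliberately crude.

```bash
# Hypothetical helper: count targets reported as healthy in the JSON on stdin.
# Assumes Prometheus-style output where each target has "health":"up".
count_up_targets() {
  # Strip whitespace first so pretty-printed and compact JSON match alike.
  tr -d ' \n\t' | grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Usage on the metrics host (not executed here):
#   curl -s http://127.0.0.1:8428/api/v1/targets | count_up_targets
```

A result of `0` right after deployment would suggest checking the scrape configuration before looking at Grafana.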

## Usage

### Access Dashboard

1. Navigate to `https://metrics.jnss.me`
2. Click "Sign in with Authentik"
3. Authenticate via Authentik SSO
4. Access is granted based on Authentik group membership

### Role Mapping

Grafana roles are automatically assigned based on Authentik groups:

- **Admin**: Members of the `authentik Admins` or `grafana-admins` group
  - Full administrative access
  - Can manage users, data sources, plugins
  - Can create/edit/delete all dashboards

- **Editor**: Members of the `grafana-editors` group
  - Can create and edit dashboards
  - Cannot manage users or data sources

- **Viewer**: All other authenticated users
  - Read-only access to dashboards
  - Cannot create or edit dashboards

### Creating Dashboards

Grafana comes with VictoriaMetrics pre-configured as the default data source. Use PromQL queries:

```promql
# CPU usage
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage
node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes

# Disk usage
100 - ((node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100)

# Network traffic
irate(node_network_receive_bytes_total[5m])
```

### Import Community Dashboards

1. Browse dashboards at https://grafana.com/grafana/dashboards/
2. Recommended for node_exporter:
   - Dashboard ID: 1860 (Node Exporter Full)
   - Dashboard ID: 11074 (Node Exporter for Prometheus)
3. Import via Grafana UI: Dashboards → Import → Enter ID

## Configuration

### Customization

Key configuration options in `roles/metrics/defaults/main.yml`:

```yaml
# Data retention
victoriametrics_retention_period: "12"  # months

# Scrape interval
victoriametrics_scrape_interval: "15s"

# OAuth role mapping (JMESPath expression)
grafana_oauth_role_attribute_path: "(contains(groups, 'authentik Admins') || contains(groups, 'grafana-admins')) && 'Admin' || contains(groups, 'grafana-editors') && 'Editor' || 'Viewer'"

# Memory limits
victoriametrics_memory_allowed_percent: "30"
```

### Adding Scrape Targets

Edit `roles/metrics/templates/scrape.yml.j2`:

```yaml
scrape_configs:
  # Add custom application metrics
  - job_name: 'myapp'
    static_configs:
      - targets: ['127.0.0.1:8080']
        labels:
          service: 'myapp'
```

## Operations

### Service Management

```bash
# VictoriaMetrics
systemctl status victoriametrics
systemctl restart victoriametrics
journalctl -u victoriametrics -f

# Grafana
systemctl status grafana
systemctl restart grafana
journalctl -u grafana -f

# node_exporter
systemctl status node_exporter
systemctl restart node_exporter
journalctl -u node_exporter -f
```

### Data Locations

```
/var/lib/victoriametrics/    # Time-series data
/var/lib/grafana/            # Grafana database and dashboards
/var/log/grafana/            # Grafana logs
/etc/victoriametrics/        # VictoriaMetrics config
/etc/grafana/                # Grafana config
```

### Backup

VictoriaMetrics data is stored in `/var/lib/victoriametrics`:

```bash
# Stop service
systemctl stop victoriametrics

# Backup data
tar -czf victoriametrics-backup-$(date +%Y%m%d).tar.gz /var/lib/victoriametrics

# Start service
systemctl start victoriametrics
```

Grafana dashboards are stored in a SQLite database at `/var/lib/grafana/grafana.db`:

```bash
# Backup Grafana
systemctl stop grafana
tar -czf grafana-backup-$(date +%Y%m%d).tar.gz /var/lib/grafana /etc/grafana
systemctl start grafana
```
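
Both backup snippets follow the same pattern. A minimal sketch of a reusable helper, assuming `tar` with gzip support and a writable destination directory. `archive_dir` is a hypothetical name, the destination path in the usage comment is a placeholder, and stopping and starting the service stays with the caller.

```bash
# Hypothetical helper: archive directory SRC into DEST and verify the archive.
# Prints the archive path on success.
archive_dir() {
  local src="$1" dest="$2" name out
  name="$(basename "$src")"
  out="${dest}/${name}-backup-$(date +%Y%m%d).tar.gz"
  # -C keeps paths inside the archive relative to the parent directory.
  tar -czf "$out" -C "$(dirname "$src")" "$name" || return 1
  # Listing the archive confirms it is readable before the service restarts.
  tar -tzf "$out" > /dev/null || return 1
  echo "$out"
}

# Usage as root on the metrics host (not executed here; /root/backups is a placeholder):
#   systemctl stop victoriametrics
#   archive_dir /var/lib/victoriametrics /root/backups
#   systemctl start victoriametrics
```

Verifying the archive before restarting the service means a failed backup is noticed while the data is still quiescent.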

## Security

### Authentication
- Grafana protected by Authentik OAuth/OIDC
- Local admin account available for emergency access
- All services bind to localhost only

### Network Security
- VictoriaMetrics: `127.0.0.1:8428` (no external access)
- Grafana: `127.0.0.1:3420` (proxied via Caddy with HTTPS)
- node_exporter: `127.0.0.1:9100` (no external access)

### systemd Hardening
All services run with security restrictions:
- `NoNewPrivileges=true`
- `ProtectSystem=strict`
- `ProtectHome=true`
- `PrivateTmp=true`
- Read-only filesystem (except data directories)
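
One way to check that these restrictions are actually in effect is `systemd-analyze security`, which scores a unit's sandboxing. A minimal sketch, assuming a systemd version that ships this subcommand. `audit_units` is a hypothetical helper, not part of the role.

```bash
# Hypothetical helper: print the overall exposure summary for each given unit.
audit_units() {
  local unit
  for unit in "$@"; do
    # Keep only the summary line of the per-unit report.
    systemd-analyze security --no-pager "${unit}.service" | grep 'Overall exposure'
  done
}

# Usage on the metrics host (not executed here):
#   audit_units victoriametrics grafana node_exporter
```

Lower exposure scores indicate tighter sandboxing, so a large gap between the three units would be worth a look at the corresponding service template.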

## Troubleshooting

### Grafana OAuth Not Working

1. Check Authentik provider configuration:
   ```bash
   # Verify redirect URI matches
   # https://metrics.jnss.me/login/generic_oauth
   ```

2. Check Grafana logs:
   ```bash
   journalctl -u grafana -f
   ```

3. Verify OAuth credentials in vault match Authentik
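
When comparing endpoints and credentials, Authentik's OpenID Connect discovery document lists the authorize, token, and userinfo URLs for a provider. A minimal sketch that builds the discovery URL, assuming the application slug `grafana` from the setup steps. The domain argument is a placeholder and `oidc_discovery_url` is a hypothetical helper.

```bash
# Hypothetical helper: build the OIDC discovery URL for an Authentik application.
# $1 = Authentik domain (placeholder), $2 = application slug (defaults to grafana).
oidc_discovery_url() {
  local domain="$1" slug="${2:-grafana}"
  printf 'https://%s/application/o/%s/.well-known/openid-configuration\n' "$domain" "$slug"
}

# Usage (replace auth.example.com with the real Authentik domain; not executed here):
#   curl -s "$(oidc_discovery_url auth.example.com)"
```

The URLs in the returned document can then be compared against `auth_url`, `token_url`, and `api_url` in `/etc/grafana/grafana.ini`.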

### No Metrics in Grafana

1. Check VictoriaMetrics scrape targets:
   ```bash
   curl http://127.0.0.1:8428/api/v1/targets
   ```

2. Check node_exporter is running:
   ```bash
   systemctl status node_exporter
   curl http://127.0.0.1:9100/metrics
   ```

3. Check VictoriaMetrics logs:
   ```bash
   journalctl -u victoriametrics -f
   ```

### High Memory Usage

VictoriaMetrics is configured to use at most 30% of available memory by default (`victoriametrics_memory_allowed_percent`). Adjust if needed:

```yaml
# In roles/metrics/defaults/main.yml
victoriametrics_memory_allowed_percent: "20"  # Example: reduce from the default 30%
```

## See Also

- [VictoriaMetrics Documentation](https://docs.victoriametrics.com/)
- [Grafana Documentation](https://grafana.com/docs/)
- [node_exporter GitHub](https://github.com/prometheus/node_exporter)
- [PromQL Documentation](https://prometheus.io/docs/prometheus/latest/querying/basics/)
- [Authentik OAuth Integration](https://goauthentik.io/docs/providers/oauth2/)
178
roles/metrics/defaults/main.yml
Normal file
@@ -0,0 +1,178 @@
---
# =================================================================
# Metrics Infrastructure Role - Complete Monitoring Stack
# =================================================================
# Provides VictoriaMetrics, Grafana, and node_exporter as unified stack

# =================================================================
# VictoriaMetrics Configuration
# =================================================================

# Service Management
victoriametrics_service_enabled: true
victoriametrics_service_state: "started"

# Version
victoriametrics_version: "1.105.0"

# Network Security (localhost only)
victoriametrics_listen_address: "127.0.0.1:8428"

# Storage Configuration
victoriametrics_data_dir: "/var/lib/victoriametrics"
victoriametrics_retention_period: "12"  # months

# User/Group
victoriametrics_user: "victoriametrics"
victoriametrics_group: "victoriametrics"

# Performance Settings
victoriametrics_memory_allowed_percent: "30"
victoriametrics_storage_min_free_disk_space_bytes: "10GB"

# Scrape Configuration
victoriametrics_scrape_config_dir: "/etc/victoriametrics"
victoriametrics_scrape_config_file: "{{ victoriametrics_scrape_config_dir }}/scrape.yml"
victoriametrics_scrape_interval: "15s"
victoriametrics_scrape_timeout: "10s"

# systemd security
victoriametrics_systemd_security: true

# =================================================================
# Grafana Configuration
# =================================================================

# Service Management
grafana_service_enabled: true
grafana_service_state: "started"

# Version
grafana_version: "11.4.0"

# Network Security (localhost only - proxied via Caddy)
grafana_listen_address: "127.0.0.1"
grafana_listen_port: 3420

# User/Group
grafana_user: "grafana"
grafana_group: "grafana"

# Directories
grafana_data_dir: "/var/lib/grafana"
grafana_logs_dir: "/var/log/grafana"
grafana_plugins_dir: "/var/lib/grafana/plugins"
grafana_provisioning_dir: "/etc/grafana/provisioning"

# Domain Configuration
grafana_domain: "metrics.{{ caddy_domain }}"
grafana_root_url: "https://{{ grafana_domain }}"

# Default admin (used only for initial setup)
grafana_admin_user: "admin"
grafana_admin_password: "{{ vault_grafana_admin_password }}"

# Disable registration (OAuth only)
grafana_allow_signup: false
grafana_disable_login_form: false  # Keep fallback login

# OAuth/OIDC Configuration (Authentik)
grafana_oauth_enabled: true
grafana_oauth_name: "Authentik"
grafana_oauth_client_id: "{{ vault_grafana_oauth_client_id }}"
grafana_oauth_client_secret: "{{ vault_grafana_oauth_client_secret }}"

# Authentik OAuth endpoints
grafana_oauth_auth_url: "https://{{ authentik_domain }}/application/o/authorize/"
grafana_oauth_token_url: "https://{{ authentik_domain }}/application/o/token/"
grafana_oauth_api_url: "https://{{ authentik_domain }}/application/o/userinfo/"

# OAuth role mapping
grafana_oauth_role_attribute_path: "(contains(groups, 'authentik Admins') || contains(groups, 'grafana-admins')) && 'Admin' || contains(groups, 'grafana-editors') && 'Editor' || 'Viewer'"
grafana_oauth_allow_sign_up: true  # Auto-create users from OAuth
grafana_oauth_scopes: "openid profile email groups"

# Data Source Configuration
grafana_datasource_vm_enabled: true
grafana_datasource_vm_url: "http://{{ victoriametrics_listen_address }}"
grafana_datasource_vm_name: "VictoriaMetrics"

# Security
grafana_systemd_security: true
grafana_cookie_secure: true
grafana_cookie_samesite: "lax"

# Database (SQLite by default)
grafana_database_type: "sqlite3"
grafana_database_path: "{{ grafana_data_dir }}/grafana.db"

# =================================================================
# Node Exporter Configuration
# =================================================================

# Service Management
node_exporter_service_enabled: true
node_exporter_service_state: "started"

# Version
node_exporter_version: "1.8.2"

# Network Security (localhost only)
node_exporter_listen_address: "127.0.0.1:9100"

# User/Group
node_exporter_user: "node_exporter"
node_exporter_group: "node_exporter"

# Enabled collectors
node_exporter_enabled_collectors:
  - cpu
  - diskstats
  - filesystem
  - loadavg
  - meminfo
  - netdev
  - netstat
  - stat
  - time
  - uname
  - vmstat
  - systemd

# Disabled collectors
node_exporter_disabled_collectors:
  - mdadm

# Filesystem collector configuration
node_exporter_filesystem_ignored_fs_types:
  - tmpfs
  - devtmpfs
  - devfs
  - iso9660
  - overlay
  - aufs
  - squashfs

node_exporter_filesystem_ignored_mount_points:
  - /var/lib/containers/storage/.*
  - /run/.*
  - /sys/.*
  - /proc/.*

# systemd security
node_exporter_systemd_security: true

# =================================================================
# Infrastructure Notes
# =================================================================
# Complete monitoring stack:
# - VictoriaMetrics: Time-series database (Prometheus-compatible)
# - Grafana: Visualization with Authentik OAuth integration
# - node_exporter: System metrics collection
#
# Role mapping via Authentik groups:
# - grafana-admins: Full admin access
# - grafana-editors: Can create/edit dashboards
# - Default: Viewer access
#
# All services run on localhost only, proxied via Caddy
23
roles/metrics/handlers/main.yml
Normal file
@@ -0,0 +1,23 @@
---
- name: restart victoriametrics
  ansible.builtin.systemd:
    name: victoriametrics
    state: restarted
    daemon_reload: true

- name: restart node_exporter
  ansible.builtin.systemd:
    name: node_exporter
    state: restarted
    daemon_reload: true

- name: restart grafana
  ansible.builtin.systemd:
    name: grafana
    state: restarted
    daemon_reload: true

- name: reload caddy
  ansible.builtin.systemd:
    name: caddy
    state: reloaded
3
roles/metrics/meta/main.yml
Normal file
@@ -0,0 +1,3 @@
---
dependencies:
  - role: caddy
9
roles/metrics/tasks/caddy.yml
Normal file
@@ -0,0 +1,9 @@
---
- name: Deploy Grafana Caddy configuration
  ansible.builtin.template:
    src: grafana.caddy.j2
    dest: /etc/caddy/sites-enabled/grafana.caddy
    owner: caddy
    group: caddy
    mode: '0644'
  notify: reload caddy
90
roles/metrics/tasks/grafana.yml
Normal file
@@ -0,0 +1,90 @@
---
- name: Create Grafana system user
  ansible.builtin.user:
    name: "{{ grafana_user }}"
    system: true
    create_home: false
    shell: /usr/sbin/nologin
    state: present

- name: Create Grafana directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ grafana_user }}"
    group: "{{ grafana_group }}"
    mode: '0755'
  loop:
    - "{{ grafana_data_dir }}"
    - "{{ grafana_logs_dir }}"
    - "{{ grafana_plugins_dir }}"
    - "{{ grafana_provisioning_dir }}"
    - "{{ grafana_provisioning_dir }}/datasources"
    - "{{ grafana_provisioning_dir }}/dashboards"
    - "{{ grafana_data_dir }}/dashboards"
    - /etc/grafana

- name: Download Grafana binary
  ansible.builtin.get_url:
    url: "https://dl.grafana.com/oss/release/grafana-{{ grafana_version }}.linux-amd64.tar.gz"
    dest: "/tmp/grafana-{{ grafana_version }}.tar.gz"
    mode: '0644'
  register: grafana_download

- name: Extract Grafana
  ansible.builtin.unarchive:
    src: "/tmp/grafana-{{ grafana_version }}.tar.gz"
    dest: /opt
    remote_src: true
    creates: "/opt/grafana-v{{ grafana_version }}"
  when: grafana_download.changed

- name: Create Grafana symlink
  ansible.builtin.file:
    src: "/opt/grafana-v{{ grafana_version }}"
    dest: /opt/grafana
    state: link

- name: Deploy Grafana configuration
  ansible.builtin.template:
    src: grafana.ini.j2
    dest: /etc/grafana/grafana.ini
    owner: "{{ grafana_user }}"
    group: "{{ grafana_group }}"
    mode: '0640'
  notify: restart grafana

- name: Deploy VictoriaMetrics datasource provisioning
  ansible.builtin.template:
    src: datasource-victoriametrics.yml.j2
    dest: "{{ grafana_provisioning_dir }}/datasources/victoriametrics.yml"
    owner: "{{ grafana_user }}"
    group: "{{ grafana_group }}"
    mode: '0644'
  notify: restart grafana
  when: grafana_datasource_vm_enabled

- name: Deploy dashboard provisioning
  ansible.builtin.template:
    src: dashboards.yml.j2
    dest: "{{ grafana_provisioning_dir }}/dashboards/default.yml"
    owner: "{{ grafana_user }}"
    group: "{{ grafana_group }}"
    mode: '0644'
  notify: restart grafana

- name: Deploy Grafana systemd service
  ansible.builtin.template:
    src: grafana.service.j2
    dest: /etc/systemd/system/grafana.service
    owner: root
    group: root
    mode: '0644'
  notify: restart grafana

- name: Enable and start Grafana service
  ansible.builtin.systemd:
    name: grafana
    enabled: "{{ grafana_service_enabled }}"
    state: "{{ grafana_service_state }}"
    daemon_reload: true
20
roles/metrics/tasks/main.yml
Normal file
@@ -0,0 +1,20 @@
---
# =================================================================
# Metrics Stack Deployment
# =================================================================

- name: Deploy VictoriaMetrics
  ansible.builtin.import_tasks: victoriametrics.yml  # static import so the tags reach the imported tasks
  tags: [metrics, victoriametrics]

- name: Deploy node_exporter
  ansible.builtin.import_tasks: node_exporter.yml
  tags: [metrics, node_exporter]

- name: Deploy Grafana
  ansible.builtin.import_tasks: grafana.yml
  tags: [metrics, grafana]

- name: Deploy Caddy configuration for Grafana
  ansible.builtin.import_tasks: caddy.yml
  tags: [metrics, caddy]
49
roles/metrics/tasks/node_exporter.yml
Normal file
@@ -0,0 +1,49 @@
---
- name: Create node_exporter system user
  ansible.builtin.user:
    name: "{{ node_exporter_user }}"
    system: true
    create_home: false
    shell: /usr/sbin/nologin
    state: present

- name: Download node_exporter binary
  ansible.builtin.get_url:
    url: "https://github.com/prometheus/node_exporter/releases/download/v{{ node_exporter_version }}/node_exporter-{{ node_exporter_version }}.linux-amd64.tar.gz"
    dest: "/tmp/node_exporter-{{ node_exporter_version }}.tar.gz"
    mode: '0644'
  register: node_exporter_download

- name: Extract node_exporter binary
  ansible.builtin.unarchive:
    src: "/tmp/node_exporter-{{ node_exporter_version }}.tar.gz"
    dest: /tmp
    remote_src: true
    creates: "/tmp/node_exporter-{{ node_exporter_version }}.linux-amd64"
  when: node_exporter_download.changed

- name: Copy node_exporter binary to /usr/local/bin
  ansible.builtin.copy:
    src: "/tmp/node_exporter-{{ node_exporter_version }}.linux-amd64/node_exporter"
    dest: /usr/local/bin/node_exporter
    owner: root
    group: root
    mode: '0755'
    remote_src: true
  when: node_exporter_download.changed

- name: Deploy node_exporter systemd service
  ansible.builtin.template:
    src: node_exporter.service.j2
    dest: /etc/systemd/system/node_exporter.service
    owner: root
    group: root
    mode: '0644'
  notify: restart node_exporter

- name: Enable and start node_exporter service
  ansible.builtin.systemd:
    name: node_exporter
    enabled: "{{ node_exporter_service_enabled }}"
    state: "{{ node_exporter_service_state }}"
    daemon_reload: true
66
roles/metrics/tasks/victoriametrics.yml
Normal file
@@ -0,0 +1,66 @@
---
- name: Create VictoriaMetrics system user
  ansible.builtin.user:
    name: "{{ victoriametrics_user }}"
    system: true
    create_home: false
    shell: /usr/sbin/nologin
    state: present

- name: Create VictoriaMetrics directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ victoriametrics_user }}"
    group: "{{ victoriametrics_group }}"
    mode: '0755'
  loop:
    - "{{ victoriametrics_data_dir }}"
    - "{{ victoriametrics_scrape_config_dir }}"

- name: Download VictoriaMetrics binary
  ansible.builtin.get_url:
    url: "https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v{{ victoriametrics_version }}/victoria-metrics-linux-amd64-v{{ victoriametrics_version }}.tar.gz"
    dest: "/tmp/victoria-metrics-v{{ victoriametrics_version }}.tar.gz"
    mode: '0644'
  register: victoriametrics_download

- name: Extract VictoriaMetrics binary
  ansible.builtin.unarchive:
    src: "/tmp/victoria-metrics-v{{ victoriametrics_version }}.tar.gz"
    dest: /usr/local/bin
    remote_src: true
    # No "creates" guard: an existing binary would block extraction on version upgrades
  when: victoriametrics_download.changed

- name: Set VictoriaMetrics binary permissions
  ansible.builtin.file:
    path: /usr/local/bin/victoria-metrics-prod
    owner: root
    group: root
    mode: '0755'

- name: Deploy VictoriaMetrics scrape configuration
  ansible.builtin.template:
    src: scrape.yml.j2
    dest: "{{ victoriametrics_scrape_config_file }}"
    owner: "{{ victoriametrics_user }}"
    group: "{{ victoriametrics_group }}"
    mode: '0644'
  notify: restart victoriametrics

- name: Deploy VictoriaMetrics systemd service
  ansible.builtin.template:
    src: victoriametrics.service.j2
    dest: /etc/systemd/system/victoriametrics.service
    owner: root
    group: root
    mode: '0644'
  notify: restart victoriametrics

- name: Enable and start VictoriaMetrics service
  ansible.builtin.systemd:
    name: victoriametrics
    enabled: "{{ victoriametrics_service_enabled }}"
    state: "{{ victoriametrics_service_state }}"
    daemon_reload: true
12
roles/metrics/templates/dashboards.yml.j2
Normal file
@@ -0,0 +1,12 @@
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: {{ grafana_data_dir }}/dashboards
12
roles/metrics/templates/datasource-victoriametrics.yml.j2
Normal file
@@ -0,0 +1,12 @@
apiVersion: 1

datasources:
  - name: {{ grafana_datasource_vm_name }}
    type: prometheus
    access: proxy
    url: {{ grafana_datasource_vm_url }}
    isDefault: true
    editable: true
    jsonData:
      httpMethod: POST
      timeInterval: 15s
26
roles/metrics/templates/grafana.caddy.j2
Normal file
@@ -0,0 +1,26 @@
# Grafana Metrics Dashboard
{{ grafana_domain }} {
    reverse_proxy http://{{ grafana_listen_address }}:{{ grafana_listen_port }} {
        header_up Host {host}
        header_up X-Real-IP {remote_host}
        header_up X-Forwarded-Proto https
        header_up X-Forwarded-For {remote_host}
        header_up X-Forwarded-Host {host}
    }

    # Security headers
    header {
        X-Frame-Options SAMEORIGIN
        X-Content-Type-Options nosniff
        X-XSS-Protection "1; mode=block"
        Referrer-Policy strict-origin-when-cross-origin
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
    }

    # Logging
    log {
        output file {{ caddy_log_dir }}/grafana.log
        level INFO
        format json
    }
}
68
roles/metrics/templates/grafana.ini.j2
Normal file
68
roles/metrics/templates/grafana.ini.j2
Normal file
@@ -0,0 +1,68 @@
# Grafana Configuration
# Managed by Ansible - DO NOT EDIT MANUALLY

[paths]
data = {{ grafana_data_dir }}
logs = {{ grafana_logs_dir }}
plugins = {{ grafana_plugins_dir }}
provisioning = {{ grafana_provisioning_dir }}

[server]
http_addr = {{ grafana_listen_address }}
http_port = {{ grafana_listen_port }}
domain = {{ grafana_domain }}
root_url = {{ grafana_root_url }}
enforce_domain = true
enable_gzip = true

[database]
type = {{ grafana_database_type }}
{% if grafana_database_type == 'sqlite3' %}
path = {{ grafana_database_path }}
{% endif %}

[security]
admin_user = {{ grafana_admin_user }}
admin_password = {{ grafana_admin_password }}
secret_key = {{ vault_grafana_secret_key }}
cookie_secure = {{ grafana_cookie_secure | lower }}
cookie_samesite = {{ grafana_cookie_samesite }}
disable_gravatar = true
disable_initial_admin_creation = false

[users]
allow_sign_up = {{ grafana_allow_signup | lower }}
allow_org_create = false
auto_assign_org = true
auto_assign_org_role = Viewer

[auth]
disable_login_form = {{ grafana_disable_login_form | lower }}
oauth_auto_login = false

{% if grafana_oauth_enabled %}
[auth.generic_oauth]
enabled = true
name = {{ grafana_oauth_name }}
client_id = {{ grafana_oauth_client_id }}
client_secret = {{ grafana_oauth_client_secret }}
scopes = {{ grafana_oauth_scopes }}
auth_url = {{ grafana_oauth_auth_url }}
token_url = {{ grafana_oauth_token_url }}
api_url = {{ grafana_oauth_api_url }}
allow_sign_up = {{ grafana_oauth_allow_sign_up | lower }}
role_attribute_path = {{ grafana_oauth_role_attribute_path }}
use_pkce = true
{% endif %}

[log]
mode = console
level = info

[analytics]
reporting_enabled = false
check_for_updates = false
check_for_plugin_updates = false

[snapshots]
external_enabled = false
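The value of `grafana_oauth_role_attribute_path` is not shown in this diff, so the exact JMESPath expression is unknown here. As a rough illustration only, the group-to-role mapping described in the commit message (authentik Admins or grafana-admins -> Admin, grafana-editors -> Editor, everyone else -> Viewer) could be restated in Python as below. The function name `grafana_role` is hypothetical and is not part of the role.

```python
def grafana_role(groups):
    """Map Authentik group names to a Grafana role.

    Mirrors the mapping stated in the commit message; the real logic
    lives in the JMESPath expression passed to role_attribute_path.
    """
    names = set(groups)
    if "authentik Admins" in names or "grafana-admins" in names:
        return "Admin"
    if "grafana-editors" in names:
        return "Editor"
    # Fallback matches auto_assign_org_role = Viewer in the [users] section.
    return "Viewer"


if __name__ == "__main__":
    for sample in (["authentik Admins"], ["grafana-editors"], ["some-other-group"], []):
        print(sample, "->", grafana_role(sample))
```

Admin is checked first, so a user who belongs to both an admin group and `grafana-editors` ends up as Admin.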
36
roles/metrics/templates/grafana.service.j2
Normal file
@@ -0,0 +1,36 @@
[Unit]
Description=Grafana visualization platform
Documentation=https://grafana.com/docs/
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User={{ grafana_user }}
Group={{ grafana_group }}

WorkingDirectory=/opt/grafana
ExecStart=/opt/grafana/bin/grafana-server \
    --config=/etc/grafana/grafana.ini \
    --homepath=/opt/grafana

Restart=on-failure
RestartSec=5s

# Security hardening
{% if grafana_systemd_security %}
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths={{ grafana_data_dir }} {{ grafana_logs_dir }}
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
LockPersonality=true
{% endif %}

[Install]
WantedBy=multi-user.target
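The unit above starts Grafana with `/etc/grafana/grafana.ini`, so a quick check of the rendered file before a restart may catch a binding or cookie mistake early. The sketch below is one possible check using only the Python standard library. The sample text and the function name `check_grafana_ini` are assumptions for illustration: the address, port, and domain come from the commit message, and the `root_url` value is a guess.

```python
import configparser

# Hypothetical rendered output; real values come from the role's variables.
SAMPLE_RENDERED = """\
[server]
http_addr = 127.0.0.1
http_port = 3001
domain = metrics.jnss.me
root_url = https://metrics.jnss.me/

[security]
cookie_secure = true

[auth.generic_oauth]
enabled = true
use_pkce = true
"""


def check_grafana_ini(text):
    """Return a list of problems found in a rendered grafana.ini."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = []
    if parser.get("server", "http_addr", fallback="") != "127.0.0.1":
        problems.append("server.http_addr is not bound to 127.0.0.1")
    if not parser.get("server", "root_url", fallback="").startswith("https://"):
        problems.append("server.root_url is not HTTPS")
    if parser.get("security", "cookie_secure", fallback="false") != "true":
        problems.append("security.cookie_secure is not true")
    if parser.has_section("auth.generic_oauth"):
        if parser.get("auth.generic_oauth", "use_pkce", fallback="false") != "true":
            problems.append("auth.generic_oauth.use_pkce is not true")
    return problems


if __name__ == "__main__":
    print(check_grafana_ini(SAMPLE_RENDERED) or "no problems found")
```

The checks only cover settings named in this commit (localhost binding, HTTPS via Caddy, secure cookies, PKCE); anything else would need to be added per deployment.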
42
roles/metrics/templates/node_exporter.service.j2
Normal file
@@ -0,0 +1,42 @@
[Unit]
Description=Prometheus Node Exporter
Documentation=https://github.com/prometheus/node_exporter
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User={{ node_exporter_user }}
Group={{ node_exporter_group }}

ExecStart=/usr/local/bin/node_exporter \
    --web.listen-address={{ node_exporter_listen_address }} \
{% for collector in node_exporter_enabled_collectors %}
    --collector.{{ collector }} \
{% endfor %}
{% for collector in node_exporter_disabled_collectors %}
    --no-collector.{{ collector }} \
{% endfor %}
    --collector.filesystem.fs-types-exclude="{{ node_exporter_filesystem_ignored_fs_types | join('|') }}" \
    --collector.filesystem.mount-points-exclude="{{ node_exporter_filesystem_ignored_mount_points | join('|') }}"

Restart=on-failure
RestartSec=5s

# Security hardening
{% if node_exporter_systemd_security %}
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
LockPersonality=true
ReadOnlyPaths=/
{% endif %}

[Install]
WantedBy=multi-user.target
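To see what the `ExecStart` loops and `join('|')` filters in this template expand to, the same assembly can be approximated in Python. This is a sketch under assumptions: the function name `node_exporter_args` is hypothetical, and the sample collector names, filesystem types, mount points, and the `127.0.0.1:9100` address are illustrative values rather than the role's defaults.

```python
def node_exporter_args(listen_address, enabled, disabled,
                       ignored_fs_types, ignored_mount_points):
    """Build the node_exporter argument list the way the template does."""
    args = [f"--web.listen-address={listen_address}"]
    # One --collector.<name> flag per enabled collector.
    args += [f"--collector.{name}" for name in enabled]
    # One --no-collector.<name> flag per disabled collector.
    args += [f"--no-collector.{name}" for name in disabled]
    # join('|') turns each list into a regular-expression alternation.
    args.append("--collector.filesystem.fs-types-exclude="
                + "|".join(ignored_fs_types))
    args.append("--collector.filesystem.mount-points-exclude="
                + "|".join(ignored_mount_points))
    return args


if __name__ == "__main__":
    for arg in node_exporter_args(
        "127.0.0.1:9100",        # assumed listen address
        ["systemd"],             # illustrative enabled collector
        ["wifi"],                # illustrative disabled collector
        ["tmpfs", "proc"],       # illustrative filesystem types
        ["/proc", "/sys"],       # illustrative mount points
    ):
        print(arg)
```

Both exclude flags are treated as regular expressions, and the joined alternation is not anchored, so an entry such as `proc` may also match longer names that contain it. Wrapping the joined value as `^(...)$` is one way to restrict matches to whole names, if that is the intent.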
22
roles/metrics/templates/scrape.yml.j2
Normal file
@@ -0,0 +1,22 @@
global:
  scrape_interval: {{ victoriametrics_scrape_interval }}
  scrape_timeout: {{ victoriametrics_scrape_timeout }}
  external_labels:
    environment: '{{ "homelab" if inventory_hostname in groups["homelab"] else "production" }}'
    host: '{{ inventory_hostname }}'

scrape_configs:
  # VictoriaMetrics self-monitoring
  - job_name: 'victoriametrics'
    static_configs:
      - targets: ['{{ victoriametrics_listen_address }}']
        labels:
          service: 'victoriametrics'

  # Node exporter for system metrics
  - job_name: 'node'
    static_configs:
      - targets: ['{{ node_exporter_listen_address }}']
        labels:
          service: 'node_exporter'
          instance: '{{ inventory_hostname }}'
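The `environment` external label is chosen from inventory group membership. The Python sketch below restates that expression so its behaviour is easy to check. The function name `environment_label` and the host names are hypothetical. The sketch also defaults to an empty list when no `homelab` group exists, which the template as written does not do; with an inventory that lacks that group, the `groups["homelab"]` lookup would likely fail at render time.

```python
def environment_label(inventory_hostname, groups):
    """Return 'homelab' for hosts in the homelab group, else 'production'."""
    # .get(..., []) guards against an inventory with no 'homelab' group.
    homelab_hosts = groups.get("homelab", [])
    return "homelab" if inventory_hostname in homelab_hosts else "production"


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    inventory_groups = {"homelab": ["host-a"], "production": ["host-b"]}
    for host in ("host-a", "host-b"):
        print(host, "->", environment_label(host, inventory_groups))
```

In the template, a comparable guard could be written as `groups.get("homelab", [])`; whether that is needed depends on how the inventory is laid out.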
41
roles/metrics/templates/victoriametrics.service.j2
Normal file
@@ -0,0 +1,41 @@
[Unit]
Description=VictoriaMetrics time-series database
Documentation=https://docs.victoriametrics.com/
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User={{ victoriametrics_user }}
Group={{ victoriametrics_group }}

ExecStart=/usr/local/bin/victoria-metrics-prod \
    -storageDataPath={{ victoriametrics_data_dir }} \
    -retentionPeriod={{ victoriametrics_retention_period }} \
    -httpListenAddr={{ victoriametrics_listen_address }} \
    -promscrape.config={{ victoriametrics_scrape_config_file }} \
    -memory.allowedPercent={{ victoriametrics_memory_allowed_percent }} \
    -storage.minFreeDiskSpaceBytes={{ victoriametrics_storage_min_free_disk_space_bytes }}

ExecReload=/bin/kill -HUP $MAINPID

Restart=on-failure
RestartSec=5s

# Security hardening
{% if victoriametrics_systemd_security %}
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths={{ victoriametrics_data_dir }}
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
LockPersonality=true
{% endif %}

[Install]
WantedBy=multi-user.target