feat(observability): add VM-based observability stack #39

Merged: JavierGi merged 1 commit into main from feat/observability-dashboards on May 22, 2026

Conversation

@JavierGi
Contributor

Summary

  • Adds vm-observability/ under charts/countly-observability/ with the full Docker Compose stack for running the observability backend on a standalone VM
  • Includes Caddy (TLS termination + reverse proxy), Prometheus, Loki, Tempo, Pyroscope, and Grafana services with all their config files
  • Google OAuth credentials are now read from GF_AUTH_GOOGLE_CLIENT_ID / GF_AUTH_GOOGLE_CLIENT_SECRET env vars (set these in a .env file on the VM)
  • The source of truth for these files moves here from the countly-deployment repo (they are removed there in a companion commit)
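
As a minimal sketch of the `.env` step mentioned in the summary: Docker Compose reads a `.env` file from the project directory and uses it to interpolate `${...}` references in the Compose file. The helper name `write_env_file` is hypothetical, the credential values are placeholders, and the owner-only file mode is an assumption about how the secret should be protected, not something this PR specifies.

```shell
# write_env_file DIR CLIENT_ID CLIENT_SECRET
# Writes DIR/.env with the two variables the Compose file interpolates.
# (Hypothetical helper; not part of this PR.)
write_env_file() {
  local dir="$1" client_id="$2" client_secret="$3"
  local env_file="${dir}/.env"

  # Create or truncate the file, then restrict it to the owner
  # before any secret is written into it.
  : > "${env_file}" || return 1
  chmod 600 "${env_file}" || return 1

  printf 'GF_AUTH_GOOGLE_CLIENT_ID=%s\n' "${client_id}" >> "${env_file}"
  printf 'GF_AUTH_GOOGLE_CLIENT_SECRET=%s\n' "${client_secret}" >> "${env_file}"
}
```

On the VM this might be called as `write_env_file /opt/observability "<client-id>" "<client-secret>"`, assuming the stack is deployed to that directory.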

Docker Compose stack with Caddy, Prometheus, Loki, Tempo, Pyroscope,
and Grafana for running the observability stack on a standalone VM
instead of Kubernetes, including all service configs and setup script.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@JavierGi merged commit 56b22f4 into main on May 22, 2026
5 checks passed

Copilot AI left a comment


Pull request overview

Adds a VM-deployable “observability backend” stack (Docker Compose + configs + bootstrap script) under charts/countly-observability/vm-observability/, intended to run Prometheus/Loki/Tempo/Pyroscope/Grafana behind Caddy with TLS and accept telemetry from external agents.

Changes:

  • Added setup.sh to provision a data disk, install Docker, copy uploaded stack files, and start the Compose stack.
  • Added a Docker Compose definition wiring Caddy, Prometheus, Loki, Tempo, Pyroscope, and Grafana (including Google OAuth via env vars).
  • Added service configuration files and Grafana provisioning for datasources/dashboards.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| charts/countly-observability/vm-observability/setup.sh | VM bootstrap script for disk/data directory setup and stack startup |
| charts/countly-observability/vm-observability/docker-compose.yml | Compose stack definition for observability services |
| charts/countly-observability/vm-observability/configs/Caddyfile | TLS + reverse proxy routing for Grafana and ingest endpoints |
| charts/countly-observability/vm-observability/configs/prometheus.yml | Prometheus configured as remote_write receiver |
| charts/countly-observability/vm-observability/configs/loki.yaml | Loki single-node filesystem config |
| charts/countly-observability/vm-observability/configs/tempo.yaml | Tempo OTLP ingest + local storage + remote_write for generated metrics |
| charts/countly-observability/vm-observability/configs/pyroscope.yaml | Pyroscope local storage + HTTP config |
| charts/countly-observability/vm-observability/configs/grafana/provisioning/datasources/datasources.yaml | Grafana datasource provisioning (Prometheus/Loki/Tempo/Pyroscope) |
| charts/countly-observability/vm-observability/configs/grafana/provisioning/dashboards/provider.yaml | Grafana dashboard provider configuration |


Comment on lines +5 to +14
if ! mountpoint -q /data; then
  sudo mkfs.ext4 -F /dev/sdb
  sudo mkdir -p /data
  sudo mount /dev/sdb /data
  echo '/dev/sdb /data ext4 defaults 0 2' | sudo tee -a /etc/fstab
  echo "Disk mounted at /data"
else
  echo "Already mounted"
fi
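
The excerpt above runs `mkfs.ext4 -F /dev/sdb` whenever /data is not a mount point, and appends a line to /etc/fstab every time that branch runs. A more defensive variant could look like the sketch below: it formats only when `blkid` reports no filesystem on the device, and it adds the fstab entry only once. The helper names `has_filesystem` and `ensure_fstab_entry` are hypothetical and not part of this PR.

```shell
# has_filesystem DEVICE
# Succeeds if blkid reports a filesystem type on DEVICE.
# (Hypothetical helper; blkid prints nothing when no TYPE is found.)
has_filesystem() {
  [ -n "$(sudo blkid -o value -s TYPE "$1" 2>/dev/null)" ]
}

# ensure_fstab_entry FSTAB_FILE ENTRY MOUNTPOINT
# Appends ENTRY only if no uncommented line in FSTAB_FILE already
# uses MOUNTPOINT as its second field. Needs write access to
# FSTAB_FILE, so run it as root when targeting /etc/fstab.
ensure_fstab_entry() {
  local fstab="$1" entry="$2" mountpoint="$3"
  if awk -v mp="${mountpoint}" \
      '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' \
      "${fstab}"; then
    return 0
  fi
  printf '%s\n' "${entry}" >> "${fstab}"
}
```

In the setup script these might be used as `has_filesystem /dev/sdb || sudo mkfs.ext4 /dev/sdb`, followed by `ensure_fstab_entry /etc/fstab '/dev/sdb /data ext4 defaults 0 2' /data`. Dropping `-F` is a deliberate choice in this sketch so that `mkfs.ext4` does not silently overwrite a device that is already in use.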

Comment on lines +38 to +39
sudo mkdir -p /opt/observability
sudo cp -r /tmp/obs-upload/* /opt/observability/
/data/pyroscope
sudo chown -R 10001:10001 /data/loki
sudo chown -R 472:472 /data/grafana
sudo chown -R 65534:65534 /data/prometheus
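
One detail of the `cp -r /tmp/obs-upload/*` step quoted above: in a default shell, `*` does not match names that start with a dot, so a `.env` file placed in the upload directory would not be copied. A minimal sketch of a copy that includes dotfiles follows; `copy_stack` is a hypothetical helper name, and whether the `.env` file is meant to travel with the upload is an assumption.

```shell
# copy_stack SRC DST
# Copies everything inside SRC, dotfiles included, into DST.
# (Hypothetical helper; not part of this PR.)
copy_stack() {
  local src="$1" dst="$2"
  mkdir -p "${dst}" || return 1
  # The trailing "/." makes cp copy the directory's contents rather
  # than the directory itself, without relying on a glob.
  cp -R "${src}/." "${dst}/"
}
```

In the setup script this would need the same `sudo` prefix as the original commands, for example by running the `mkdir` and `cp` lines under `sudo` directly.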
Comment on lines +70 to +92
  grafana:
    image: grafana/grafana:12.1.0
    restart: unless-stopped
    environment:
      GF_SERVER_DOMAIN: obs-newarch.count.ly
      GF_SERVER_ROOT_URL: https://obs-newarch.count.ly
      GF_AUTH_GOOGLE_ENABLED: "true"
      GF_AUTH_GOOGLE_CLIENT_ID: "${GF_AUTH_GOOGLE_CLIENT_ID}"
      GF_AUTH_GOOGLE_CLIENT_SECRET: "${GF_AUTH_GOOGLE_CLIENT_SECRET}"
      GF_AUTH_GOOGLE_SCOPES: "https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email"
      GF_AUTH_GOOGLE_AUTH_URL: "https://accounts.google.com/o/oauth2/v2/auth"
      GF_AUTH_GOOGLE_TOKEN_URL: "https://oauth2.googleapis.com/token"
      GF_AUTH_GOOGLE_API_URL: "https://openidconnect.googleapis.com/v1/userinfo"
      GF_AUTH_GOOGLE_USE_PKCE: "true"
      GF_INSTALL_PLUGINS: "grafana-pyroscope-datasource"
      GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS: "grafana-pyroscope-app,grafana-lokiexplore-app,grafana-exploretraces-app"
      GF_USERS_ALLOW_SIGN_UP: "false"
      GF_USERS_ALLOW_ORG_CREATE: "false"
      GF_USERS_AUTO_ASSIGN_ORG_ROLE: "Editor"
      GF_AUTH_ANONYMOUS_ENABLED: "false"
      GF_SECURITY_ALLOW_EMBEDDING: "true"
      GF_FEATURE_TOGGLES_ENABLE: "tempoSearch,tempoBackendSearch,traceqlEditor,exploreTraces"
    volumes:
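
Because the Grafana service above takes its OAuth credentials from `${GF_AUTH_GOOGLE_CLIENT_ID}` and `${GF_AUTH_GOOGLE_CLIENT_SECRET}`, an unset variable is interpolated by Compose as an empty string (with a warning) and the stack still starts. A preflight check before `docker compose up` could catch that earlier. The sketch below is one possible shape; `missing_vars` is a hypothetical helper, and it only inspects a `.env` file, not variables exported in the calling shell.

```shell
# missing_vars ENV_FILE NAME...
# Prints each NAME that has no non-empty NAME=value line in ENV_FILE
# and returns 1 if any is missing. A quoted empty value such as
# NAME="" would still pass; this is only a sketch.
missing_vars() {
  local env_file="$1" name status=0
  shift
  for name in "$@"; do
    # Match NAME=<at least one character> at the start of a line.
    if ! grep -Eq "^${name}=.+" "${env_file}" 2>/dev/null; then
      printf '%s\n' "${name}"
      status=1
    fi
  done
  return "${status}"
}
```

A setup script might call `missing_vars .env GF_AUTH_GOOGLE_CLIENT_ID GF_AUTH_GOOGLE_CLIENT_SECRET || exit 1` from the stack directory before starting the containers.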
Comment on lines +54 to +55
- "4317:4317" # OTLP gRPC (plain, no TLS — matches cluster alloy config)
- "4318:4318" # OTLP HTTP
Comment on lines +36 to +72
  loki:
    image: grafana/loki:3.5.3
    restart: unless-stopped
    command: -config.file=/etc/loki/loki.yaml
    volumes:
      - ./configs/loki.yaml:/etc/loki/loki.yaml:ro
      - /data/loki:/data/loki
    expose:
      - "3100"

  tempo:
    image: grafana/tempo:2.8.1
    restart: unless-stopped
    command: -config.file=/etc/tempo/tempo.yaml
    volumes:
      - ./configs/tempo.yaml:/etc/tempo/tempo.yaml:ro
      - /data/tempo:/data/tempo
    ports:
      - "4317:4317" # OTLP gRPC (plain, no TLS — matches cluster alloy config)
      - "4318:4318" # OTLP HTTP
    expose:
      - "3200"

  pyroscope:
    image: grafana/pyroscope:1.16.0
    restart: unless-stopped
    command:
      - -config.file=/etc/pyroscope/pyroscope.yaml
    volumes:
      - ./configs/pyroscope.yaml:/etc/pyroscope/pyroscope.yaml:ro
      - /data/pyroscope:/data/pyroscope
    expose:
      - "4040"

  grafana:
    image: grafana/grafana:12.1.0
    restart: unless-stopped
Comment on lines +3 to +22
    # Prometheus remote write (from cluster Alloy agents)
    handle /api/v1/write {
        reverse_proxy prometheus:9090
    }

    # Loki push + query
    handle /loki/* {
        reverse_proxy loki:3100
    }

    # Pyroscope ingest
    handle /ingest* {
        reverse_proxy pyroscope:4040
    }
    handle /push.v1.* {
        reverse_proxy pyroscope:4040
    }
    handle /querier.v1.* {
        reverse_proxy pyroscope:4040
    }
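
To make the routing in the excerpt above easier to reason about, the sketch below models the five `handle` matchers as a shell `case` statement and reports which backend a given request path would reach. It is an approximation, not Caddy's matcher: shell patterns are case-sensitive and the function ignores query strings. `upstream_for_path` is a hypothetical name, and `unmatched` stands for whatever the rest of the Caddyfile (not shown in the excerpt) does with other paths.

```shell
# upstream_for_path REQUEST_PATH
# Prints the backend that the excerpted handle blocks would select.
# (Hypothetical helper modelling the Caddyfile excerpt only.)
upstream_for_path() {
  case "$1" in
    /api/v1/write) echo "prometheus:9090" ;;  # exact path only
    /loki/*)       echo "loki:3100" ;;        # push and query APIs
    /ingest*)      echo "pyroscope:4040" ;;
    /push.v1.*)    echo "pyroscope:4040" ;;
    /querier.v1.*) echo "pyroscope:4040" ;;
    *)             echo "unmatched" ;;        # handled outside the excerpt
  esac
}
```

For example, `upstream_for_path /loki/api/v1/push` selects Loki, while `upstream_for_path /api/v1/query` falls through, since only the Prometheus write path is forwarded by these blocks. The `/loki/*` and `/querier.v1.*` prefixes also cover read endpoints, which is worth keeping in mind when deciding what should be reachable from outside the VM.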
3 participants