Observability & Monitoring

Set up and use the Beeacon observability stack — Grafana dashboards, Loki log aggregation, and Promtail log collection with built-in PII sanitization.

Beeacon Observability Stack

STING includes a built-in observability stack called Beeacon — named after the bee navigation concept. It provides centralized log aggregation, real-time dashboards, and privacy-aware log collection for your entire STING deployment.

Architecture

The Beeacon stack consists of four components, all running as Docker containers alongside the main STING services:

| Component | Image | Purpose |
|---|---|---|
| Loki | grafana/loki:3.0.0 | Log aggregation and storage engine |
| Promtail | Custom (based on grafana/promtail:3.0.0) | Log collector with PII sanitization pipeline |
| Grafana | grafana/grafana:11.0.0 | Dashboard visualization and querying |
| Log Forwarder | alpine:3.18 | Streams container logs to files for Promtail |

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  STING App  │    │   Kratos    │    │  Knowledge  │
│  Chatbot    │    │   Vault     │    │  ChromaDB   │
└──────┬──────┘    └──────┬──────┘    └──────┬──────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │ Docker logs
                   ┌──────▼──────┐
                   │  Promtail   │  ← PII sanitization
                   │  (collect)  │
                   └──────┬──────┘
                          │ Push
                   ┌──────▼──────┐
                   │    Loki     │  ← 7-day retention
                   │  (storage)  │
                   └──────┬──────┘
                          │ Query
                   ┌──────▼──────┐
                   │   Grafana   │  ← 4 dashboards
                   │ (visualize) │
                   └─────────────┘

Enabling the Stack

Beeacon is disabled by default and can be enabled in config.yml:

monitoring:
  observability:
    enabled: true
    grafana:
      enabled: true
    loki:
      enabled: true
    promtail:
      enabled: true

Then regenerate your environment and start the services:

sudo msting regenerate-env
sudo msting start loki
# Wait for Loki to become healthy, then:
sudo msting start promtail grafana log-forwarder
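
The "wait for Loki to become healthy" step can be scripted. A minimal sketch in Python that polls Loki's standard `/ready` endpoint (the base URL and timeout here are assumptions for a default local install):

```python
import time
import urllib.request

def wait_for_loki(url="http://localhost:3100/ready", timeout=120):
    """Poll Loki's readiness endpoint until it responds with HTTP 200."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # Loki not up yet; keep polling
        time.sleep(3)
    return False
```

Run this between the two `msting start` commands and proceed only when it returns `True`.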

Accessing Grafana

Grafana is exposed on port 3001 by default. If you run a reverse proxy (recommended), serve Grafana at a sub-path:

| Access Method | URL |
|---|---|
| Direct (internal) | http://localhost:3001/grafana/ |
| Via reverse proxy | https://your-domain.com/grafana/ |

Anonymous Viewer Access

By default, Grafana allows anonymous read-only access — visitors can view dashboards without logging in. This is ideal for demos and shared monitoring. Admin operations (editing dashboards, managing data sources) require authentication.

Nginx Reverse Proxy

Add this to your nginx configuration to expose Grafana at /grafana/:

location /grafana/ {
    proxy_pass http://127.0.0.1:3001/grafana/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    # WebSocket support for Grafana Live
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}

Pre-Built Dashboards

Beeacon ships with four dashboards, automatically provisioned in the HIVE folder:

🐝 STING System Overview

The primary operations dashboard showing platform-wide health at a glance.

  • Service Activity Table — every STING service with event counts and tier classification (application, infrastructure, auth, AI, workers)
  • Total Events / Error Rate / Warning Rate — stat panels with color-coded thresholds
  • Log Volume by Service — stacked bar chart showing which services are most active
  • Log Volume by Level — color-coded (green=INFO, yellow=WARNING, red=ERROR)
  • Tier Distribution — donut chart breaking down log volume by service tier
  • Error Log Stream — real-time filtered view of ERROR/CRITICAL/FATAL entries
  • All Logs — full log explorer with service and level filters

🤖 Bee AI & Reports

Monitors the AI pipeline: chatbot interactions, LLM gateway traffic, and report generation.

  • Bee Chat / AI Gateway / Reports / AI Errors — stat panels per service
  • AI Service Activity — per-service timeseries (Chatbot, AI Gateway, LLM Proxy, Demo AI)
  • Report Worker Pipeline — job lifecycle tracking (started, completed, failed)
  • Chatbot Logs / AI Gateway Logs — split log panels for debugging

🔒 Authentication & Security

Tracks authentication events and PII compliance.

  • Auth Events / Failed Attempts / PII Detections / Log Redactions — stat panels
  • Authentication Events Over Time — login requests, registrations, errors, all Kratos events
  • Security & PII Events — PII scans, compliance checks, log redactions, app errors
  • Kratos Auth Logs / Security Event Logs — filtered log streams

📚 Knowledge Service

Monitors Honey Jar operations and vector store activity.

  • Knowledge Events / Uploads / Searches / ChromaDB Events — stat panels
  • Knowledge Service Activity — uploads, searches, sync/embedding operations over time
  • Vector Store Activity — ChromaDB event volume and errors
  • Knowledge Logs / ChromaDB Logs — service-specific log streams

PII Sanitization in Logs

A key differentiator of Beeacon is automatic PII redaction before logs are stored. Promtail’s pipeline sanitizes the following patterns:

| Pattern | Replacement |
|---|---|
| Email addresses | [EMAIL_REDACTED] |
| Phone numbers | [PHONE_REDACTED] |
| SSN patterns | [SSN_REDACTED] |
| Credit card numbers | [CC_REDACTED] |
| API keys (sk_...) | [API_KEY_REDACTED] |
| Bearer tokens | Bearer [TOKEN_REDACTED] |
| Passwords in logs | [PASSWORD_REDACTED] |

This ensures that even if application code accidentally logs sensitive data, it never reaches persistent storage.
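
To illustrate the technique, here is a minimal Python sketch of regex-based redaction of this kind. The patterns are approximations for illustration only; the authoritative rules live in Promtail's pipeline configuration:

```python
import re

# Hypothetical regexes approximating the redaction patterns above.
# The real rules are defined in observability/promtail/config/promtail.yml.
PII_RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL_REDACTED]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN_REDACTED]"),
    (re.compile(r"\bsk_[A-Za-z0-9_]+"), "[API_KEY_REDACTED]"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [TOKEN_REDACTED]"),
]

def sanitize(line: str) -> str:
    """Apply each redaction rule in order to a single log line."""
    for pattern, replacement in PII_RULES:
        line = pattern.sub(replacement, line)
    return line
```

Promtail applies equivalent substitutions per log line before pushing to Loki, so the raw values never leave the collector.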

Log Labels and Querying

Promtail automatically labels every log entry with metadata from Docker:

| Label | Description | Example Values |
|---|---|---|
| service | Docker Compose service name | app, chatbot, kratos, knowledge |
| container | Docker container name | sting-ce-app, sting-ce-chatbot |
| tier | Service category | application, infrastructure, auth, ai, workers |
| project | Compose project | sting-ce |
| level | Log level (when parseable) | INFO, WARNING, ERROR, CRITICAL |

Example LogQL Queries

# All errors across the platform
{project="sting-ce", level=~"ERROR|CRITICAL"}

# Chatbot activity
{service="chatbot"}

# Authentication failures
{service="kratos"} |~ "level=error|failed|denied"

# Report generation events
{service="report-worker"} |~ "processed|completed|Processing"

# PII-related events in the app
{service="app"} |~ "pii|compliance"

# All logs from a specific tier
{tier="ai"}
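
These queries can also be run programmatically against Loki's HTTP API (`/loki/api/v1/query_range` is Loki's standard query endpoint; the base URL is an assumption for a default install). A minimal URL-builder sketch:

```python
from urllib.parse import urlencode

def loki_query_url(base: str, query: str, limit: int = 100) -> str:
    """Build a query_range URL for a LogQL expression against Loki's HTTP API."""
    params = urlencode({"query": query, "limit": limit})
    return f"{base}/loki/api/v1/query_range?{params}"

# Example: recent chatbot errors
url = loki_query_url("http://localhost:3100", '{service="chatbot", level="ERROR"}')
```

Pass the resulting URL to `curl` or `urllib.request.urlopen` to retrieve matching log streams as JSON.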

Resource Usage

The Beeacon stack is designed to be lightweight:

| Component | Memory Limit | CPU Limit | Typical Usage |
|---|---|---|---|
| Loki | 512 MB | 0.5 cores | ~100-200 MB |
| Promtail | 256 MB | 0.25 cores | ~60-80 MB |
| Grafana | 512 MB | 0.5 cores | ~100-150 MB |
| Log Forwarder | 256 MB | 0.1 cores | ~10-20 MB |
| Total | 1.5 GB | 1.35 cores | ~300-450 MB |

Configuration Reference

Loki

Stored at observability/loki/config/loki.yml:

  • Retention: 7 days (configurable via limits_config.retention_period)
  • Storage: Local filesystem at /loki/chunks
  • Schema: TSDB v13 with 24h index periods
  • Rate limits: 4 MB/s ingestion, 6 MB/s burst, 256 KB max line size
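
The settings above map to standard Loki options; a sketch of the relevant loki.yml fragment (key names are real Loki configuration options, values mirror the defaults listed above — note that `retention_period` only takes effect when the compactor has retention enabled):

```yaml
limits_config:
  retention_period: 168h       # 7 days
  ingestion_rate_mb: 4
  ingestion_burst_size_mb: 6
  max_line_size: 256KB

compactor:
  retention_enabled: true      # required for retention_period to apply
```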

Promtail

Stored at observability/promtail/config/promtail.yml:

  • Collection: Docker socket discovery, auto-discovers all sting-ce containers
  • Pipeline: JSON parsing → level extraction → PII sanitization → health check filtering
  • Refresh: 15-second container discovery interval
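
For orientation, the pipeline described above corresponds to a Promtail `pipeline_stages` chain. An illustrative sketch (the stage names — `json`, `labels`, `replace`, `drop` — are standard Promtail stages; the expressions here are simplified placeholders, not the shipped rules):

```yaml
pipeline_stages:
  - json:
      expressions:
        level: level             # extract log level from JSON logs
  - labels:
      level:                     # promote the level to a Loki label
  - replace:
      expression: '([\w.+-]+@[\w-]+\.[\w.-]+)'
      replace: '[EMAIL_REDACTED]'
  - drop:
      expression: '.*GET /health.*'   # filter health-check noise
```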

Grafana

Stored at observability/grafana/config/grafana.ini:

  • Sub-path: Served at /grafana/ for reverse proxy compatibility
  • Auth: Anonymous viewer access enabled, admin login available
  • Provisioning: Dashboards and Loki datasource auto-provisioned from files
  • Security: Embedding allowed, HSTS enabled, analytics/telemetry disabled

Troubleshooting

Services not appearing in dashboards

Promtail discovers containers via Docker socket. Verify it can access the socket:

sudo docker logs sting-ce-promtail 2>&1 | tail -20

Loki showing “too many outstanding requests”

Reduce query parallelism or increase limits in loki.yml:

limits_config:
  max_query_parallelism: 4
  max_query_series: 10000

Grafana shows “No Data”

  1. Verify Loki has data: curl -s http://localhost:3100/loki/api/v1/labels
  2. Check the time range — ensure it covers the period when logs were collected
  3. Verify the dashboard’s datasource is pointing to Loki

Checking log flow

# Verify Loki is receiving data
curl -s http://localhost:3100/loki/api/v1/label/service/values

# Check Promtail targets
curl -s http://localhost:9080/targets
