Observability & Monitoring
Optional Component
The Beeacon observability stack is an optional add-on and is disabled by default. Availability may vary by edition; check your deployment’s config.yml to enable it.
Beeacon Observability Stack
STING includes a built-in observability stack called Beeacon — named after the bee navigation concept. It provides centralized log aggregation, real-time dashboards, and privacy-aware log collection for your entire STING deployment.
Architecture
The Beeacon stack consists of four components, all running as Docker containers alongside the main STING services:
| Component | Image | Purpose |
|---|---|---|
| Loki | grafana/loki:3.0.0 | Log aggregation and storage engine |
| Promtail | Custom (based on grafana/promtail:3.0.0) | Log collector with PII sanitization pipeline |
| Grafana | grafana/grafana:11.0.0 | Dashboard visualization and querying |
| Log Forwarder | alpine:3.18 | Streams container logs to files for Promtail |
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  STING App  │    │   Kratos    │    │  Knowledge  │
│  Chatbot    │    │   Vault     │    │  ChromaDB   │
└──────┬──────┘    └──────┬──────┘    └──────┬──────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │ Docker logs
                   ┌──────▼──────┐
                   │  Promtail   │ ← PII sanitization
                   │  (collect)  │
                   └──────┬──────┘
                          │ Push
                   ┌──────▼──────┐
                   │    Loki     │ ← 7-day retention
                   │  (storage)  │
                   └──────┬──────┘
                          │ Query
                   ┌──────▼──────┐
                   │   Grafana   │ ← 4 dashboards
                   │ (visualize) │
                   └─────────────┘
Enabling the Stack
Beeacon is disabled by default and can be enabled in config.yml:
monitoring:
  observability:
    enabled: true
    grafana:
      enabled: true
    loki:
      enabled: true
    promtail:
      enabled: true
Then regenerate your environment and start the services:
sudo msting regenerate-env
sudo msting start loki
# Wait for Loki to become healthy, then:
sudo msting start promtail grafana log-forwarder
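The "wait for Loki to become healthy" step can be scripted rather than checked by hand. The following is a minimal sketch, assuming Loki's HTTP port is reachable at localhost:3100 (the port used by the troubleshooting commands on this page) and relying on Loki's /ready endpoint. The function names, attempt count, and delay are illustrative choices and are not part of msting.

```python
import time
import urllib.error
import urllib.request

# Assumption: Loki's HTTP API is reachable on localhost:3100 from this host.
LOKI_READY_URL = "http://localhost:3100/ready"


def loki_is_ready(url=LOKI_READY_URL, timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until(check, attempts=30, delay=2.0, sleep=time.sleep):
    """Call check() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until(loki_is_ready) between the two msting start commands returns True as soon as Loki reports ready, or False after roughly a minute with the defaults shown here.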
Accessing Grafana
Grafana is exposed on port 3001 by default. If you have a reverse proxy (recommended), configure it at a sub-path:
| Access Method | URL |
|---|---|
| Direct (internal) | http://localhost:3001/grafana/ |
| Via reverse proxy | https://your-domain.com/grafana/ |
Anonymous Viewer Access
By default, Grafana allows anonymous read-only access — visitors can view dashboards without logging in. This is ideal for demos and shared monitoring. Admin operations (editing dashboards, managing data sources) require authentication.
Nginx Reverse Proxy
Add this to your nginx configuration to expose Grafana at /grafana/:
location /grafana/ {
    proxy_pass http://127.0.0.1:3001/grafana/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # WebSocket support for Grafana Live
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
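To confirm that the sub-path is routed correctly, one option is to request Grafana's health endpoint through the proxy. This sketch assumes Grafana's standard /api/health endpoint is served under the /grafana/ sub-path. The helper names are hypothetical, and the base URL you pass in is whichever address from the table above applies to your setup.

```python
import json
import urllib.request
from urllib.parse import urljoin


def health_url(base_url):
    """Build the health-check URL for a Grafana served under a sub-path."""
    # A trailing slash matters: urljoin drops the last path segment without it.
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "api/health")


def database_ok(body):
    """Interpret the JSON body returned by Grafana's health endpoint."""
    return json.loads(body).get("database") == "ok"


def check_grafana(base_url, timeout=5.0):
    """Fetch the health endpoint and report whether Grafana looks healthy."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return resp.status == 200 and database_ok(resp.read().decode("utf-8"))
```

For example, check_grafana("http://localhost:3001/grafana/") exercises the direct port, while passing the proxied URL exercises the nginx location block as well.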
Pre-Built Dashboards
Beeacon ships with four dashboards, automatically provisioned in the HIVE folder:
🐝 STING System Overview
The primary operations dashboard showing platform-wide health at a glance.
- Service Activity Table — every STING service with event counts and tier classification (application, infrastructure, auth, AI, workers)
- Total Events / Error Rate / Warning Rate — stat panels with color-coded thresholds
- Log Volume by Service — stacked bar chart showing which services are most active
- Log Volume by Level — color-coded (green=INFO, yellow=WARNING, red=ERROR)
- Tier Distribution — donut chart breaking down log volume by service tier
- Error Log Stream — real-time filtered view of ERROR/CRITICAL/FATAL entries
- All Logs — full log explorer with service and level filters
🤖 Bee AI & Reports
Monitors the AI pipeline: chatbot interactions, LLM gateway traffic, and report generation.
- Bee Chat / AI Gateway / Reports / AI Errors — stat panels per service
- AI Service Activity — per-service timeseries (Chatbot, AI Gateway, LLM Proxy, Demo AI)
- Report Worker Pipeline — job lifecycle tracking (started, completed, failed)
- Chatbot Logs / AI Gateway Logs — split log panels for debugging
🔒 Authentication & Security
Tracks authentication events and PII compliance.
- Auth Events / Failed Attempts / PII Detections / Log Redactions — stat panels
- Authentication Events Over Time — login requests, registrations, errors, all Kratos events
- Security & PII Events — PII scans, compliance checks, log redactions, app errors
- Kratos Auth Logs / Security Event Logs — filtered log streams
📚 Knowledge Service
Monitors Honey Jar operations and vector store activity.
- Knowledge Events / Uploads / Searches / ChromaDB Events — stat panels
- Knowledge Service Activity — uploads, searches, sync/embedding operations over time
- Vector Store Activity — ChromaDB event volume and errors
- Knowledge Logs / ChromaDB Logs — service-specific log streams
PII Sanitization in Logs
A key differentiator of Beeacon is automatic PII redaction before logs are stored. Promtail’s pipeline sanitizes the following patterns:
| Pattern | Replacement |
|---|---|
| Email addresses | [EMAIL_REDACTED] |
| Phone numbers | [PHONE_REDACTED] |
| SSN patterns | [SSN_REDACTED] |
| Credit card numbers | [CC_REDACTED] |
| API keys (sk_...) | [API_KEY_REDACTED] |
| Bearer tokens | Bearer [TOKEN_REDACTED] |
| Passwords in logs | [PASSWORD_REDACTED] |
As a result, sensitive data matching these patterns is redacted before it reaches persistent storage, even if application code logs it by accident.
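The effect of the replacements in the table can be illustrated outside Promtail. The sketch below is not the Beeacon pipeline: the regular expressions are simplified assumptions written for this example, and the patterns in the real Promtail configuration may differ. It only shows the general idea of applying ordered pattern-to-placeholder substitutions to a log line before it is stored.

```python
import re

# Illustrative patterns only (assumptions for this sketch); the patterns in the
# actual Promtail pipeline may be stricter or broader.
REDACTIONS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL_REDACTED]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN_REDACTED]"),
    (re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"), "[CC_REDACTED]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE_REDACTED]"),
    (re.compile(r"\bsk_[A-Za-z0-9_]{8,}"), "[API_KEY_REDACTED]"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"), "Bearer [TOKEN_REDACTED]"),
    # Keeping the "password=" key in the output is a choice made for this sketch.
    (re.compile(r"(password\s*[=:]\s*)\S+", re.IGNORECASE), r"\1[PASSWORD_REDACTED]"),
]


def sanitize(line):
    """Apply every redaction pattern, in order, to a single log line."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line
```

Running sanitize on a line such as "user=alice@example.com password=hunter2" replaces the address and the password value with their placeholders while leaving the rest of the line intact.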
Log Labels and Querying
Promtail automatically labels every log entry with metadata from Docker:
| Label | Description | Example Values |
|---|---|---|
| service | Docker Compose service name | app, chatbot, kratos, knowledge |
| container | Docker container name | sting-ce-app, sting-ce-chatbot |
| tier | Service category | application, infrastructure, auth, ai, workers |
| project | Compose project | sting-ce |
| level | Log level (when parseable) | INFO, WARNING, ERROR, CRITICAL |
Example LogQL Queries
# All errors across the platform
{project="sting-ce", level=~"ERROR|CRITICAL"}
# Chatbot activity
{service="chatbot"}
# Authentication failures
{service="kratos"} |~ "level=error|failed|denied"
# Report generation events
{service="report-worker"} |~ "processed|completed|Processing"
# PII-related events in the app
{service="app"} |~ "pii|compliance"
# All logs from a specific tier
{tier="ai"}
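The same queries can be run outside Grafana through Loki's HTTP API. This is a sketch under stated assumptions: it assumes Loki is reachable at localhost:3100 and uses the query_range endpoint with nanosecond start and end timestamps. The helper names and default values are illustrative.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

# Assumption: Loki's HTTP API is reachable at this address from where the script runs.
LOKI_URL = "http://localhost:3100"


def query_range_url(logql, minutes=60, limit=100, now=None, base_url=LOKI_URL):
    """Build a query_range URL covering the last `minutes` minutes."""
    end_ns = int((time.time() if now is None else now) * 1e9)
    start_ns = end_ns - minutes * 60 * 1_000_000_000
    params = urlencode({
        "query": logql,
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
        "direction": "backward",
    })
    return f"{base_url}/loki/api/v1/query_range?{params}"


def extract_lines(response_body):
    """Flatten a query_range response into (service, line) pairs."""
    payload = json.loads(response_body)
    pairs = []
    for stream in payload["data"]["result"]:
        service = stream["stream"].get("service", "unknown")
        for _timestamp, line in stream["values"]:
            pairs.append((service, line))
    return pairs


def run_query(logql, **kwargs):
    """Execute a LogQL query against Loki and return (service, line) pairs."""
    with urllib.request.urlopen(query_range_url(logql, **kwargs), timeout=10) as resp:
        return extract_lines(resp.read().decode("utf-8"))
```

For instance, run_query('{service="chatbot"}', minutes=15) would return the most recent chatbot lines from the last quarter hour, newest first.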
Resource Usage
The Beeacon stack is designed to be lightweight:
| Component | Memory Limit | CPU Limit | Typical Usage |
|---|---|---|---|
| Loki | 512 MB | 0.5 cores | ~100-200 MB |
| Promtail | 256 MB | 0.25 cores | ~60-80 MB |
| Grafana | 512 MB | 0.5 cores | ~100-150 MB |
| Log Forwarder | 256 MB | 0.1 cores | ~10-20 MB |
| Total | 1.5 GB | 1.35 cores | ~270-450 MB |
Configuration Reference
Loki
Stored at observability/loki/config/loki.yml:
- Retention: 7 days (configurable via limits_config.retention_period)
- Storage: Local filesystem at /loki/chunks
- Schema: TSDB v13 with 24h index periods
- Rate limits: 4 MB/s ingestion, 6 MB/s burst, 256 KB max line size
Promtail
Stored at observability/promtail/config/promtail.yml:
- Collection: Docker socket discovery, auto-discovers all sting-ce containers
- Pipeline: JSON parsing → level extraction → PII sanitization → health check filtering
- Refresh: 15-second container discovery interval
Grafana
Stored at observability/grafana/config/grafana.ini:
- Sub-path: Served at /grafana/ for reverse proxy compatibility
- Auth: Anonymous viewer access enabled, admin login available
- Provisioning: Dashboards and Loki datasource auto-provisioned from files
- Security: Embedding allowed, HSTS enabled, analytics/telemetry disabled
Troubleshooting
Services not appearing in dashboards
Promtail discovers containers via the Docker socket. Check its recent logs for connection or permission errors to verify that it can access the socket:
sudo docker logs sting-ce-promtail 2>&1 | tail -20
Loki showing “too many outstanding requests”
Reduce query parallelism or increase limits in loki.yml:
limits_config:
  max_query_parallelism: 4
  max_query_series: 10000
Grafana shows “No Data”
- Verify Loki has data: curl -s http://localhost:3100/loki/api/v1/labels
- Check the time range and ensure it covers the period when logs were collected
- Verify the dashboard’s datasource is pointing to Loki
Checking log flow
# Verify Loki is receiving data
curl -s http://localhost:3100/loki/api/v1/label/service/values
# Check Promtail targets
curl -s http://localhost:9080/targets
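The output of the label-values request above can also be checked programmatically. This is a small sketch assuming the usual Loki response shape, a JSON object with a data list of label values. The expected service names are taken from the examples on this page and should be adjusted to your deployment; the function name is hypothetical.

```python
import json

# Assumption: the services you expect to see; adjust to your deployment.
EXPECTED_SERVICES = {"app", "chatbot", "kratos", "knowledge"}


def missing_services(label_values_body, expected=EXPECTED_SERVICES):
    """Return expected services that Loki has no `service` label value for."""
    payload = json.loads(label_values_body)
    # Treat a missing or null "data" field as "no services seen yet".
    seen = set(payload.get("data") or [])
    return sorted(set(expected) - seen)
```

Passing the body returned by the curl command to missing_services gives a sorted list of services whose logs have not reached Loki. An empty list suggests that log flow is working for every expected service.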