# Vector Search & ChromaDB
STING uses ChromaDB as its vector database for semantic search across Honey Jars. When documents are uploaded to a Honey Jar, they are chunked, embedded into vector representations, and stored in ChromaDB — enabling Bee AI to find relevant information based on meaning, not just keywords.
## Architecture

```
Document Upload → Knowledge Service → Chunking → Embedding → ChromaDB
                                                     ↑
                                                AI Gateway
                                             (embedding model)

User Query → Bee AI → Semantic Search → ChromaDB → Ranked Results
```
## Components
| Component | Role |
|---|---|
| Knowledge Service (`knowledge_service/`) | FastAPI service managing Honey Jars, document processing, and search |
| ChromaDB (`chromadb:0.5.20`) | Vector database storing embeddings with metadata |
| AI Gateway | Provides embedding model access (e.g., `nomic-embed-text`) |
| PostgreSQL | Stores document metadata, Honey Jar configs, access controls |
## How It Works

### 1. Document Ingestion
When a document is uploaded to a Honey Jar:
- The Knowledge Service extracts text from the document (PDF, DOCX, TXT, etc.)
- Text is split into overlapping chunks (default: ~500 tokens with 50-token overlap)
- Each chunk is sent to the embedding model via the AI Gateway
- The resulting vector + metadata is stored in a ChromaDB collection (one per Honey Jar)
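The chunking step above can be sketched in a few lines. This is illustrative only, assuming a simple whitespace-word approximation of tokens; the actual service may use a real tokenizer:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks.

    Approximates tokens by whitespace-separated words; adjacent chunks
    share the last `overlap` words of the previous chunk.
    """
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the defaults (~500-token chunks, 50-token overlap), a 1000-word document yields three chunks, and each chunk repeats the tail of the previous one so that context spanning a chunk boundary is not lost.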
### 2. Semantic Search
When Bee AI or a user searches a Honey Jar:
- The query text is embedded using the same model
- ChromaDB performs approximate nearest-neighbor (ANN) search
- Results are ranked by cosine similarity
- Top-k results are returned with relevance scores and source metadata
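The ranking step can be sketched in plain Python. ChromaDB does this internally with an approximate nearest-neighbor index, so this brute-force version is only meant to show what "ranked by cosine similarity" means:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 3):
    """Return the k (chunk_id, score) pairs most similar to the query vector."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in stored]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A real deployment never iterates over every stored vector like this; ANN indexes trade a small amount of recall for sub-linear query time.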
### 3. Embedding Models

| Model | Provider | Dimensions | Best For |
|---|---|---|---|
| `nomic-embed-text` | Ollama (local) | 768 | General-purpose, privacy-first |
| `text-embedding-ada-002` | OpenAI | 1536 | High quality, cloud-based |
| `all-MiniLM-L6-v2` | Local | 384 | Lightweight, fast |
The default model is configured via the AI Gateway. For air-gapped deployments, use `nomic-embed-text` with a local Ollama instance.
## Configuration

### ChromaDB Settings
ChromaDB runs as a Docker container with persistent storage:
```yaml
# In docker-compose.yml
chroma:
  image: chromadb/chroma:0.5.20
  volumes:
    - chroma_data:/chroma/chroma
  ports:
    - "8000:8000"
```
| Setting | Default | Description |
|---|---|---|
| Port | 8000 | ChromaDB API port (internal) |
| Storage | Docker volume chroma_data | Persistent vector storage |
| Memory limit | 512 MB | Container memory cap |
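One way the 512 MB cap from the table could be expressed in docker-compose is shown below. This is a sketch, not the project's actual file; Compose also accepts `deploy.resources.limits.memory` as an alternative form:

```yaml
chroma:
  image: chromadb/chroma:0.5.20
  mem_limit: 512m   # container memory cap (assumed syntax for the limit above)
```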
### Knowledge Service Settings

Configured via `env/knowledge.env`:

| Variable | Description |
|---|---|
| `CHROMA_HOST` | ChromaDB hostname (default: `chroma`) |
| `CHROMA_PORT` | ChromaDB port (default: `8000`) |
| `EMBEDDING_MODEL` | Model name for embeddings |
| `CHUNK_SIZE` | Document chunk size in tokens |
| `CHUNK_OVERLAP` | Overlap between chunks |
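A sketch of how these variables might be read inside the Knowledge Service, assuming plain `os.environ` lookups. The host/port defaults come from the table; the chunking defaults are assumptions based on the ingestion section (~500 tokens, 50-token overlap):

```python
import os

def load_knowledge_config() -> dict:
    """Read Knowledge Service settings, falling back to documented defaults."""
    return {
        "chroma_host": os.environ.get("CHROMA_HOST", "chroma"),
        "chroma_port": int(os.environ.get("CHROMA_PORT", "8000")),
        "embedding_model": os.environ.get("EMBEDDING_MODEL", "nomic-embed-text"),
        # Chunk defaults assumed from the ingestion section, not from the table.
        "chunk_size": int(os.environ.get("CHUNK_SIZE", "500")),
        "chunk_overlap": int(os.environ.get("CHUNK_OVERLAP", "50")),
    }
```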
## Honey Jar Collections
Each Honey Jar maps to a ChromaDB collection. Collections are isolated — searches within one Honey Jar don’t return results from another (unless explicitly configured for cross-jar search).
### Collection Metadata
Each stored chunk includes:
- Document source — filename, upload date, uploader
- Position — chunk index within the document
- Honey Jar ID — which jar this belongs to
- Access level — public, private, or restricted
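As a sketch, the per-chunk metadata described above might look like the dict below, and filtering on the access level is one way it can be enforced at query time. Field names here are illustrative, not the service's exact schema:

```python
def build_chunk_metadata(filename: str, uploader: str, upload_date: str,
                         chunk_index: int, jar_id: str, access_level: str) -> dict:
    """Assemble the per-chunk metadata fields listed above (illustrative names)."""
    return {
        "source_file": filename,
        "uploader": uploader,
        "upload_date": upload_date,
        "chunk_index": chunk_index,      # position within the document
        "honey_jar_id": jar_id,          # which jar this chunk belongs to
        "access_level": access_level,    # "public", "private", or "restricted"
    }

def visible_to(results: list[dict], allowed_levels: set[str]) -> list[dict]:
    """Drop search results whose access level the caller may not see."""
    return [r for r in results if r["access_level"] in allowed_levels]
```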
## Troubleshooting

### ChromaDB container unhealthy

```bash
# Check ChromaDB health
curl -s http://localhost:8000/api/v1/heartbeat

# Check logs
sudo docker logs sting-ce-chroma --tail 50
```
### Search returning poor results
- Verify the embedding model is loaded and accessible via the AI Gateway
- Check that documents were fully ingested (not partially processed)
- Try a more specific query — vector search works best with natural language questions
### Re-indexing a Honey Jar

If embeddings are corrupted or you’ve changed the embedding model:

```bash
# Via the Knowledge Service API
curl -X POST https://localhost:5050/api/knowledge/jars/{jar_id}/reindex \
  -H "Authorization: Bearer <api-key>"
```
### Checking collection stats

```bash
# List ChromaDB collections
curl -s http://localhost:8000/api/v1/collections | python3 -m json.tool
```