Vector Search & ChromaDB

STING uses ChromaDB as its vector database for semantic search across Honey Jars. When documents are uploaded to a Honey Jar, they are chunked, embedded into vector representations, and stored in ChromaDB — enabling Bee AI to find relevant information based on meaning, not just keywords.

Architecture

Document Upload → Knowledge Service → Chunking → Embedding → ChromaDB
                                                     ↑
                                              AI Gateway
                                          (embedding model)

User Query → Bee AI → Semantic Search → ChromaDB → Ranked Results

Components

| Component | Role |
|---|---|
| Knowledge Service (knowledge_service/) | FastAPI service managing Honey Jars, document processing, and search |
| ChromaDB (chromadb:0.5.20) | Vector database storing embeddings with metadata |
| AI Gateway | Provides embedding model access (e.g., nomic-embed-text) |
| PostgreSQL | Stores document metadata, Honey Jar configs, access controls |

How It Works

1. Document Ingestion

When a document is uploaded to a Honey Jar:

  1. The Knowledge Service extracts text from the document (PDF, DOCX, TXT, etc.)
  2. Text is split into overlapping chunks (default: ~500 tokens with 50-token overlap)
  3. Each chunk is sent to the embedding model via the AI Gateway
  4. The resulting vector + metadata is stored in a ChromaDB collection (one per Honey Jar)
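
Step 2 of the ingestion flow can be sketched as follows — an illustrative chunker, not the Knowledge Service's actual implementation, that approximates tokens with whitespace-split words:

```python
# Sketch of the chunking step: split text into overlapping ~token-sized
# chunks. Tokens are approximated by whitespace-split words here; the
# Knowledge Service's real tokenizer and boundary handling may differ.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # advance 450 words per chunk by default
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), step)
        if words[start:start + chunk_size]
    ]

chunks = chunk_text("word " * 1200)
print(len(chunks))  # 3 chunks: words 0-499, 450-949, 900-1199
```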

2. Semantic Search

When Bee AI or a user searches a Honey Jar:

  1. The query text is embedded using the same model
  2. ChromaDB performs approximate nearest-neighbor (ANN) search
  3. Results are ranked by cosine similarity
  4. Top-k results are returned with relevance scores and source metadata
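
Steps 2-3 of the search flow can be illustrated with a brute-force cosine-similarity scan. This is a conceptual sketch of the scoring only — ChromaDB uses an approximate nearest-neighbor index internally rather than an exact scan like this:

```python
import math

# Score every stored vector against the query by cosine similarity
# and keep the top k, mirroring the ranking step described above.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], store: dict[str, list[float]], k: int = 2):
    scored = sorted(
        ((cosine(query, vec), chunk_id) for chunk_id, vec in store.items()),
        reverse=True,
    )
    return scored[:k]

store = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.9, 0.0],
    "chunk-c": [0.7, 0.3, 0.0],
}
print(top_k([1.0, 0.0, 0.0], store))  # chunk-a ranks first, then chunk-c
```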

3. Embedding Models

| Model | Provider | Dimensions | Best For |
|---|---|---|---|
| nomic-embed-text | Ollama (local) | 768 | General-purpose, privacy-first |
| text-embedding-ada-002 | OpenAI | 1536 | High quality, cloud-based |
| all-MiniLM-L6-v2 | Local | 384 | Lightweight, fast |

The default model is configured via the AI Gateway. For air-gapped deployments, use nomic-embed-text with a local Ollama instance.
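
For an air-gapped setup, the embedding call might look like the following sketch. The /api/embeddings endpoint and payload shape are Ollama's API; the URL, port, and helper names (build_payload, embed) are illustrative — the AI Gateway's actual interface may differ:

```python
import json
import urllib.request

# Hedged sketch: embed text against a local Ollama instance.
def build_payload(model: str, text: str) -> bytes:
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text: str,
          url: str = "http://localhost:11434/api/embeddings",
          model: str = "nomic-embed-text") -> list[float]:
    req = urllib.request.Request(
        url,
        data=build_payload(model, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns {"embedding": [...]} — 768 floats for nomic-embed-text
        return json.load(resp)["embedding"]
```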

Configuration

ChromaDB Settings

ChromaDB runs as a Docker container with persistent storage:

# In docker-compose.yml
chroma:
  image: chromadb/chroma:0.5.20
  volumes:
    - chroma_data:/chroma/chroma
  ports:
    - "8000:8000"

| Setting | Default | Description |
|---|---|---|
| Port | 8000 | ChromaDB API port (internal) |
| Storage | Docker volume chroma_data | Persistent vector storage |
| Memory limit | 512 MB | Container memory cap |
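
If you want Docker to flag an unhealthy container automatically, a healthcheck stanza can poll the heartbeat endpoint used under Troubleshooting below. This stanza is an optional, illustrative addition (it also assumes curl is available in the image), not part of the stock compose file:

```yaml
# Optional healthcheck for the chroma service (illustrative)
chroma:
  healthcheck:
    test: ["CMD", "curl", "-sf", "http://localhost:8000/api/v1/heartbeat"]
    interval: 30s
    timeout: 5s
    retries: 3
```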

Knowledge Service Settings

Configured via env/knowledge.env:

| Variable | Description |
|---|---|
| CHROMA_HOST | ChromaDB hostname (default: chroma) |
| CHROMA_PORT | ChromaDB port (default: 8000) |
| EMBEDDING_MODEL | Model name for embeddings |
| CHUNK_SIZE | Document chunk size in tokens |
| CHUNK_OVERLAP | Overlap between chunks |
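
An illustrative env/knowledge.env using the defaults documented above (the values shown are examples, not required settings):

```shell
# env/knowledge.env — illustrative values matching the documented defaults
CHROMA_HOST=chroma
CHROMA_PORT=8000
EMBEDDING_MODEL=nomic-embed-text
CHUNK_SIZE=500
CHUNK_OVERLAP=50
```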

Honey Jar Collections

Each Honey Jar maps to a ChromaDB collection. Collections are isolated — searches within one Honey Jar don’t return results from another (unless explicitly configured for cross-jar search).

Collection Metadata

Each stored chunk includes:

  • Document source — filename, upload date, uploader
  • Position — chunk index within the document
  • Honey Jar ID — which jar this belongs to
  • Access level — public, private, or restricted
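
As a sketch, one chunk's stored metadata might look like the dict below. The field names and values are assumptions based on the list above, not the Knowledge Service's actual schema:

```python
# Illustrative shape of the metadata stored alongside one chunk vector.
chunk_metadata = {
    "source_file": "quarterly-report.pdf",  # document source
    "uploaded_at": "2025-06-01T12:00:00Z",
    "uploaded_by": "worker-bee",
    "chunk_index": 3,                       # position within the document
    "honey_jar_id": "jar-1234",
    "access_level": "private",              # public, private, or restricted
}
print(sorted(chunk_metadata))
```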

Troubleshooting

ChromaDB container unhealthy

# Check ChromaDB health
curl -s http://localhost:8000/api/v1/heartbeat

# Check logs
sudo docker logs sting-ce-chroma --tail 50

Search returning poor results

  • Verify the embedding model is loaded and accessible via the AI Gateway
  • Check that documents were fully ingested (not partially processed)
  • Try a more specific query — vector search works best with natural language questions

Re-indexing a Honey Jar

If embeddings are corrupted or you’ve changed the embedding model:

# Via the Knowledge Service API
curl -X POST https://localhost:5050/api/knowledge/jars/{jar_id}/reindex \
  -H "Authorization: Bearer <api-key>"

Checking collection stats

# List ChromaDB collections
curl -s http://localhost:8000/api/v1/collections | python3 -m json.tool
