Vector Search & ChromaDB

STING uses ChromaDB as its vector database for semantic search across Honey Jars. When documents are uploaded to a Honey Jar, they are chunked, embedded into vector representations, and stored in ChromaDB — enabling Bee AI to find relevant information based on meaning, not just keywords.

Architecture

Document Upload → Knowledge Service → Chunking → Embedding → ChromaDB
                                                     ↑
                                              AI Gateway
                                          (embedding model)

User Query → Bee AI → Semantic Search → ChromaDB → Ranked Results

Components

| Component | Role |
|---|---|
| Knowledge Service (knowledge_service/) | FastAPI service managing Honey Jars, document processing, and search |
| ChromaDB (chromadb:0.5.20) | Vector database storing embeddings with metadata |
| AI Gateway | Provides embedding model access (e.g., nomic-embed-text) |
| PostgreSQL | Stores document metadata, Honey Jar configs, access controls |

How It Works

1. Document Ingestion

When a document is uploaded to a Honey Jar:

  1. The Knowledge Service extracts text from the document (PDF, DOCX, TXT, etc.)
  2. Text is split into overlapping chunks (default: ~500 tokens with 50-token overlap)
  3. Each chunk is sent to the embedding model via the AI Gateway
  4. The resulting vector + metadata is stored in a ChromaDB collection (one per Honey Jar)
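
Step 2 of the ingestion flow can be sketched as follows — an illustrative chunker, not the Knowledge Service's actual implementation, that approximates tokens with whitespace-split words:

```python
# Sketch of the chunking step: split text into overlapping ~token-sized
# chunks. Tokens are approximated by whitespace-split words here; the
# Knowledge Service's real tokenizer and boundary handling may differ.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # advance 450 words per chunk by default
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), step)
        if words[start:start + chunk_size]
    ]

chunks = chunk_text("word " * 1200)
print(len(chunks))  # 3 chunks: words 0-499, 450-949, 900-1199
```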

2. Semantic Search

When Bee AI or a user searches a Honey Jar:

  1. The query text is embedded using the same model
  2. ChromaDB performs approximate nearest-neighbor (ANN) search
  3. Results are ranked by cosine similarity
  4. Top-k results are returned with relevance scores and source metadata
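
Steps 2-3 of the search flow can be illustrated with a brute-force cosine-similarity scan. This is a conceptual sketch of the scoring only — ChromaDB uses an approximate nearest-neighbor index internally rather than an exact scan like this:

```python
import math

# Score every stored vector against the query by cosine similarity
# and keep the top k, mirroring the ranking step described above.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], store: dict[str, list[float]], k: int = 2):
    scored = sorted(
        ((cosine(query, vec), chunk_id) for chunk_id, vec in store.items()),
        reverse=True,
    )
    return scored[:k]

store = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.9, 0.0],
    "chunk-c": [0.7, 0.3, 0.0],
}
print(top_k([1.0, 0.0, 0.0], store))  # chunk-a ranks first, then chunk-c
```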

3. Embedding Models

| Model | Provider | Dimensions | Best For |
|---|---|---|---|
| nomic-embed-text | Ollama (local) | 768 | General-purpose, privacy-first |
| text-embedding-ada-002 | OpenAI | 1536 | High quality, cloud-based |
| all-MiniLM-L6-v2 | Local | 384 | Lightweight, fast |

The default model is configured via the AI Gateway. For air-gapped deployments, use nomic-embed-text with a local Ollama instance.
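
For an air-gapped setup, the embedding call might look like the following sketch. The /api/embeddings endpoint and payload shape are Ollama's API; the URL, port, and helper names (build_payload, embed) are illustrative — the AI Gateway's actual interface may differ:

```python
import json
import urllib.request

# Hedged sketch: embed text against a local Ollama instance.
def build_payload(model: str, text: str) -> bytes:
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text: str,
          url: str = "http://localhost:11434/api/embeddings",
          model: str = "nomic-embed-text") -> list[float]:
    req = urllib.request.Request(
        url,
        data=build_payload(model, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns {"embedding": [...]} — 768 floats for nomic-embed-text
        return json.load(resp)["embedding"]
```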

Configuration

ChromaDB Settings

ChromaDB runs as a Docker container with persistent storage:

# In docker-compose.yml
chroma:
  image: chromadb/chroma:0.5.20
  volumes:
    - chroma_data:/chroma/chroma
  ports:
    - "8000:8000"

| Setting | Default | Description |
|---|---|---|
| Port | 8000 | ChromaDB API port (internal) |
| Storage | Docker volume chroma_data | Persistent vector storage |
| Memory limit | 512 MB | Container memory cap |
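
If you want Docker to flag an unhealthy container automatically, a healthcheck stanza can poll the heartbeat endpoint used under Troubleshooting below. This stanza is an optional, illustrative addition (it also assumes curl is available in the image), not part of the stock compose file:

```yaml
# Optional healthcheck for the chroma service (illustrative)
chroma:
  healthcheck:
    test: ["CMD", "curl", "-sf", "http://localhost:8000/api/v1/heartbeat"]
    interval: 30s
    timeout: 5s
    retries: 3
```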

Knowledge Service Settings

Configured via env/knowledge.env:

| Variable | Description |
|---|---|
| CHROMA_HOST | ChromaDB hostname (default: chroma) |
| CHROMA_PORT | ChromaDB port (default: 8000) |
| EMBEDDING_MODEL | Model name for embeddings |
| CHUNK_SIZE | Document chunk size in tokens |
| CHUNK_OVERLAP | Overlap between chunks |
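
An illustrative env/knowledge.env using the defaults documented above (the values shown are examples, not required settings):

```shell
# env/knowledge.env — illustrative values matching the documented defaults
CHROMA_HOST=chroma
CHROMA_PORT=8000
EMBEDDING_MODEL=nomic-embed-text
CHUNK_SIZE=500
CHUNK_OVERLAP=50
```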

Honey Jar Collections

Each Honey Jar maps to a ChromaDB collection. Collections are isolated — searches within one Honey Jar don’t return results from another (unless explicitly configured for cross-jar search).

Collection Metadata

Each stored chunk includes:

  • Document source — filename, upload date, uploader
  • Position — chunk index within the document
  • Honey Jar ID — which jar this belongs to
  • Access level — public, private, or restricted
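
As a sketch, one chunk's stored metadata might look like the dict below. The field names and values are assumptions based on the list above, not the Knowledge Service's actual schema:

```python
# Illustrative shape of the metadata stored alongside one chunk vector.
chunk_metadata = {
    "source_file": "quarterly-report.pdf",  # document source
    "uploaded_at": "2025-06-01T12:00:00Z",
    "uploaded_by": "worker-bee",
    "chunk_index": 3,                       # position within the document
    "honey_jar_id": "jar-1234",
    "access_level": "private",              # public, private, or restricted
}
print(sorted(chunk_metadata))
```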

Troubleshooting

ChromaDB container unhealthy

# Check ChromaDB health
curl -s http://localhost:8000/api/v1/heartbeat

# Check logs
sudo docker logs sting-ce-chroma --tail 50

Search returning poor results

  • Verify the embedding model is loaded and accessible via the AI Gateway
  • Check that documents were fully ingested (not partially processed)
  • Try a more specific query — vector search works best with natural language questions

Re-indexing a Honey Jar

If embeddings are corrupted or you’ve changed the embedding model:

# Via the Knowledge Service API
curl -X POST https://localhost:5050/api/knowledge/jars/{jar_id}/reindex \
  -H "Authorization: Bearer <api-key>"

Checking collection stats

# List ChromaDB collections
curl -s http://localhost:8000/api/v1/collections | python3 -m json.tool
