Ollama Model Setup Guide

Overview

STING uses Ollama for local AI capabilities, including the Bee chat assistant. This guide helps you set up the required models.

For system requirements and initial installation, see the STING Platform Installation Guide.

Quick Start

1. Check if Ollama is running

ollama list

If you get an error, start Ollama:

ollama serve
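
Once the server is running, you can confirm the API is reachable; this is the same endpoint used in the troubleshooting section below:

# Should return JSON listing the installed models (an empty list is fine at this point)
curl -s http://localhost:11434/api/tags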

2. Pull the models you need

For general use (including Bee chat):

# Recommended - Latest Llama model with good performance
ollama pull llama3.3

# Alternative lightweight option
ollama pull phi3

For code-related tasks:

# Excellent for code analysis and generation
ollama pull deepseek-coder-v2
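
Once a pull finishes, ollama show can be used to confirm the model is usable and to inspect its details (run it with whichever model name you pulled):

# Print details for a pulled model
ollama show deepseek-coder-v2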

3. Verify installation

ollama list

You should see your installed models listed.
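
You can also run a quick one-off prompt to make sure a model actually responds. This sketch assumes llama3.3 was pulled above; substitute whichever model you installed:

# Send a single prompt, print the reply, then exit
ollama run llama3.3 "Reply with the word OK"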

Model Recommendations

Model                    | Size  | Use Case               | Performance
-------------------------|-------|------------------------|--------------------
llama3.3:latest          | ~5GB  | General chat, analysis | Excellent
phi3:mini                | ~2GB  | Lightweight chat       | Good
deepseek-coder-v2:latest | ~16GB | Code tasks             | Excellent for code

Troubleshooting

“No models available” error in Bee chat

  1. Check if Ollama is running: curl http://localhost:11434/api/tags
  2. Install a model: ollama pull llama3.3
  3. Restart the external AI service: ./manage_sting.sh restart external-ai
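
If step 1 returns a response but Bee still reports no models, the list may simply be empty. A quick count (assuming python3 is available on the host) looks like this:

# Prints how many models Ollama reports; 0 means nothing is installed yet
curl -s http://localhost:11434/api/tags | python3 -c 'import json,sys; print(len(json.load(sys.stdin).get("models") or []), "model(s) installed")'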

Model downloading slowly

Models can be large. Consider:

  • Using a faster internet connection
  • Installing smaller models first (phi3:mini)
  • Downloading during off-peak hours

Bee chat shows “online” but doesn’t respond

This usually means no models are installed. The service is running but has no AI model to use.
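
The fix is the same as in the previous section: pull at least one model and restart the external AI service.

# Install a lightweight model, then restart the service that talks to Ollama
ollama pull phi3
./manage_sting.sh restart external-ai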

Configuration

The default model is configured in /conf/config.yml:

external_ai:
  providers:
    ollama:
      defaultModel: "llama3.3:latest"

After changing the configuration:

./manage_sting.sh restart external-ai
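
For example, to switch the default to the lighter phi3 model, you can edit the file by hand or script the change. This is a sketch assuming GNU sed, the exact key layout shown above, and that phi3:mini has already been pulled:

# Point the ollama provider at phi3:mini, then restart the service
sed -i 's/defaultModel: "llama3.3:latest"/defaultModel: "phi3:mini"/' /conf/config.yml
./manage_sting.sh restart external-ai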

Memory Considerations

  • Ollama models are loaded into memory when used (you can list what is loaded with the command shown below)
  • Ensure you have sufficient RAM (8GB+ recommended)
  • Models are automatically unloaded when idle
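
To see which models are currently loaded and how much memory they are using, recent Ollama versions provide a ps subcommand:

# List models currently loaded in memory
ollama ps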

For Apple Silicon and NVIDIA GPU acceleration, see the Hardware Acceleration Guide.

Using Your Models

After installing models:

  1. Test Bee chat in the STING UI
  2. Check logs if issues persist: docker logs sting-ce-external-ai (see the example below)
  3. Try different models to find the best fit for your use case
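
For the log check in step 2, following the stream while you send a test message in Bee chat makes problems easier to spot (the -f and --tail options are standard docker logs flags):

# Follow the most recent log output from the external AI container
docker logs -f --tail 100 sting-ce-external-ai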

Last updated: