LLM Provider Configuration¶

WikiMind supports multiple LLM providers with automatic selection, fallback, and cost tracking.

Supported Providers¶

Provider	Default Model	API Key Variable	Notes
Anthropic	claude-sonnet-4-5	`ANTHROPIC_API_KEY`	Default provider
OpenAI	gpt-4o	`OPENAI_API_KEY`
Google Gemini	gemini-2.0-flash	`GOOGLE_API_KEY`
Ollama	llama3.2	--	Local, no key needed
Mock	mock-1	--	CI/testing only

Auto-Enable¶

Providers are auto-enabled when their API key is detected. You do not need to manually set ENABLED=true. Just set the key:

# This is all you need -- Anthropic auto-enables
ANTHROPIC_API_KEY=sk-ant-...

The auto-enable logic checks three sources:

Environment variable (with or without WIKIMIND_ prefix)
.env file
OS keychain (via keyring)

To explicitly disable a provider whose key is set:

WIKIMIND_LLM__ANTHROPIC__ENABLED=false

Provider Selection¶

The LLM router selects providers in this order:

Preferred provider -- If the request specifies one
Default provider -- Configured via WIKIMIND_LLM__DEFAULT_PROVIDER
Fallback providers -- All other enabled providers, in order

flowchart TD
    A["LLM Request"] --> B{Preferred<br/>provider?}
    B -- Yes --> C{Available?}
    B -- No --> D["Default provider"]
    C -- Yes --> E["Call provider"]
    C -- No --> D
    D --> F{Available?}
    F -- Yes --> E
    F -- No --> G{Fallback<br/>enabled?}
    G -- Yes --> H["Try next"]
    G -- No --> I["Error"]
    H --> F
    E --> J{Success?}
    J -- Yes --> K["Return response"]
    J -- No --> G

Fallback¶

When WIKIMIND_LLM__FALLBACK_ENABLED=true (default), failed providers are skipped and the next available provider is tried. When disabled, the first failure raises an error.

Model Override¶

Override the default model for any provider:

WIKIMIND_LLM__ANTHROPIC__MODEL=claude-haiku-4-5-20251001
WIKIMIND_LLM__OPENAI__MODEL=gpt-4o-mini
WIKIMIND_LLM__GOOGLE__MODEL=gemini-2.0-flash
WIKIMIND_LLM__OLLAMA__MODEL=mistral

Ollama (Local Models)¶

Ollama does not require an API key but must be explicitly enabled:

WIKIMIND_LLM__OLLAMA__ENABLED=true
WIKIMIND_LLM__DEFAULT_PROVIDER=ollama
WIKIMIND_LLM__OLLAMA__MODEL=llama3.2
WIKIMIND_LLM__OLLAMA_BASE_URL=http://localhost:11434

Make sure Ollama is running locally with the model pulled:

ollama pull llama3.2
ollama serve

Cost Tracking¶

Every LLM call is logged with:

Provider and model used
Input and output token counts
Calculated USD cost
Latency in milliseconds
Task type (compile, qa, ingest, lint)

Pricing¶

Current pricing (USD per 1M tokens):

Provider	Model	Input	Output
Anthropic	claude-sonnet-4-5	$3.00	$15.00
Anthropic	claude-haiku-4-5-20251001	$0.80	$4.00
OpenAI	gpt-4o	$2.50	$10.00
OpenAI	gpt-4o-mini	$0.15	$0.60
Google	gemini-2.0-flash	$0.10	$0.40
Ollama	*	$0.00	$0.00

Budget Management¶

Set a monthly spending ceiling:

WIKIMIND_LLM__MONTHLY_BUDGET_USD=50.0
WIKIMIND_LLM__BUDGET_WARNING_PCT=0.8

When spending reaches 80% of the budget, a WebSocket budget.warning event is emitted. At 100%, a budget.exceeded event fires. These are displayed in the UI.

View current spending via the API:

curl http://localhost:7842/settings/costs

Mock Provider (Testing)¶

For CI and local testing without real API calls:

WIKIMIND_LLM__MOCK__ENABLED=true
WIKIMIND_LLM__DEFAULT_PROVIDER=mock

The mock provider returns deterministic canned responses for all task types. It never makes network calls and costs nothing.

Do not use in production

The mock provider returns static responses that do not reflect your actual source content.

API Key Storage¶

API keys can be stored in three locations (checked in order):

Environment variable -- ANTHROPIC_API_KEY=sk-ant-... (with or without WIKIMIND_ prefix)
.env file -- Loaded automatically by Pydantic BaseSettings
OS keychain -- Stored via the keyring library

Store a key in the keychain:

import keyring
keyring.set_password("wikimind", "anthropic", "sk-ant-...")

Keys stored as SecretStr are never logged in plaintext.

Runtime Preferences¶

Some LLM settings can be changed at runtime without restarting:

# Change default provider
curl -X PUT http://localhost:7842/settings/preferences \
  -H "Content-Type: application/json" \
  -d '{"key": "llm.default_provider", "value": "openai"}'

# Change monthly budget
curl -X PUT http://localhost:7842/settings/preferences \
  -H "Content-Type: application/json" \
  -d '{"key": "llm.monthly_budget_usd", "value": "100.0"}'

Runtime preferences are stored in the database and applied on startup.