LLM Providers
GoClaw supports multiple LLM providers through a unified registry system. This enables flexible model selection, automatic failover, and purpose-specific provider chains.
Supported Providers
| Provider | Type | Use Cases |
|---|---|---|
| Anthropic | Cloud | Agent responses (Claude), extended thinking, prompt caching |
| OpenAI | Cloud/Local | GPT models, OpenAI-compatible APIs (LM Studio, LocalAI) |
| Ollama | Local | Local inference, embeddings, summarization |
| xAI | Cloud | Grok models, stateful conversations, server-side tools |
| Hugot | Local | Built-in embeddings-only provider for semantic search |
Quick Setup
Minimal Config (Single Provider)
For basic usage with Anthropic:
{
  "llm": {
    "providers": {
      "anthropic": {
        "driver": "anthropic",
        "apiKey": "sk-ant-...",
        "promptCaching": true
      }
    },
    "agent": {
      "models": ["anthropic/claude-sonnet-4-20250514"]
    }
  }
}
If you leave the embeddings chain empty, GoClaw automatically restores the built-in hugot-local provider and seeds the default local embeddings model.
Multi-Provider Setup
For advanced setups with multiple providers and purpose-specific chains:
{
  "llm": {
    "providers": {
      "claude": {
        "driver": "anthropic",
        "apiKey": "sk-ant-...",
        "promptCaching": true
      },
      "ollama-qwen": {
        "driver": "ollama",
        "url": "http://localhost:11434"
      },
      "hugot-local": {
        "driver": "hugot",
        "embeddingOnly": true
      }
    },
    "agent": {
      "models": ["claude/claude-sonnet-4-20250514"]
    },
    "summarization": {
      "models": ["ollama-qwen/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
    },
    "embeddings": {
      "models": ["hugot-local/KnightsAnalytics/all-MiniLM-L6-v2"]
    }
  }
}
Purpose Chains
GoClaw routes LLM requests based on purpose:
| Purpose | Config Key | Used For |
|---|---|---|
| agent | agent | Main conversation, tool use |
| summarization | summarization | Compaction summaries, checkpoints |
| embeddings | embeddings | Semantic search (memory, transcripts, Memory Graph) |
| heartbeat | heartbeat | Periodic heartbeat tasks |
| cron | cron | Scheduled cron jobs |
| hass | hass | Home Assistant queries |
| memory_extraction | memoryExtraction | Memory Graph entity extraction |
If a purpose has no models configured, it falls back to the agent chain.
Each purpose has a model chain — the first model is primary, others are fallbacks:
{
  "llm": {
    "summarization": {
      "models": [
        "ollama-qwen/qwen2.5:7b",
        "claude/claude-3-haiku-20240307"
      ]
    }
  }
}
The first model (ollama-qwen/qwen2.5:7b) is tried first. If it fails, the next model in the chain is used as fallback.
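The chain lookup described above can be sketched in a few lines of Python. This is an illustration only, not GoClaw's implementation: the function name `resolve_chain` is hypothetical, and the sketch ignores the embeddings special case noted under Quick Setup, where an empty chain restores the built-in hugot-local provider.

```python
from typing import Dict, List


def resolve_chain(llm_config: Dict, purpose_key: str) -> List[str]:
    """Return the ordered model chain for a purpose config key.

    Hypothetical helper: if the purpose has no models configured,
    fall back to the agent chain, as the text above describes.
    """
    models = llm_config.get(purpose_key, {}).get("models") or []
    if models:
        return list(models)
    return list(llm_config.get("agent", {}).get("models") or [])


# The "llm" section of a config, reduced to the parts this sketch reads.
llm = {
    "agent": {"models": ["claude/claude-sonnet-4-20250514"]},
    "summarization": {
        "models": ["ollama-qwen/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
    },
}

print(resolve_chain(llm, "summarization"))  # primary first, then fallbacks
print(resolve_chain(llm, "heartbeat"))      # not configured, so the agent chain
```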
Automatic Failover
When a provider fails:
- Error is classified (rate limit, auth, timeout, server error)
- Provider enters cooldown with exponential backoff
- Next model in chain is tried
- After cooldown expires, original provider is tried again
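As a rough mental model of these steps, the following Python sketch walks a model chain, skips providers that are cooling down, and moves on when a call fails. Everything here is assumed for illustration: `ProviderError`, `call_with_failover`, and the `send` callback are hypothetical names, and a fixed cooldown stands in for the exponential backoff GoClaw applies.

```python
import time
from typing import Callable, Dict, List


class ProviderError(Exception):
    """Hypothetical error carrying a classified kind, e.g. 'rate_limit'."""

    def __init__(self, kind: str):
        super().__init__(kind)
        self.kind = kind


def call_with_failover(
    chain: List[str],
    send: Callable[[str], str],
    cooldown_until: Dict[str, float],
    now: Callable[[], float] = time.monotonic,
    cooldown_seconds: float = 30.0,
) -> str:
    """Try each model in order, skipping providers still in cooldown."""
    last_error = None
    for ref in chain:
        provider = ref.split("/", 1)[0]  # provider name is the part before the first "/"
        if cooldown_until.get(provider, 0.0) > now():
            continue  # still cooling down, try the next model
        try:
            return send(ref)
        except ProviderError as err:
            # Put the provider into cooldown and fall through to the next model.
            cooldown_until[provider] = now() + cooldown_seconds
            last_error = err
    raise RuntimeError(f"no model in the chain succeeded (last error: {last_error})")
```

Because the cooldown map is checked on every call, the original provider is retried automatically once its cooldown has expired.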
Check provider status with the /llm command:
LLM Provider Status
claude: healthy
ollama-qwen: cooldown (rate_limit), retry in 2m30s
ollama-embed: healthy
Thinking Levels
Extended thinking/reasoning can be enabled for supported models. This tells the LLM to “think through” complex problems before responding.
Available Levels
| Level | Description | Anthropic Tokens |
|---|---|---|
| off | No extended thinking | 0 |
| minimal | Quick responses | 1,024 |
| low | Light reasoning | 4,096 |
| medium | Balanced (default) | 10,000 |
| high | Deep reasoning | 25,000 |
| xhigh | Maximum effort | 50,000 |
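For reference, the level-to-budget mapping in this table can be written as a simple lookup. The Python sketch below only restates the table; the names `ANTHROPIC_THINKING_BUDGET` and `thinking_budget` are hypothetical and do not come from GoClaw.

```python
# Anthropic thinking token budgets, copied from the table above.
ANTHROPIC_THINKING_BUDGET = {
    "off": 0,
    "minimal": 1024,
    "low": 4096,
    "medium": 10000,
    "high": 25000,
    "xhigh": 50000,
}


def thinking_budget(level: str = "medium") -> int:
    """Return the token budget for a thinking level (medium is the default)."""
    try:
        return ANTHROPIC_THINKING_BUDGET[level]
    except KeyError:
        raise ValueError(f"unknown thinking level: {level!r}") from None


print(thinking_budget())         # 10000
print(thinking_budget("xhigh"))  # 50000
```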
Configuration
Per-user in users.json:
{
  "users": [
    {
      "name": "Alice",
      "role": "owner",
      "thinking": true,
      "thinkingLevel": "medium"
    }
  ]
}
Or dynamically via Telegram/TUI settings.
Provider Support
| Provider | Thinking Support |
|---|---|
| Anthropic | Yes (Claude 3.5+), token budget |
| OpenAI | Via OpenRouter reasoning |
| Ollama | Model-dependent |
| xAI | Yes (grok-3-mini), effort levels |
Provider Configuration
Common Options
All providers support:
{
  "driver": "anthropic",      // Required: provider driver
  "apiKey": "...",            // API key (or env var)
  "maxTokens": 8192,          // Output limit override
  "contextTokens": 200000,    // Context window override
  "timeoutSeconds": 300,      // Request timeout
  "trace": true,              // Enable request tracing
  "dumpOnSuccess": false      // Keep request dumps on success
}
Provider-Specific Options
Anthropic:
{
  "driver": "anthropic",
  "promptCaching": true  // Enable prompt caching (reduces cost)
}
OpenAI:
{
  "driver": "openai",
  "baseURL": "https://api.openai.com/v1"  // Or compatible endpoint
}
Ollama:
{
  "driver": "ollama",
  "url": "http://localhost:11434",
  "embeddingOnly": true  // Skip chat availability check
}
Hugot (embeddings only):
{
  "driver": "hugot",
  "embeddingOnly": true
}
Hugot is the built-in local embeddings provider. It is intended for the embeddings purpose, not for agent or summarization.
xAI:
{
  "driver": "xai",
  "serverToolsAllowed": ["web_search"],  // Server-side tools
  "maxTurns": 5                          // Max agentic turns
}
Account Login (Auth Profiles)
For xAI Grok and OpenAI/Codex, you can sign in with your account instead of pasting a raw API key into goclaw.json. GoClaw stores the resulting tokens in an encrypted auth profile and the provider config only references it.
Set this up in Web or TUI setup, or in the config editor, which runs the whole login flow inline: pick the method, use Start Login / Status / Cancel / Disconnect, open or copy the login URL, and paste back a redirect URL or code. The provider config keeps just two fields:
{
  "llm": {
    "providers": {
      "xai": {
        "driver": "xai",
        "authMethod": "device-code",
        "authProfile": "xai:default"
      },
      "openai": {
        "driver": "openai",
        "authMethod": "oauth",
        "authProfile": "openai-codex:default"
      }
    }
  }
}
Auth Methods
| Method | Description |
|---|---|
| api-key | Use the provider’s apiKey field (the default; no profile needed) |
| device-code | Pair this install with your account using a short device code; best for remote or headless servers where a browser callback isn’t reachable |
| oauth | Log in through the browser; falls back to pasting the redirect URL or code manually when a loopback callback isn’t reachable |
What Gets Stored Where
| Field | Lives in | Holds |
|---|---|---|
| authMethod, authProfile | goclaw.json | Which method to use and which profile to read (xai:default, openai-codex:default) |
| OAuth/device tokens | ~/.goclaw/auth-profiles.enc.json (encrypted) | The actual refresh/access tokens |
The encryption key is resolved from GOCLAW_AUTH_PROFILE_SECRET_KEY, then GOCLAW_AUTH_PROFILE_SECRET_FILE, then an auto-generated ~/.goclaw/auth-profile.key (created with 0600). See Configuration for the key formats.
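The resolution order can be pictured as a three-step lookup. The Python sketch below only reports which source would win and never reads or prints the key itself. The function name `resolve_key_source` and the source labels are hypothetical, and treating an empty environment variable as unset is an assumption, not documented GoClaw behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple


def resolve_key_source(env: Mapping[str, str], home: Path) -> Tuple[str, str]:
    """Return (source, location) for the auth-profile encryption key.

    Order follows the text above: inline key variable, then key file
    variable, then the auto-generated key file under ~/.goclaw.
    """
    if env.get("GOCLAW_AUTH_PROFILE_SECRET_KEY"):
        # Report the variable name only, never the secret value.
        return ("env-key", "GOCLAW_AUTH_PROFILE_SECRET_KEY")
    secret_file = env.get("GOCLAW_AUTH_PROFILE_SECRET_FILE")
    if secret_file:
        return ("env-file", secret_file)
    return ("auto-generated", str(home / ".goclaw" / "auth-profile.key"))


if __name__ == "__main__":
    print(resolve_key_source(os.environ, Path.home()))
```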
The xAI xai_imagine and xai_video tools and web search can reuse the same xai:default profile, so an account login covers them too without a separate key. Auth profiles are currently supported for the xAI and OpenAI/Codex drivers; other providers continue to use apiKey.
Model Reference Format
Models are referenced as provider/model:
claude/claude-sonnet-4-20250514
ollama-qwen/qwen2.5:7b
openai/gpt-4o
xai/grok-3
hugot-local/KnightsAnalytics/all-MiniLM-L6-v2
The provider name is the key from your providers config, not the provider type.
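Because model names may themselves contain slashes (as in the hugot-local example), a reference is best split on the first slash only. A minimal Python sketch, assuming that rule; `parse_model_ref` is a hypothetical helper, not a GoClaw function:

```python
from typing import Tuple


def parse_model_ref(ref: str) -> Tuple[str, str]:
    """Split 'provider/model' on the first slash only."""
    provider, sep, model = ref.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {ref!r}")
    return provider, model


print(parse_model_ref("ollama-qwen/qwen2.5:7b"))
# ('ollama-qwen', 'qwen2.5:7b')
print(parse_model_ref("hugot-local/KnightsAnalytics/all-MiniLM-L6-v2"))
# ('hugot-local', 'KnightsAnalytics/all-MiniLM-L6-v2')
```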
Cooldown Management
View Status
/llm
Shows all providers, their status, and any cooldowns.
Reset Cooldowns
/llm reset
Clears active provider cooldowns so model chains can retry immediately.
Cooldown Behavior
| Error Type | Initial Cooldown | Max Cooldown |
|---|---|---|
| Rate limit | 30s | 5 min |
| Auth error | 1 hour | 1 hour |
| Server error | 1 min | 10 min |
| Timeout | 30s | 5 min |
Cooldowns use exponential backoff within these ranges.
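To make the ranges concrete, here is a small Python sketch of capped exponential backoff using the values from the table. The doubling factor and the names `COOLDOWNS` and `cooldown_seconds` are assumptions for illustration; the exact growth rate GoClaw uses is not stated here.

```python
# (initial, maximum) cooldown in seconds, taken from the table above.
COOLDOWNS = {
    "rate_limit": (30, 300),
    "auth": (3600, 3600),
    "server_error": (60, 600),
    "timeout": (30, 300),
}


def cooldown_seconds(error_type: str, consecutive_failures: int) -> int:
    """Double the cooldown per consecutive failure, capped at the maximum.

    The factor of 2 is an assumed growth rate, not a documented one.
    """
    initial, maximum = COOLDOWNS[error_type]
    exponent = max(consecutive_failures - 1, 0)
    return min(initial * 2 ** exponent, maximum)


for failures in range(1, 6):
    print(failures, cooldown_seconds("rate_limit", failures))
# 1 30
# 2 60
# 3 120
# 4 240
# 5 300
```

Under this assumption an auth error stays at one hour regardless of the failure count, since its initial and maximum cooldowns are equal.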
See Also
- Anthropic Provider — Claude models, prompt caching
- OpenAI Provider — GPT and compatible APIs
- Ollama Provider — Local inference
- xAI Provider — Grok models
- Embeddings — Embeddings purpose configuration
- Memory Graph — Memory extraction purpose
- Configuration — Full config reference
- Session Management — Summarization config