LLM Providers

GoClaw supports multiple LLM providers through a unified registry system. This enables flexible model selection, automatic failover, and purpose-specific provider chains.

Supported Providers

| Provider | Type | Use Cases |
|----------|------|-----------|
| Anthropic | Cloud | Agent responses (Claude), extended thinking, prompt caching |
| OpenAI | Cloud/Local | GPT models, OpenAI-compatible APIs (LM Studio, LocalAI) |
| Ollama | Local | Local inference, embeddings, summarization |
| xAI | Cloud | Grok models, stateful conversations, server-side tools |
| Hugot | Local | Built-in embeddings-only provider for semantic search |

Quick Setup

Minimal Config (Single Provider)

For basic usage with Anthropic:

{
  "llm": {
    "providers": {
      "anthropic": {
        "driver": "anthropic",
        "apiKey": "sk-ant-...",
        "promptCaching": true
      }
    },
    "agent": {
      "models": ["anthropic/claude-sonnet-4-20250514"]
    }
  }
}

If you leave the embeddings chain empty, GoClaw automatically restores the built-in hugot-local provider and seeds the default local embeddings model.

Multi-Provider Setup

For advanced setups with multiple providers and purpose-specific chains:

{
  "llm": {
    "providers": {
      "claude": {
        "driver": "anthropic",
        "apiKey": "sk-ant-...",
        "promptCaching": true
      },
      "ollama-qwen": {
        "driver": "ollama",
        "url": "http://localhost:11434"
      },
      "hugot-local": {
        "driver": "hugot",
        "embeddingOnly": true
      }
    },
    "agent": {
      "models": ["claude/claude-sonnet-4-20250514"]
    },
    "summarization": {
      "models": ["ollama-qwen/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
    },
    "embeddings": {
      "models": ["hugot-local/KnightsAnalytics/all-MiniLM-L6-v2"]
    }
  }
}

Purpose Chains

GoClaw routes LLM requests based on purpose:

| Purpose | Config Key | Used For |
|---------|------------|----------|
| agent | agent | Main conversation, tool use |
| summarization | summarization | Compaction summaries, checkpoints |
| embeddings | embeddings | Semantic search (memory, transcripts, Memory Graph) |
| heartbeat | heartbeat | Periodic heartbeat tasks |
| cron | cron | Scheduled cron jobs |
| hass | hass | Home Assistant queries |
| memory_extraction | memoryExtraction | Memory Graph entity extraction |

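If a purpose has no models configured, it falls back to the agent chain. The embeddings purpose is the exception: as noted under Minimal Config, an empty embeddings chain restores the built-in hugot-local provider instead.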
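The fallback rule can be pictured with a small lookup. This is a minimal sketch, not GoClaw's code: resolve_chain is a hypothetical helper, the dictionary mirrors the llm block of the config, and the built-in embeddings default described under Minimal Config is not modeled.

```python
# Hypothetical helper illustrating the documented fallback rule only.
def resolve_chain(llm_config: dict, purpose_key: str) -> list[str]:
    """Return the model chain for a purpose, falling back to the agent chain."""
    models = llm_config.get(purpose_key, {}).get("models", [])
    if models:
        return list(models)
    # No models configured for this purpose: use the agent chain instead.
    return list(llm_config.get("agent", {}).get("models", []))


llm = {
    "agent": {"models": ["claude/claude-sonnet-4-20250514"]},
    "summarization": {
        "models": ["ollama-qwen/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
    },
}

print(resolve_chain(llm, "summarization"))  # the purpose's own chain
print(resolve_chain(llm, "cron"))           # no cron chain, so the agent chain
```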

Each purpose has a model chain: the first model is the primary, and the others are fallbacks tried in order.

{
  "llm": {
    "summarization": {
      "models": [
        "ollama-qwen/qwen2.5:7b",
        "claude/claude-3-haiku-20240307"
      ]
    }
  }
}

Here ollama-qwen/qwen2.5:7b is tried first. If it fails, claude/claude-3-haiku-20240307 is used as the fallback.

Automatic Failover

When a provider fails:

  1. The error is classified (rate limit, auth, timeout, or server error).
  2. The provider enters cooldown with exponential backoff.
  3. The next model in the chain is tried.
  4. After the cooldown expires, the original provider is tried again.
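The four steps can be sketched as a loop over the chain. This is an illustration under stated assumptions, not GoClaw's implementation: ProviderError, complete_with_failover, and the cooldown_for callback are hypothetical names, and the backoff calculation is left to the callback.

```python
import time


class ProviderError(Exception):
    """Hypothetical error carrying a classified kind such as 'rate_limit'."""

    def __init__(self, kind: str):
        super().__init__(kind)
        self.kind = kind


def complete_with_failover(chain, call, cooldown_until, cooldown_for, now=time.monotonic):
    """Try each model in order, skipping providers that are still cooling down.

    chain          -- model references such as "claude/claude-3-haiku-20240307"
    call           -- function(model_ref) returning a response or raising ProviderError
    cooldown_until -- dict: provider name -> time at which it may be retried
    cooldown_for   -- function(kind) returning a cooldown length in seconds
    """
    for ref in chain:
        provider = ref.split("/", 1)[0]
        if cooldown_until.get(provider, 0) > now():
            continue  # step 4: retried only once the cooldown has expired
        try:
            return call(ref)
        except ProviderError as err:
            # steps 1-2: the classified kind decides how long the provider rests
            cooldown_until[provider] = now() + cooldown_for(err.kind)
            # step 3: fall through to the next model in the chain
    raise RuntimeError("all models in the chain failed or are in cooldown")


def fake_call(ref):
    """Stand-in for a real request: the local model is always rate limited."""
    if ref.startswith("ollama-qwen/"):
        raise ProviderError("rate_limit")
    return f"ok from {ref}"


chain = ["ollama-qwen/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
cooldowns = {}
print(complete_with_failover(chain, fake_call, cooldowns, lambda kind: 30))
```

Keeping the cooldown table outside the function means later calls for any purpose skip a provider that is still resting, which matches the per-provider status that /llm reports.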

Check provider status with the /llm command:

LLM Provider Status

claude: healthy
ollama-qwen: cooldown (rate_limit), retry in 2m30s
ollama-embed: healthy

Thinking Levels

Extended thinking/reasoning can be enabled for supported models. This tells the LLM to “think through” complex problems before responding.

Available Levels

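| Level | Description | Anthropic Tokens |
|-------|-------------|------------------|
| off | No extended thinking | 0 |
| minimal | Quick responses | 1,024 |
| low | Light reasoning | 4,096 |
| medium | Balanced (default) | 10,000 |
| high | Deep reasoning | 25,000 |
| xhigh | Maximum effort | 50,000 |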
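For Anthropic, a level ultimately becomes a token budget on the request. The sketch below assumes the level maps directly onto the thinking field of the Anthropic Messages API. How GoClaw actually builds the request is not shown here, and anthropic_thinking_param is a hypothetical name.

```python
# Token budgets copied from the table above.
THINKING_BUDGETS = {
    "off": 0,
    "minimal": 1024,
    "low": 4096,
    "medium": 10000,
    "high": 25000,
    "xhigh": 50000,
}


def anthropic_thinking_param(level: str = "medium"):
    """Return the `thinking` request field for a level, or None when thinking is off."""
    budget = THINKING_BUDGETS[level]  # raises KeyError for an unknown level
    if budget == 0:
        return None
    return {"type": "enabled", "budget_tokens": budget}


print(anthropic_thinking_param("low"))
print(anthropic_thinking_param("off"))
```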

Configuration

Per-user in users.json:

{
  "users": [
    {
      "name": "Alice",
      "role": "owner",
      "thinking": true,
      "thinkingLevel": "medium"
    }
  ]
}

Or dynamically via Telegram/TUI settings.

Provider Support

| Provider | Thinking Support |
|----------|------------------|
| Anthropic | Yes (Claude 3.5+), token budget |
| OpenAI | Via OpenRouter reasoning |
| Ollama | Model-dependent |
| xAI | Yes (grok-3-mini), effort levels |

Provider Configuration

Common Options

All providers support:

{
  "driver": "anthropic",       // Required: provider driver
  "apiKey": "...",             // API key (or env var)
  "maxTokens": 8192,           // Output limit override
  "contextTokens": 200000,     // Context window override
  "timeoutSeconds": 300,       // Request timeout
  "trace": true,               // Enable request tracing
  "dumpOnSuccess": false       // Keep request dumps on success
}

Provider-Specific Options

Anthropic:

{
  "driver": "anthropic",
  "promptCaching": true        // Enable prompt caching (reduces cost)
}

OpenAI:

{
  "driver": "openai",
  "baseURL": "https://api.openai.com/v1"  // Or compatible endpoint
}

Ollama:

{
  "driver": "ollama",
  "url": "http://localhost:11434",
  "embeddingOnly": true        // Skip chat availability check
}

Hugot (embeddings only):

{
  "driver": "hugot",
  "embeddingOnly": true
}

Hugot is the built-in local embeddings provider. It is intended for the embeddings purpose, not for agent or summarization.

xAI:

{
  "driver": "xai",
  "serverToolsAllowed": ["web_search"],  // Server-side tools
  "maxTurns": 5                // Max agentic turns
}

Account Login (Auth Profiles)

For xAI Grok and OpenAI/Codex, you can sign in with your account instead of pasting a raw API key into goclaw.json. GoClaw stores the resulting tokens in an encrypted auth profile and the provider config only references it.

Set this up in Web or TUI setup, or in the config editor, which runs the whole login flow inline: pick the method, use Start Login / Status / Cancel / Disconnect, open or copy the login URL, and paste back a redirect URL or code. Besides the driver, the provider config keeps just two auth fields, authMethod and authProfile:

{
  "llm": {
    "providers": {
      "xai": {
        "driver": "xai",
        "authMethod": "device-code",
        "authProfile": "xai:default"
      },
      "openai": {
        "driver": "openai",
        "authMethod": "oauth",
        "authProfile": "openai-codex:default"
      }
    }
  }
}

Auth Methods

| Method | Description |
|--------|-------------|
| api-key | Use the provider’s apiKey field (the default; no profile needed) |
| device-code | Pair this install with your account using a short device code; best for remote or headless servers where a browser callback isn’t reachable |
| oauth | Log in through the browser; falls back to pasting the redirect URL or code manually when a loopback callback isn’t reachable |

What Gets Stored Where

| Field | Lives in | Holds |
|-------|----------|-------|
| authMethod, authProfile | goclaw.json | Which method to use and which profile to read (xai:default, openai-codex:default) |
| OAuth/device tokens | ~/.goclaw/auth-profiles.enc.json (encrypted) | The actual refresh/access tokens |

The encryption key is resolved from GOCLAW_AUTH_PROFILE_SECRET_KEY, then GOCLAW_AUTH_PROFILE_SECRET_FILE, then an auto-generated ~/.goclaw/auth-profile.key (created with 0600). See Configuration for the key formats.
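The lookup order reads as a simple precedence chain. The sketch below only reports which source would win. It does not read, parse, or create a key, and it assumes an empty environment variable counts as unset. auth_profile_key_source is a hypothetical name.

```python
import os
from pathlib import Path


def auth_profile_key_source(env=None, home=None):
    """Return (source, value) for the first configured key location.

    Mirrors the documented order only: the key variable, then the file
    variable, then the auto-generated key file under ~/.goclaw.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if env.get("GOCLAW_AUTH_PROFILE_SECRET_KEY"):
        return ("env", env["GOCLAW_AUTH_PROFILE_SECRET_KEY"])
    if env.get("GOCLAW_AUTH_PROFILE_SECRET_FILE"):
        return ("file", env["GOCLAW_AUTH_PROFILE_SECRET_FILE"])
    return ("auto", str(home / ".goclaw" / "auth-profile.key"))


# Print only the winning source, never the secret itself.
print(auth_profile_key_source()[0])
```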

The xAI xai_imagine and xai_video tools and web search can reuse the same xai:default profile, so an account login covers them too without a separate key. Auth profiles are currently supported for the xAI and OpenAI/Codex drivers; other providers continue to use apiKey.


Model Reference Format

Models are referenced as provider/model:

claude/claude-sonnet-4-20250514
ollama-qwen/qwen2.5:7b
openai/gpt-4o
xai/grok-3
hugot-local/KnightsAnalytics/all-MiniLM-L6-v2

The provider name is the key from your providers config, not the provider type.
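Because a model name can itself contain a slash (as in the hugot-local example), a reference is best read as everything before the first slash plus everything after it. A minimal sketch, assuming provider keys never contain a slash; split_model_ref is a hypothetical helper:

```python
def split_model_ref(ref: str) -> tuple[str, str]:
    """Split "provider/model" at the first slash only."""
    provider, sep, model = ref.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {ref!r}")
    return provider, model


print(split_model_ref("ollama-qwen/qwen2.5:7b"))
print(split_model_ref("hugot-local/KnightsAnalytics/all-MiniLM-L6-v2"))
```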


Cooldown Management

View Status

/llm

Shows all providers, their status, and any cooldowns.

Reset Cooldowns

/llm reset

Clears active provider cooldowns so model chains can retry immediately.

Cooldown Behavior

| Error Type | Initial Cooldown | Max Cooldown |
|------------|------------------|--------------|
| Rate limit | 30s | 5 min |
| Auth error | 1 hour | 1 hour |
| Server error | 1 min | 10 min |
| Timeout | 30s | 5 min |

Cooldowns use exponential backoff within these ranges.
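As a worked example of backoff within these ranges, the sketch below doubles the cooldown on each consecutive failure and caps it at the maximum. The doubling factor and the error-kind keys other than rate_limit are assumptions; the growth rate GoClaw actually uses is not stated here.

```python
# Seconds as (initial, maximum), taken from the table above.
COOLDOWNS = {
    "rate_limit": (30, 300),
    "auth": (3600, 3600),
    "server_error": (60, 600),
    "timeout": (30, 300),
}


def cooldown_seconds(kind: str, consecutive_failures: int, factor: int = 2) -> int:
    """Cooldown after the Nth consecutive failure, capped at the maximum."""
    initial, maximum = COOLDOWNS[kind]
    exponent = max(consecutive_failures - 1, 0)  # first failure uses the initial value
    return min(initial * factor ** exponent, maximum)


for n in range(1, 6):
    print(n, cooldown_seconds("rate_limit", n))
```

With a factor of 2, a rate-limited provider rests for 30s, 60s, 120s, and 240s, then stays at the 5-minute cap. Auth errors start at their cap, so they always rest for the full hour.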


See Also