Ollama Provider

The Ollama provider connects GoClaw to locally-running Ollama for inference, embeddings, and summarization.

Configuration

{
  "llm": {
    "providers": {
      "ollama": {
        "driver": "ollama",
        "url": "http://localhost:11434"
      }
    },
    "summarization": {
      "models": ["ollama/qwen2.5:7b"]
    },
    "embeddings": {
      "models": ["ollama/nomic-embed-text"]
    }
  }
}

Options

Field           Type    Default  Description
url             string  -        Ollama server URL
maxTokens       int     -        Output token limit
contextTokens   int     auto     Context window override (queried from Ollama by default)
timeoutSeconds  int     300      Request timeout in seconds
embeddingOnly   bool    false    Use the provider only for embeddings
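
A provider entry that sets every option might look like this (the numeric values are illustrative, not recommendations):

```json
{
  "llm": {
    "providers": {
      "ollama": {
        "driver": "ollama",
        "url": "http://localhost:11434",
        "maxTokens": 4096,
        "contextTokens": 32768,
        "timeoutSeconds": 600,
        "embeddingOnly": false
      }
    }
  }
}
```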

Use Cases

Summarization

Ollama is commonly used for compaction summaries to avoid cloud API costs:

{
  "llm": {
    "providers": {
      "ollama-summarize": {
        "driver": "ollama",
        "url": "http://localhost:11434"
      },
      "claude": {
        "driver": "anthropic",
        "apiKey": "YOUR_API_KEY"
      }
    },
    "agent": {
      "models": ["claude/claude-sonnet-4-20250514"]
    },
    "summarization": {
      "models": ["ollama-summarize/qwen2.5:7b", "claude/claude-3-haiku-20240307"]
    }
  }
}

This routes summarization to the local Ollama model (free) and falls back to Anthropic's Haiku when Ollama is unavailable.

Embeddings

Embeddings power semantic search (the memory_search tool and transcript search). Enable it in the memory config:

{
  "memory": {
    "enabled": true,
    "query": {
      "maxResults": 6,
      "minScore": 0.35
    }
  }
}
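
The query settings above read as: return at most maxResults chunks whose similarity score is at least minScore. A minimal sketch of that filter, assuming cosine similarity over embedding vectors (the function and names are illustrative, not GoClaw's internals):

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, chunks, max_results=6, min_score=0.35):
    # chunks: list of (text, vector) pairs, e.g. produced by nomic-embed-text.
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored = [(s, t) for s, t in scored if s >= min_score]  # minScore cutoff
    scored.sort(reverse=True)
    return scored[:max_results]                             # maxResults cap
```

With the config above, a search keeps at most 6 results and drops anything scoring below 0.35.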

The embedding provider itself is selected in the LLM config:

{
  "llm": {
    "providers": {
      "ollama-embed": {
        "driver": "ollama",
        "url": "http://localhost:11434",
        "embeddingOnly": true
      }
    },
    "embeddings": {
      "models": ["ollama-embed/nomic-embed-text"]
    }
  }
}

Agent (Local-Only)

For fully local operation:

{
  "llm": {
    "providers": {
      "ollama": {
        "driver": "ollama",
        "url": "http://localhost:11434",
        "contextTokens": 131072
      }
    },
    "agent": {
      "models": ["ollama/qwen2.5:32b"]
    },
    "summarization": {
      "models": ["ollama/qwen2.5:7b"]
    },
    "embeddings": {
      "models": ["ollama/nomic-embed-text"]
    }
  }
}

Recommended Models

Use Case       Model             Notes
Summarization  qwen2.5:7b        Good balance of speed and quality
Summarization  llama3.2:3b       Faster, lower quality
Embeddings     nomic-embed-text  Best for semantic search
Embeddings     all-minilm        Faster, smaller vectors
Agent          qwen2.5:32b       Large context, good tool use

Context Window

The provider queries the model's context size from Ollama automatically. Override it with contextTokens if needed:

{
  "providers": {
    "ollama": {
      "driver": "ollama",
      "url": "http://localhost:11434",
      "contextTokens": 131072
    }
  }
}
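
To see what context size the server reports before overriding it, Ollama's /api/show endpoint returns model metadata; the context length appears under model_info as an architecture-prefixed key. A sketch of extracting it from such a response (the sample payload below is illustrative, not captured from a real server):

```python
import json

# Illustrative fragment of an /api/show response; the real payload comes
# from POST /api/show with a body like {"name": "qwen2.5:7b"}.
sample = json.loads("""
{
  "model_info": {
    "qwen2.context_length": 32768,
    "qwen2.embedding_length": 3584
  }
}
""")

def context_length(show_response):
    # The key is prefixed with the model architecture (e.g. "qwen2."),
    # so match on the ".context_length" suffix rather than a fixed name.
    info = show_response.get("model_info", {})
    for key, value in info.items():
        if key.endswith(".context_length"):
            return value
    return None

print(context_length(sample))
```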

Troubleshooting

“Ollama not available”

  1. Check Ollama is running:
    curl http://localhost:11434/api/tags
    
  2. Start Ollama:
    ollama serve
    
  3. Verify URL in config matches server address
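
The checks above can be scripted as a simple availability probe against the /api/tags endpoint (URL and timeout values are illustrative):

```python
import urllib.request
import urllib.error

def ollama_available(url="http://localhost:11434", timeout=2):
    # GET /api/tags returns HTTP 200 when the Ollama server is up.
    try:
        with urllib.request.urlopen(url + "/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: server not reachable.
        return False
```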

“context deadline exceeded”

Increase timeout or use a smaller model:

{
  "providers": {
    "ollama": {
      "driver": "ollama",
      "url": "http://localhost:11434",
      "timeoutSeconds": 600
    }
  }
}

Model Not Found

Pull the model first:

ollama pull qwen2.5:7b
ollama pull nomic-embed-text

Slow Performance

  • Use GPU acceleration if available
  • Try smaller models (7b instead of 14b)
  • Reduce contextTokens if not needed

See Also