Transcript Search

GoClaw indexes all conversations into a searchable database, giving your agent persistent memory that survives context compaction.

Overview

Transcript search solves a fundamental problem with LLM agents: context windows are finite, but conversations are forever.

When your context window fills up, GoClaw compacts old messages into summaries. Without transcript search, the details of those conversations are lost to the agent. With transcript search, your agent can query the full history and recover context on demand.

With Lossless Context Management (LCM) enabled, transcript recall also covers the compacted summaries themselves. That means the agent can search summary cues first, then expand back into raw source messages when precision matters.

Key Features

| Feature | Description |
|---|---|
| Hybrid Search | Combines semantic embeddings with BM25 keyword matching |
| Automatic Indexing | New messages indexed every 30 seconds |
| Embedding Backfill | Historical chunks get embeddings added automatically |
| OpenClaw Import | Merges OpenClaw conversation history into the index |
| Real-time Sync | New OpenClaw messages indexed while running side-by-side |
| Configurable Chunking | Control how messages are grouped into searchable units |
| Compacted Summary Recall | Search, inspect, and expand LCM summaries after compaction |

How It Compares

| | GoClaw Transcripts | Claude Insights | ChatGPT Memory |
|---|---|---|---|
| Storage | Local (SQLite) | Cloud | Cloud |
| Privacy | Your machine | Anthropic servers | OpenAI servers |
| Persistence | Permanent | Unknown | Limited |
| Cross-platform | Merges OpenClaw + GoClaw | Single platform | Single platform |
| Search Type | Semantic + Keyword | Unknown | Keyword? |
| Offline | Yes | No | No |

Setup

1. Configure Embedding Provider

Transcript search requires embeddings. Any OpenAI-compatible API works:

Option A: LM Studio (Recommended for local)

{
  "llm": {
    "providers": {
      "lmstudio": {
        "driver": "openai",
        "baseURL": "http://localhost:1234"
      }
    },
    "embeddings": {
      "models": ["lmstudio/text-embedding-nomic-embed-text-v1.5"]
    }
  }
}

Option B: Ollama

{
  "llm": {
    "providers": {
      "ollama": {
        "driver": "ollama",
        "url": "http://localhost:11434"
      }
    },
    "embeddings": {
      "models": ["ollama/nomic-embed-text"]
    }
  }
}

2. Enable Transcript Indexing

{
  "transcript": {
    "enabled": true,
    "indexIntervalSeconds": 30,
    "batchSize": 100,
    "backfillBatchSize": 20
  }
}

3. Verify It’s Working

After starting GoClaw, you should see log lines like these (shown here for the LM Studio setup from Option A):

openai: embedding ready name=lmstudio dimensions=768
memory: provider upgraded from=none to=lmstudio
transcript: starting indexer

And periodically:

transcript: sync completed messagesProcessed=5 chunksCreated=2 progress="500/500 (100%)"
transcript: backfill progress processed=20 remaining=150 elapsed=1.2s
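
If you want to watch these lines from a script, the key=value fields can be pulled out with the standard library. This is a minimal sketch that assumes the log format shown above stays stable; `parse_log_fields` is a hypothetical helper, not part of GoClaw.

```python
import shlex


def parse_log_fields(line: str) -> dict:
    """Extract key=value pairs from a log line (hypothetical helper)."""
    fields = {}
    # shlex.split honors double quotes, so progress="500/500 (100%)"
    # stays a single token instead of being split at the space.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


sync_line = (
    'transcript: sync completed messagesProcessed=5 '
    'chunksCreated=2 progress="500/500 (100%)"'
)
backfill_line = (
    "transcript: backfill progress processed=20 remaining=150 elapsed=1.2s"
)

sync_fields = parse_log_fields(sync_line)
backfill_fields = parse_log_fields(backfill_line)
print(sync_fields["progress"])       # 500/500 (100%)
print(backfill_fields["remaining"])  # 150
```

Values are returned as strings; convert them with `int()` or `float()` as needed.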

Configuration Reference

{
  "transcript": {
    "enabled": true,
    "indexIntervalSeconds": 30,
    "batchSize": 100,
    "backfillBatchSize": 20,
    "maxGroupGapSeconds": 300,
    "maxMessagesPerChunk": 8,
    "maxEmbeddingContentLen": 16000,
    "query": {
      "maxResults": 10,
      "minScore": 0.3,
      "vectorWeight": 0.7,
      "keywordWeight": 0.3
    }
  }
}

Indexing Options

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable transcript indexing |
| indexIntervalSeconds | int | 30 | How often to check for new messages |
| batchSize | int | 100 | Max messages to process per sync cycle |
| backfillBatchSize | int | 10 | Max chunks to add embeddings to per cycle |
| maxGroupGapSeconds | int | 300 | Max time gap (5 min) before starting new chunk |
| maxMessagesPerChunk | int | 8 | Max messages per conversation chunk |
| maxEmbeddingContentLen | int | 16000 | Max chars to send to embedding model |

Search Options

| Field | Type | Default | Description |
|---|---|---|---|
| query.maxResults | int | 10 | Maximum results per search |
| query.minScore | float | 0.3 | Minimum similarity score (0-1) |
| query.vectorWeight | float | 0.7 | Weight for semantic search |
| query.keywordWeight | float | 0.3 | Weight for keyword search |

How It Works

Message → Chunk → Embedding

Messages arrive
    ↓
Group by session + time gap (≤5 min)
    ↓
Create conversation chunks (≤8 messages each)
    ↓
Generate embedding via LM Studio/Ollama
    ↓
Store in SQLite with vector index

Chunking Strategy

Messages are grouped into “conversation chunks” based on:

  1. Same session — Messages from the same conversation
  2. Time proximity — Within maxGroupGapSeconds of each other
  3. Size limit — At most maxMessagesPerChunk messages

This creates semantically coherent units that are:

  • Small enough for accurate embeddings
  • Large enough for context (not single messages)
  • Temporally grouped (related discussion stays together)
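
The three grouping rules above can be sketched in a few lines. This is an illustration, not GoClaw's implementation: the `Message` type and `chunk_messages` function are hypothetical, and the sketch assumes the time gap is measured between consecutive messages in the same session.

```python
from dataclasses import dataclass


@dataclass
class Message:
    session_id: str
    timestamp: float  # seconds since epoch
    text: str


def chunk_messages(messages, max_gap_seconds=300, max_messages=8):
    """Group messages into chunks by session, time gap, and size limit."""
    chunks = []
    current = []
    ordered = sorted(messages, key=lambda m: (m.session_id, m.timestamp))
    for msg in ordered:
        if current:
            prev = current[-1]
            new_session = msg.session_id != prev.session_id
            gap_too_large = msg.timestamp - prev.timestamp > max_gap_seconds
            chunk_full = len(current) >= max_messages
            # Any one of the three rules closes the current chunk.
            if new_session or gap_too_large or chunk_full:
                chunks.append(current)
                current = []
        current.append(msg)
    if current:
        chunks.append(current)
    return chunks


messages = [
    Message("a", 0, "What auth approach should we use?"),
    Message("a", 60, "Based on your requirements..."),
    Message("a", 1000, "Back to the caching question"),
    Message("b", 30, "Different session"),
]
print([len(chunk) for chunk in chunk_messages(messages)])  # [2, 1, 1]
```

The third message in session "a" starts a new chunk because it arrives more than 300 seconds after the previous one.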

Hybrid Search

When searching, GoClaw combines two approaches:

  1. Vector Search (70% weight by default)

    • The query is embedded with the same model used for indexing
    • Cosine similarity is computed against all chunks
    • Finds semantically similar content

  2. Keyword Search (30% weight by default)

    • BM25 full-text search
    • Catches exact matches that vector search might miss
    • Handles names, IDs, and specific terms

Final score: vector * vectorWeight + keyword * keywordWeight (vector * 0.7 + keyword * 0.3 with the defaults)
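
To make the weighting concrete, here is a small sketch of the scoring step. It rests on assumptions the text does not confirm: the keyword score is taken to be already normalized to the 0-1 range (raw BM25 scores are not), and minScore is applied to the combined score. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(vector_score, keyword_score,
                 vector_weight=0.7, keyword_weight=0.3):
    """Weighted sum of the two scores, using the documented default weights."""
    return vector_score * vector_weight + keyword_score * keyword_weight


def rank(candidates, min_score=0.3, max_results=10):
    """Rank (chunk_id, vector_score, keyword_score) tuples by hybrid score.

    Assumption: the threshold filters on the combined score.
    """
    scored = [(cid, hybrid_score(v, k)) for cid, v, k in candidates]
    kept = [item for item in scored if item[1] >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_results]


candidates = [
    ("chunk-a", 0.9, 0.0),  # strong semantic match, no keyword hit
    ("chunk-b", 0.4, 1.0),  # weak semantic match, exact keyword hit
    ("chunk-c", 0.2, 0.1),  # falls below the threshold
]
for chunk_id, score in rank(candidates):
    print(chunk_id, round(score, 2))
```

With these inputs, chunk-a scores about 0.63 and chunk-b about 0.58, while chunk-c (about 0.17) is dropped. Raising keywordWeight would move exact-match chunks like chunk-b ahead.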


Agent Tools

GoClaw now exposes two related recall surfaces through the same transcript tool:

  • Raw transcript search: semantic search, recent messages, exact/hybrid search, gaps, and get_messages
  • Compacted history recall: grep_summaries, describe, and expand

transcript (search action)

Search conversation history:

{
  "tool": "transcript",
  "input": {
    "action": "search",
    "query": "authentication system design decisions"
  }
}

Returns:

{
  "results": [
    {
      "content": "User: What auth approach should we use?\n\nGoClaw: Based on your requirements...",
      "score": 0.82,
      "timestamp": "2024-01-15T14:30:00Z",
      "messageCount": 4
    }
  ],
  "totalResults": 3,
  "searchTime": "45ms"
}

transcript (stats action)

Get indexing statistics:

{
  "tool": "transcript",
  "input": {
    "action": "stats"
  }
}

Returns:

{
  "totalChunks": 495,
  "chunksWithEmbeddings": 495,
  "chunksNeedingEmbeddings": 0,
  "pendingMessages": 0,
  "chunksIndexedSession": 12,
  "lastSync": "2024-01-15T15:00:00Z",
  "provider": "text-embedding-nomic-embed-text-v1.5"
}

Field explanations:

  • totalChunks — Total conversation chunks in database
  • chunksWithEmbeddings — Chunks that have been embedded
  • chunksNeedingEmbeddings — Backlog waiting for embeddings
  • pendingMessages — New messages not yet chunked
  • provider — Current embedding model ("none" if unavailable)
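
A quick way to read these fields together is to compute embedding coverage and check whether semantic search can return anything yet. The helpers below are a hypothetical sketch that only uses the field names from the stats response shown above.

```python
def embedding_coverage(stats: dict) -> float:
    """Fraction of chunks that already have embeddings (0.0 when empty)."""
    total = stats.get("totalChunks", 0)
    if total == 0:
        return 0.0
    return stats.get("chunksWithEmbeddings", 0) / total


def semantic_search_ready(stats: dict) -> bool:
    """True when a provider is active and at least one chunk is embedded."""
    has_provider = stats.get("provider", "none") != "none"
    return has_provider and stats.get("chunksWithEmbeddings", 0) > 0


stats = {
    "totalChunks": 495,
    "chunksWithEmbeddings": 495,
    "chunksNeedingEmbeddings": 0,
    "pendingMessages": 0,
    "provider": "text-embedding-nomic-embed-text-v1.5",
}
print(f"coverage: {embedding_coverage(stats):.0%}")  # coverage: 100%
print(semantic_search_ready(stats))                  # True
```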

transcript (grep_summaries action)

Search compacted summaries from the current session:

{
  "tool": "transcript",
  "input": {
    "action": "grep_summaries",
    "query": "\"error handling\"",
    "limit": 10,
    "sort": "recency"
  }
}

Returns summary IDs you can pass to describe or expand:

{
  "results": [
    {
      "id": "sum_1772237645628_000001",
      "kind": "leaf",
      "depth": 0,
      "timestamp": "2026-04-20T10:30:00Z",
      "preview": "We decided to centralize error handling in the gateway...",
      "matchType": "fts"
    }
  ],
  "count": 1
}

Tip: FTS5 defaults to AND matching, so every term in the query must appear in a summary for it to match. Short queries work best: 1-3 distinctive terms or one quoted phrase.
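
The escaped quotes in the request above are easy to get wrong by hand. The sketch below builds the payload programmatically; `grep_summaries_request` is a hypothetical helper, and it assumes the query string is passed through to FTS5 unchanged, where a double quote inside a phrase is escaped by doubling it.

```python
import json


def grep_summaries_request(terms=(), phrase=None, limit=10, sort="recency"):
    """Build a grep_summaries tool call from a few terms or one phrase."""
    if phrase is not None:
        # Wrapping in double quotes makes FTS5 match the words as a phrase.
        query = '"' + phrase.replace('"', '""') + '"'
    else:
        # Space-separated terms are implicitly ANDed by FTS5.
        query = " ".join(terms)
    return {
        "tool": "transcript",
        "input": {
            "action": "grep_summaries",
            "query": query,
            "limit": limit,
            "sort": sort,
        },
    }


phrase_request = grep_summaries_request(phrase="error handling")
terms_request = grep_summaries_request(terms=["gateway", "retry"])
print(json.dumps(phrase_request["input"]["query"]))  # "\"error handling\""
print(terms_request["input"]["query"])               # gateway retry
```

`json.dumps` takes care of the JSON-level escaping, so the phrase request serializes to the same query string as the example above.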

transcript (describe action)

Inspect one summary node without expanding its children or raw messages:

{
  "tool": "transcript",
  "input": {
    "action": "describe",
    "summaryId": "sum_1772237645628_000001"
  }
}

This is useful when you already have a summary ID in prompt context and want to confirm its depth, lineage, or summary text before drilling deeper.

transcript (expand action)

Expand one or more compacted summaries back into child summaries and, when needed, raw messages:

{
  "tool": "transcript",
  "input": {
    "action": "expand",
    "summaryIds": ["sum_1772237645628_000001"],
    "tokenCap": 4000,
    "maxDepth": 3,
    "includeMessages": false
  }
}

You can also expand by query:

{
  "tool": "transcript",
  "input": {
    "action": "expand",
    "query": "gateway error handling"
  }
}

Key behavior:

  • summaryIds use the sum_ prefix
  • expand is token-capped, so large results may be truncated
  • If a summary is still pending, GoClaw falls back to the raw source messages automatically
  • Use includeMessages: true when exact wording matters

OpenClaw Integration

Initial Import

On startup, GoClaw imports your OpenClaw conversation history:

session: imported OpenClaw messages to SQLite for transcript indexing imported=123

These messages are stored with source='openclaw' and will be indexed like any other messages.

Real-time Sync

While running side-by-side with OpenClaw, new messages in your OpenClaw session are:

  1. Detected via file watcher
  2. Stored in SQLite
  3. Indexed on next sync cycle

session: stored new OpenClaw messages for transcript indexing count=2

This means conversations in OpenClaw become searchable in GoClaw within about one sync interval (~30 seconds with the default indexIntervalSeconds).


Use Cases

Recovering Context After Compaction

Agent: "I don't have the earlier context, but let me search..."
→ transcript(action="search", query="database migration approach")
→ "Found: We decided to use incremental migrations with checksums..."

With LCM enabled, the agent can stay inside the compacted history first:

Agent: "That detail may be in compacted context. Let me check..."
→ transcript(action="grep_summaries", query="database migration")
→ transcript(action="describe", summaryId="sum_...")
→ transcript(action="expand", summaryIds=["sum_..."])
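
The same sequence can be written as a small driver function. This is a sketch under stated assumptions: `call_tool` is a hypothetical callable that sends one transcript tool input and returns the parsed JSON response, and falling back to raw search when no summary matches is an illustrative choice rather than documented behavior.

```python
def recall_from_summaries(call_tool, query, token_cap=4000):
    """Search compacted summaries first, then expand the best match."""
    found = call_tool({"action": "grep_summaries", "query": query, "limit": 10})
    results = found.get("results", [])
    if not results:
        # Nothing in compacted history: fall back to raw transcript search.
        response = call_tool({"action": "search", "query": query})
        return {"source": "search", "response": response}

    top_id = results[0]["id"]
    # Inspect the node before paying the token cost of expanding it.
    description = call_tool({"action": "describe", "summaryId": top_id})
    expanded = call_tool({
        "action": "expand",
        "summaryIds": [top_id],
        "tokenCap": token_cap,
        "includeMessages": True,  # exact wording matters for recall
    })
    return {"source": "summaries", "description": description, "expanded": expanded}


def demo_tool(payload):
    """Stand-in for the real tool call, used only to exercise the flow."""
    if payload["action"] == "grep_summaries":
        return {"results": [{"id": "sum_demo"}], "count": 1}
    return {"action": payload["action"]}


outcome = recall_from_summaries(demo_tool, "database migration")
print(outcome["source"])  # summaries
```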

Finding Past Decisions

User: "What did we decide about the caching strategy?"
Agent: [searches transcripts]
→ "On January 10th, we decided to use Redis with a 5-minute TTL..."

Recalling Specific Discussions

User: "Remember when we talked about that weird bug with timezones?"
Agent: [searches "timezone bug"]
→ "Yes, on December 5th we debugged an issue where..."

Cross-Session Context

Unlike in-context memory, transcript search works across:

  • Multiple sessions
  • Before/after compaction
  • OpenClaw and GoClaw conversations

Troubleshooting

“provider: none” in stats

The embedding provider isn’t initialized. Check:

  1. Embedding model configured in llm.embeddings.models
  2. Provider (LM Studio/Ollama) is running
  3. Model is loaded/available

Look for:

openai: embedding ready name=lmstudio dimensions=768
memory: provider upgraded from=none to=lmstudio

Chunks not getting embeddings

Check chunksNeedingEmbeddings in stats. If high:

  • Increase backfillBatchSize for faster catchup
  • Verify embedding provider is working
  • Check logs for embedding errors
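
To judge whether a larger backfillBatchSize is worth it, you can estimate how long the backlog takes to clear. This is a rough sketch that assumes one backfill batch runs per indexIntervalSeconds and ignores the time spent generating embeddings; `backfill_eta_seconds` is a hypothetical helper.

```python
import math


def backfill_eta_seconds(chunks_needing_embeddings, backfill_batch_size,
                         index_interval_seconds):
    """Estimate seconds until the embedding backlog is cleared.

    Assumption: exactly one backfill batch is processed per index interval.
    """
    if backfill_batch_size <= 0 or index_interval_seconds <= 0:
        raise ValueError("batch size and interval must be positive")
    cycles = math.ceil(chunks_needing_embeddings / backfill_batch_size)
    return cycles * index_interval_seconds


# 150 chunks waiting, 20 per cycle, one cycle every 30 seconds.
print(backfill_eta_seconds(150, 20, 30))  # 240
# Raising the batch size to 50 cuts the estimate to three cycles.
print(backfill_eta_seconds(150, 50, 30))  # 90
```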

Search returns no results

  1. Verify chunks exist: check totalChunks in stats
  2. Lower minScore threshold (try 0.2)
  3. Check query isn’t too vague
  4. Ensure chunksWithEmbeddings > 0

grep_summaries returns no results

Check:

  1. LCM is enabled in session.summarization.compaction.lcm.enabled
  2. The session has actually compacted at least once
  3. Your query is short and distinctive
  4. You are searching the current session’s compacted history, not cross-session data

If you need full-history lookup across all sessions, use transcript search on raw messages instead.

Slow indexing

Embedding generation can be slow. Consider:

  • Using GPU-accelerated inference
  • Increasing indexIntervalSeconds for less frequent batches
  • Using a faster/smaller embedding model

Performance

Storage

  • At least ~3KB per chunk: a 768-dim float32 embedding alone is 3,072 bytes, plus content and metadata
  • 500 chunks ≈ 1.5MB or more
  • SQLite handles millions of rows efficiently

Search Speed

  • Typical search: 20-100ms
  • Scales with chunk count
  • Vector index provides O(log n) lookup

Memory

  • Embeddings stored in SQLite, not memory
  • Indexer runs in background goroutine
  • Minimal runtime overhead
