Transcript Search
GoClaw indexes all conversations into a searchable database, giving your agent persistent memory that survives context compaction.
Overview
Transcript search solves a fundamental problem with LLM agents: context windows are finite, but conversations are forever.
When your context window fills up, GoClaw compacts old messages into summaries. Without transcript search, the details of those conversations are lost to the agent. With transcript search, your agent can query the full history and recover context on demand.
With Lossless Context Management (LCM) enabled, transcript recall also covers the compacted summaries themselves. That means the agent can search summary cues first, then expand back into raw source messages when precision matters.
Key Features
| Feature | Description |
|---|---|
| Hybrid Search | Combines semantic embeddings with BM25 keyword matching |
| Automatic Indexing | New messages indexed every 30 seconds |
| Embedding Backfill | Historical chunks get embeddings added automatically |
| OpenClaw Import | Merges OpenClaw conversation history into the index |
| Real-time Sync | New OpenClaw messages indexed while running side-by-side |
| Configurable Chunking | Control how messages are grouped into searchable units |
| Compacted Summary Recall | Search, inspect, and expand LCM summaries after compaction |
How It Compares
| | GoClaw Transcripts | Claude Insights | ChatGPT Memory |
|---|---|---|---|
| Storage | Local (SQLite) | Cloud | Cloud |
| Privacy | Your machine | Anthropic servers | OpenAI servers |
| Persistence | Permanent | Unknown | Limited |
| Cross-platform | Merges OpenClaw + GoClaw | Single platform | Single platform |
| Search Type | Semantic + Keyword | Unknown | Keyword? |
| Offline | Yes | No | No |
Setup
1. Configure Embedding Provider
Transcript search requires embeddings. Any OpenAI-compatible API works:
Option A: LM Studio (Recommended for local)
{
"llm": {
"providers": {
"lmstudio": {
"driver": "openai",
"baseURL": "http://localhost:1234"
}
},
"embeddings": {
"models": ["lmstudio/text-embedding-nomic-embed-text-v1.5"]
}
}
}
Option B: Ollama
{
"llm": {
"providers": {
"ollama": {
"driver": "ollama",
"url": "http://localhost:11434"
}
},
"embeddings": {
"models": ["ollama/nomic-embed-text"]
}
}
}
2. Enable Transcript Indexing
{
"transcript": {
"enabled": true,
"indexIntervalSeconds": 30,
"batchSize": 100,
"backfillBatchSize": 20
}
}
3. Verify It’s Working
After starting GoClaw, you should see:
openai: embedding ready name=lmstudio dimensions=768
memory: provider upgraded from=none to=lmstudio
transcript: starting indexer
And periodically:
transcript: sync completed messagesProcessed=5 chunksCreated=2 progress="500/500 (100%)"
transcript: backfill progress processed=20 remaining=150 elapsed=1.2s
Configuration Reference
{
"transcript": {
"enabled": true,
"indexIntervalSeconds": 30,
"batchSize": 100,
"backfillBatchSize": 20,
"maxGroupGapSeconds": 300,
"maxMessagesPerChunk": 8,
"maxEmbeddingContentLen": 16000,
"query": {
"maxResults": 10,
"minScore": 0.3,
"vectorWeight": 0.7,
"keywordWeight": 0.3
}
}
}
Indexing Options
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable transcript indexing |
| indexIntervalSeconds | int | 30 | How often to check for new messages (seconds) |
| batchSize | int | 100 | Max messages to process per sync cycle |
| backfillBatchSize | int | 10 | Max chunks to add embeddings to per cycle |
| maxGroupGapSeconds | int | 300 | Max time gap (5 min) before starting new chunk |
| maxMessagesPerChunk | int | 8 | Max messages per conversation chunk |
| maxEmbeddingContentLen | int | 16000 | Max chars to send to embedding model |
Search Options
| Field | Type | Default | Description |
|---|---|---|---|
| query.maxResults | int | 10 | Maximum results per search |
| query.minScore | float | 0.3 | Minimum similarity score (0-1) |
| query.vectorWeight | float | 0.7 | Weight for semantic search |
| query.keywordWeight | float | 0.3 | Weight for keyword search |
How It Works
Message → Chunk → Embedding
Messages arrive
↓
Group by session + time gap (≤5 min)
↓
Create conversation chunks (≤8 messages each)
↓
Generate embedding via LM Studio/Ollama
↓
Store in SQLite with vector index
Chunking Strategy
Messages are grouped into “conversation chunks” based on:
- Same session: messages from the same conversation
- Time proximity: within maxGroupGapSeconds of each other
- Size limit: at most maxMessagesPerChunk messages
This creates semantically coherent units that are:
- Small enough for accurate embeddings
- Large enough for context (not single messages)
- Temporally grouped (related discussion stays together)
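The three grouping rules above can be sketched in a few lines of Python. This is an illustrative sketch, not GoClaw's actual indexer: the `Message` type and `chunk_messages` function are hypothetical names, and the defaults mirror maxGroupGapSeconds (300) and maxMessagesPerChunk (8) from the configuration reference.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical message shape, for illustration only.
    session_id: str
    timestamp: datetime
    text: str


def chunk_messages(messages, max_gap_seconds=300, max_messages=8):
    """Group messages into chunks by session, time gap, and size limit."""
    chunks = []
    current = []
    # Sort so each session's messages are contiguous and chronological.
    for msg in sorted(messages, key=lambda m: (m.session_id, m.timestamp)):
        if current:
            prev = current[-1]
            gap = (msg.timestamp - prev.timestamp).total_seconds()
            starts_new_chunk = (
                msg.session_id != prev.session_id  # different conversation
                or gap > max_gap_seconds           # too long a pause
                or len(current) >= max_messages    # chunk is full
            )
            if starts_new_chunk:
                chunks.append(current)
                current = []
        current.append(msg)
    if current:
        chunks.append(current)
    return chunks
```

Under these assumptions, ten messages sent one minute apart in a single session split into a chunk of eight and a chunk of two, and a message arriving after a pause longer than five minutes starts a new chunk.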
Hybrid Search
When searching, GoClaw combines two approaches:
Vector Search (70% weight by default)
- Query embedded via same model
- Cosine similarity against all chunks
- Finds semantically similar content
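For readers unfamiliar with cosine similarity, here is a minimal pure-Python sketch of the comparison step. The function names and the dictionary of chunk vectors are hypothetical. A real deployment would compare 768-dimensional embeddings and would likely use an optimized vector library or index rather than a Python loop.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Score every chunk against the query, best match first."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the score depends only on direction, a chunk vector pointing the same way as the query scores 1.0 regardless of its length.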
Keyword Search (30% weight by default)
- BM25 full-text search
- Catches exact matches vector might miss
- Handles names, IDs, specific terms
Final score with the default weights: vector * 0.7 + keyword * 0.3
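The weighted blend can be sketched as follows. This is a hedged illustration, not GoClaw's implementation: `hybrid_search` is a hypothetical name, and the sketch assumes both the vector and keyword scores have already been normalized to the 0-1 range (raw BM25 scores are not, so some normalization step is assumed to exist). The defaults mirror query.vectorWeight, query.keywordWeight, query.minScore, and query.maxResults.

```python
def hybrid_search(vector_scores, keyword_scores,
                  vector_weight=0.7, keyword_weight=0.3,
                  min_score=0.3, max_results=10):
    """Blend per-chunk scores, filter by min_score, return the top results."""
    chunk_ids = set(vector_scores) | set(keyword_scores)
    blended = []
    for chunk_id in chunk_ids:
        # A chunk missing from one ranker contributes 0 for that component.
        score = (vector_weight * vector_scores.get(chunk_id, 0.0)
                 + keyword_weight * keyword_scores.get(chunk_id, 0.0))
        if score >= min_score:
            blended.append((chunk_id, score))
    # Sort by score descending, then by id so ties are deterministic.
    blended.sort(key=lambda pair: (-pair[1], pair[0]))
    return blended[:max_results]
```

With the defaults, a chunk with a vector score of 0.2 and an exact keyword hit of 1.0 blends to 0.44 and passes the 0.3 threshold, which illustrates how keyword matching can rescue results that vector search alone ranks low.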
Agent Tools
GoClaw now exposes two related recall surfaces through the same transcript tool:
- Raw transcript search: semantic search, recent messages, exact/hybrid search, gaps, and get_messages
- Compacted history recall: grep_summaries, describe, and expand
transcript (search action)
Search conversation history:
{
"tool": "transcript",
"input": {
"action": "search",
"query": "authentication system design decisions"
}
}
Returns:
{
"results": [
{
"content": "User: What auth approach should we use?\n\nGoClaw: Based on your requirements...",
"score": 0.82,
"timestamp": "2024-01-15T14:30:00Z",
"messageCount": 4
}
],
"totalResults": 3,
"searchTime": "45ms"
}
transcript (stats action)
Get indexing statistics:
{
"tool": "transcript",
"input": {
"action": "stats"
}
}
Returns:
{
"totalChunks": 495,
"chunksWithEmbeddings": 495,
"chunksNeedingEmbeddings": 0,
"pendingMessages": 0,
"chunksIndexedSession": 12,
"lastSync": "2024-01-15T15:00:00Z",
"provider": "text-embedding-nomic-embed-text-v1.5"
}
Field explanations:
- totalChunks: total conversation chunks in the database
- chunksWithEmbeddings: chunks that have been embedded
- chunksNeedingEmbeddings: backlog waiting for embeddings
- pendingMessages: new messages not yet chunked
- provider: current embedding model ("none" if unavailable)
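As a rough illustration of how these fields might be read together, the sketch below turns a stats response into a list of hints. The `diagnose` helper is hypothetical and not part of GoClaw. It only uses the field names shown in the example response above.

```python
def diagnose(stats):
    """Return human-readable hints based on a stats response dict."""
    hints = []
    if stats.get("provider", "none") == "none":
        hints.append("embedding provider unavailable")
    if stats.get("totalChunks", 0) == 0:
        hints.append("no chunks indexed yet")
    backlog = stats.get("chunksNeedingEmbeddings", 0)
    if backlog > 0:
        hints.append(f"{backlog} chunks waiting for embeddings")
    pending = stats.get("pendingMessages", 0)
    if pending > 0:
        hints.append(f"{pending} messages not yet chunked")
    # An empty hint list means nothing looked wrong.
    return hints or ["index is up to date"]
```

Passing the example response from this section yields a single "index is up to date" hint, since the backlog and pending counts are zero and a provider is set.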
transcript (grep_summaries action)
Search compacted summaries from the current session:
{
"tool": "transcript",
"input": {
"action": "grep_summaries",
"query": "\"error handling\"",
"limit": 10,
"sort": "recency"
}
}
Returns summary IDs you can pass to describe or expand:
{
"results": [
{
"id": "sum_1772237645628_000001",
"kind": "leaf",
"depth": 0,
"timestamp": "2026-04-20T10:30:00Z",
"preview": "We decided to centralize error handling in the gateway...",
"matchType": "fts"
}
],
"count": 1
}
Tip: FTS5 defaults to AND matching. Short queries work best: 1-3 distinctive terms or one quoted phrase.
transcript (describe action)
Inspect one summary node without expanding its children or raw messages:
{
"tool": "transcript",
"input": {
"action": "describe",
"summaryId": "sum_1772237645628_000001"
}
}
This is useful when you already have a summary ID in prompt context and want to confirm its depth, lineage, or summary text before drilling deeper.
transcript (expand action)
Expand one or more compacted summaries back into child summaries and, when needed, raw messages:
{
"tool": "transcript",
"input": {
"action": "expand",
"summaryIds": ["sum_1772237645628_000001"],
"tokenCap": 4000,
"maxDepth": 3,
"includeMessages": false
}
}
You can also expand by query:
{
"tool": "transcript",
"input": {
"action": "expand",
"query": "gateway error handling"
}
}
Key behavior:
- summaryIds use the sum_ prefix
- expand is token-capped, so large results may be truncated
- If a summary is still pending, GoClaw falls back to the raw source messages automatically
- Use includeMessages: true when exact wording matters
OpenClaw Integration
Initial Import
On startup, GoClaw imports your OpenClaw conversation history:
session: imported OpenClaw messages to SQLite for transcript indexing imported=123
These messages are stored with source='openclaw' and will be indexed like any other messages.
Real-time Sync
While running side-by-side with OpenClaw, new messages in your OpenClaw session are:
- Detected via file watcher
- Stored in SQLite
- Indexed on next sync cycle
session: stored new OpenClaw messages for transcript indexing count=2
This means conversations in OpenClaw become searchable in GoClaw within ~30 seconds.
Use Cases
Recovering Context After Compaction
Agent: "I don't have the earlier context, but let me search..."
→ transcript(action="search", query="database migration approach")
→ "Found: We decided to use incremental migrations with checksums..."
With LCM enabled, the agent can stay inside the compacted history first:
Agent: "That detail may be in compacted context. Let me check..."
→ transcript(action="grep_summaries", query="database migration")
→ transcript(action="describe", summaryId="sum_...")
→ transcript(action="expand", summaryIds=["sum_..."])
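The same three-step recall chain can be written out as request payloads. This sketch only builds the JSON-shaped dictionaries in the format shown under Agent Tools. It does not call GoClaw, and `transcript_call` and `recall_plan` are hypothetical helper names. In practice the summary ID would come from the grep_summaries response rather than being known up front.

```python
def transcript_call(action, **params):
    """Build a transcript tool request in the shape used in this document."""
    return {"tool": "transcript", "input": {"action": action, **params}}


def recall_plan(query, summary_id):
    """Return the grep -> describe -> expand requests as an ordered list."""
    return [
        # Step 1: find candidate summaries with a short, distinctive query.
        transcript_call("grep_summaries", query=query, limit=10),
        # Step 2: inspect one summary without expanding it.
        transcript_call("describe", summaryId=summary_id),
        # Step 3: expand it; set includeMessages=True when exact wording matters.
        transcript_call("expand", summaryIds=[summary_id], includeMessages=False),
    ]
```

Keeping describe between the search and the expansion lets the agent confirm it has the right summary before spending tokens on expansion.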
Finding Past Decisions
User: "What did we decide about the caching strategy?"
Agent: [searches transcripts]
→ "On January 10th, we decided to use Redis with a 5-minute TTL..."
Recalling Specific Discussions
User: "Remember when we talked about that weird bug with timezones?"
Agent: [searches "timezone bug"]
→ "Yes, on December 5th we debugged an issue where..."
Cross-Session Context
Unlike in-context memory, transcript search works across:
- Multiple sessions
- Before/after compaction
- OpenClaw and GoClaw conversations
Troubleshooting
“provider: none” in stats
The embedding provider isn’t initialized. Check:
- Embedding model configured in llm.embeddings.models
- Provider (LM Studio/Ollama) is running
- Model is loaded/available
Look for:
openai: embedding ready name=lmstudio dimensions=768
memory: provider upgraded from=none to=lmstudio
Chunks not getting embeddings
Check chunksNeedingEmbeddings in stats. If high:
- Increase backfillBatchSize for faster catch-up
- Verify embedding provider is working
- Check logs for embedding errors
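To judge whether a backlog is worth tuning, a back-of-the-envelope estimate can help. The sketch below assumes, and this is an assumption not confirmed here, that one backfill batch runs per index interval and that embedding time within a cycle is negligible. The function name is hypothetical, and the defaults are the values from the sample configuration.

```python
import math


def backfill_eta_seconds(backlog, backfill_batch_size=20, index_interval_seconds=30):
    """Estimate seconds until the embedding backlog is cleared.

    Assumes one backfill batch per index interval (an unverified assumption).
    """
    if backfill_batch_size <= 0:
        raise ValueError("backfill_batch_size must be positive")
    cycles = math.ceil(backlog / backfill_batch_size)
    return cycles * index_interval_seconds
```

Under these assumptions, a backlog of 150 chunks needs 8 cycles (about 4 minutes) at a batch size of 20, and 3 cycles (about 90 seconds) at a batch size of 50.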
Search returns no results
- Verify chunks exist: check totalChunks in stats
- Lower minScore threshold (try 0.2)
- Check query isn’t too vague
- Ensure chunksWithEmbeddings > 0
grep_summaries returns no results
Check:
- LCM is enabled in session.summarization.compaction.lcm.enabled
- The session has actually compacted at least once
- Your query is short and distinctive
- You are searching the current session’s compacted history, not cross-session data
If you need full-history lookup across all sessions, use transcript search on raw messages instead.
Slow indexing
Embedding generation can be slow. Consider:
- Using GPU-accelerated inference
- Increasing indexIntervalSeconds for less frequent batches
- Using a faster/smaller embedding model
Performance
Storage
- ~3KB per chunk for the embedding alone (768 dimensions × 4 bytes as float32), plus content and metadata
- 500 chunks ≈ 1.5MB of embeddings, plus content and metadata
- SQLite handles millions of rows efficiently
Search Speed
- Typical search: 20-100ms
- Scales with chunk count
- Vector index provides O(log n) lookup
Memory
- Embeddings stored in SQLite, not memory
- Indexer runs in background goroutine
- Minimal runtime overhead
See Also
- Embeddings — Embedding configuration
- Agent Memory — Memory system overview
- Memory Search — Search workspace memory files
- Session Management — Compaction and checkpoints
- Configuration — Full config reference