OpenClaw Best Practices & Optimization Guide 2026
You installed OpenClaw. Your agent runs. But is it running well? Most OpenClaw setups ship with default configuration, and that means you're likely overpaying by 70-90%, using too much context, and leaving performance on the table.
OpenClaw is configurable down to the model provider, memory backend, skill allowlist, and compaction schedule. The defaults are safe, but they're not optimal for anything. Here's how to tune OpenClaw for speed, cost, and reliability in 2026.
1. Model Selection and Routing
The single biggest lever for both cost and quality is choosing the right model for each task. OpenClaw defaults to a single model for everything, but not every task needs Claude Opus 4.7.
Task-Aware Routing
Route each prompt to the cheapest model that can handle it:
| Task Type | Recommended Model | Cost per 1M Tokens | Saving vs Opus |
|---|---|---|---|
| Trivial edits, formatting, simple Q&A | Gemini 3 Flash, DeepSeek V4 Flash, GPT-5 Mini | $0.10-0.30 | ~99% |
| Mid-complexity refactors, test writing, bug analysis | Claude Sonnet 4.6, GPT-5.4, DeepSeek V4 Pro | $3-15 | ~80% |
| Architecture, complex debugging, multi-file refactors | Claude Opus 4.7, GPT-5.5 | $30-75 | Baseline |
For hobbyist and personal use, DeepSeek V4 Flash offers the best performance-to-cost ratio at roughly $0.10/M input tokens. For production teams, a router like ClawRouters or OpenRouter can auto-classify prompts and dispatch to the cheapest capable model with under 50ms overhead.
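To make the routing idea concrete, here is a minimal sketch of a tiered router. It is not how ClawRouters or OpenRouter classify prompts: the tier names, the keyword lists, and the `classify` and `route` functions are all hypothetical, and only the model names come from the table above.

```python
# Hypothetical tiers mapped to one model from each row of the table above.
TIER_MODELS = {
    "trivial": "DeepSeek V4 Flash",
    "mid": "Claude Sonnet 4.6",
    "complex": "Claude Opus 4.7",
}

# Crude keyword hints (an assumption for illustration only).
COMPLEX_HINTS = ("architecture", "multi-file", "debug")
MID_HINTS = ("refactor", "test", "bug")


def classify(prompt: str) -> str:
    """Return the cheapest tier whose hints do not rule the prompt out."""
    text = prompt.lower()
    # Check the most expensive tier first so "debug" is not caught by "bug".
    if any(hint in text for hint in COMPLEX_HINTS):
        return "complex"
    if any(hint in text for hint in MID_HINTS):
        return "mid"
    return "trivial"


def route(prompt: str) -> str:
    """Pick the model for a prompt based on its tier."""
    return TIER_MODELS[classify(prompt)]
```

Substring matching misfires easily (for example, "latest" contains "test"), so a production router would likely use a small classifier model rather than keywords.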
Model Configuration (2026.4.24+)
As of the April 24, 2026 update, model configuration is provider-catalog driven. The old /models add command is deprecated. Configure models through the provider manifest:
```yaml
# ~/.openclaw/config.yaml
agents:
  defaults:
    model: deepseek/deepseek-v4-flash
    fallbacks:
      - anthropic/claude-sonnet-4.6
      - openai/gpt-5.4
```
This sets a primary model with automatic fallback if the primary is unavailable or rate-limited.
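The fallback behavior described above can be pictured roughly as follows. This is a sketch of the general pattern, not OpenClaw's internals: `ProviderUnavailable`, `complete_with_fallback`, and the `call_model` callable are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence, Tuple


class ProviderUnavailable(Exception):
    """Hypothetical error for a provider that is down or rate-limited."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order and return (model_used, response)."""
    last_error = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ProviderUnavailable as err:
            # Remember the failure and move on to the next configured model.
            last_error = err
    raise RuntimeError("all configured models failed") from last_error
```

Passing the primary model first and the fallbacks after it reproduces the ordering in the configuration above.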
2. Memory Configuration
OpenClaw's memory system is Markdown files on disk. There's no hidden state: your agent remembers only what's written to MEMORY.md and memory/YYYY-MM-DD.md.
File Structure
| File | Purpose | When Loaded |
|---|---|---|
| MEMORY.md | Long-term durable facts | Every DM session start |
| memory/YYYY-MM-DD.md | Today's running context | Today + yesterday auto-loaded |
| memory/YYYY-MM-DD.md (older) | Historical context | Loaded via memory_search on demand |
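As an illustration of the loading rule in the table, the following sketch lists the files that would be auto-loaded on a given day. The `auto_loaded_files` helper is hypothetical; it only restates the table and does not read OpenClaw's actual loader.

```python
from datetime import date, timedelta


def auto_loaded_files(today: date) -> list[str]:
    """Files loaded at session start: MEMORY.md plus yesterday's and today's notes."""
    yesterday = today - timedelta(days=1)
    return [
        "MEMORY.md",
        f"memory/{yesterday.isoformat()}.md",
        f"memory/{today.isoformat()}.md",
    ]
```

Anything older than yesterday stays out of the prompt until memory_search pulls it in.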
Memory Backend Options
| Backend | Best For | Setup |
|---|---|---|
| Builtin (SQLite) | Default, works out of box | No setup needed |
| QMD | Local-first, indexes external dirs | Plugin install |
| Honcho | Cross-session user modeling | Plugin install |
| LanceDB | Ollama embeddings, local-first | Plugin install |
For most users, the builtin SQLite backend with hybrid search (vector + keyword) is sufficient. Switch to QMD if you need to index directories outside the workspace, or Honcho for multi-agent memory sharing.
Compaction for Long Sessions
Long sessions accumulate context. OpenClaw handles this with compaction: summarizing older conversation turns so the context window doesn't balloon. To configure:
```yaml
# ~/.openclaw/config.yaml
context:
  compaction:
    enabled: true
    model: deepseek/deepseek-v4-flash  # use cheap model for compaction
    target_tokens: 32000
```
Use a cheap model like DeepSeek V4 Flash or GPT-5 Mini for compaction. There's no reason to pay Opus rates for summarizing old chat history.
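Conceptually, compaction keeps recent turns verbatim and replaces everything older with a summary. The sketch below shows that idea under stated assumptions: `estimate_tokens` is a rough 4-characters-per-token heuristic, `summarize` stands in for a call to the cheap compaction model, and none of this reflects OpenClaw's real implementation.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def compact(
    turns: list[str],
    target_tokens: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Summarize older turns once the conversation exceeds target_tokens."""
    if sum(estimate_tokens(t) for t in turns) <= target_tokens:
        return turns

    kept: list[str] = []
    budget = target_tokens
    # Walk backwards so the most recent turns survive verbatim.
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    kept.reverse()

    older = turns[: len(turns) - len(kept)]
    # The summary's own token cost is ignored here for simplicity.
    return [summarize(older)] + kept
```

Because `summarize` is the only step that calls a model, routing it to a cheap model is what keeps compaction inexpensive.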
Memory Flush Model
Set an explicit memory-flush model override to keep housekeeping costs low:
```yaml
# ~/.openclaw/config.yaml
agents:
  defaults:
    memory_flush_model: deepseek/deepseek-v4-flash
```
3. Skill Management
Skills define what your agent can do. But more isn't always better: each skill adds tokens to the system prompt.
Allowlists Over Unrestricted Access
In multi-agent setups, control which skills each agent can use:
```yaml
agents:
  defaults:
    skills: ["github", "weather", "search"]
  list:
    - id: writer
      # inherits github, weather, search
    - id: locked-down
      skills: []  # no skills at all
    - id: custom
      skills: ["docs-search", "code-review"]  # replaces defaults
```
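The comments in the configuration above imply a simple resolution rule: an agent without a skills key inherits the defaults, while an explicit list (even an empty one) replaces them. A minimal sketch of that rule follows; the `effective_skills` function is hypothetical and only mirrors those comments.

```python
def effective_skills(agent: dict, defaults: list[str]) -> list[str]:
    """Resolve the skill allowlist for one agent entry."""
    # A missing key inherits the defaults; a present key replaces them,
    # so `skills: []` means "no skills", not "use defaults".
    if "skills" not in agent:
        return list(defaults)
    return list(agent["skills"])
```

The distinction between a missing key and an empty list is the part most likely to cause surprises when editing the file by hand.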
Keep Skill Descriptions Concise
Verbose SKILL.md files bloat every prompt. Aim for:
- Description: 1-2 sentences. Specific trigger words.
- Instructions: Numbered steps, not paragraphs.
- Examples: 1-2 short dialogues, not full transcripts.
- Config: Documented, but don't include defaults in the agent-facing body.
A well-written 200-word skill is more effective than a 2,000-word novel.
Audit Installed Skills
Run this monthly to review what's loaded:
```shell
ls -la ~/.openclaw/workspace/skills/
```
Remove skills you don't use. Each unused skill is dead weight in every prompt.
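A directory listing shows what is installed but not how much prompt space each skill takes. The sketch below counts words per SKILL.md so oversized skills stand out. It assumes each skill lives in its own subdirectory containing a SKILL.md file, which is a layout assumption rather than something stated above, and `skill_word_counts` is a hypothetical helper.

```python
from pathlib import Path


def skill_word_counts(skills_dir: Path) -> dict[str, int]:
    """Map each skill directory name to the word count of its SKILL.md."""
    counts: dict[str, int] = {}
    # Assumed layout: <skills_dir>/<skill-name>/SKILL.md
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        text = skill_file.read_text(encoding="utf-8")
        counts[skill_file.parent.name] = len(text.split())
    return counts


if __name__ == "__main__":
    root = Path.home() / ".openclaw" / "workspace" / "skills"
    for name, words in skill_word_counts(root).items():
        # Flag anything well past the ~200-word target mentioned above.
        marker = "  <-- consider trimming" if words > 200 else ""
        print(f"{name}: {words} words{marker}")
```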
4. Cost Optimization
The Real Cost Breakdown
OpenClaw sessions are expensive because of chain amplification: each step re-reads the growing conversation context. A typical heavy session:
| Metric | Value |
|---|---|
| Steps per session | 40-60 |
| Context at step 40 | ~80K tokens |
| Cost on Opus 4.7 | ~$5/session |
| Sessions per engineer/day | ~20 |
| Monthly cost per engineer | ~$3,000 |
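The arithmetic behind the table can be sketched as follows. The linear-growth model in `cumulative_input_tokens` is an assumption made for illustration (real sessions grow unevenly), and both function names are hypothetical.

```python
def cumulative_input_tokens(steps: int, final_context: int) -> int:
    """Total tokens re-read across a session, assuming linear context growth."""
    per_step = final_context / steps
    # Step k re-reads everything accumulated so far: per_step * k tokens.
    return round(sum(per_step * k for k in range(1, steps + 1)))


def monthly_cost(cost_per_session: float, sessions_per_day: int, days: int) -> float:
    """Monthly spend for one engineer."""
    return cost_per_session * sessions_per_day * days
```

Under that assumption, a session that ends at 80K tokens after 40 steps has processed about 1.64M input tokens in total, roughly 20 times its final context size. The ~$3,000 monthly figure corresponds to $5 x 20 sessions x 30 days; counting only 22 working days would give about $2,200.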
Four Ways to Cut Costs
- Use cheaper models for routine tasks. Formatting, simple edits, commit messages, and documentation Q&A run identically on models costing 1-3% of Opus. Route these away from frontier models.
- Enable compaction. Without compaction, every tool call appends to the context window indefinitely. With compaction, older turns are summarized into a compact representation.
- Set a memory_flush_model. Housekeeping tasks (memory writes, compaction, search) run on your default model unless you override them. Point them at a cheap model explicitly.
- Use token caching. Providers like Anthropic and DeepSeek offer prefix caching at dramatically reduced rates. DeepSeek V4 cached input is $0.0036/M, roughly 1/120th of the cache-miss rate.
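How much caching helps depends on how often requests actually hit the cache. A small sketch of the blended input rate, with `blended_input_rate` as a hypothetical helper and all rates expressed in dollars per 1M tokens:

```python
def blended_input_rate(miss_rate: float, hit_rate: float, cache_hit_ratio: float) -> float:
    """Average input price per 1M tokens for a given cache hit ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Weighted average of the cached and uncached prices.
    return cache_hit_ratio * hit_rate + (1.0 - cache_hit_ratio) * miss_rate
```

Long agent sessions tend to share a large stable prefix (system prompt, skills, early turns), which is why the hit ratio can be high enough for the cached rate to dominate.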
Realistic Savings
| Usage Level | Direct to Opus 4.7 | Optimized | Savings |
|---|---|---|---|
| Light (~500K tokens/month) | ~$37.50 | ~$10-15 | 60-73% |
| Medium (~5M tokens/month) | ~$375 | ~$50-100 | 73-87% |
| Heavy (~20M tokens/month) | ~$1,500 | ~$200-400 | 73-87% |
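The savings column follows directly from the two cost columns, and the baseline column works out to $75 per 1M tokens at every usage level. A short check, using a hypothetical `savings_range` helper:

```python
def savings_range(baseline: float, optimized_low: float, optimized_high: float) -> tuple[int, int]:
    """Return (worst-case, best-case) savings as whole percentages."""
    worst = (baseline - optimized_high) / baseline
    best = (baseline - optimized_low) / baseline
    return round(worst * 100), round(best * 100)


# Rows from the table above: (baseline, optimized low, optimized high).
rows = {
    "light": (37.50, 10, 15),
    "medium": (375, 50, 100),
    "heavy": (1500, 200, 400),
}
for level, (baseline, low, high) in rows.items():
    print(level, savings_range(baseline, low, high))
```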
5. Security and Permissions
Agent Allowlists
Lock down what each agent can access:
```yaml
agents:
  defaults:
    tools:
      - read
      - write
      - search
  list:
    - id: public-bot
      tools: ["search", "read"]  # no write access
    - id: admin-bot
      tools: ["shell", "write", "read"]  # full access
```
Skills as Security Boundaries
Skills execute with the same permissions as the OpenClaw process. A malicious skill can read, write, and execute shell commands. Review community skills before installing, especially ones that request API keys or shell access.
Follow the principle of least privilege: give each agent only the skills it needs, and use allowlists to prevent unauthorized access.
Secrets Management
Store API keys in environment variables or a secrets manager, not in SKILL.md files:
```yaml
# ~/.openclaw/config.yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}  # from environment
```
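The `${OPENAI_API_KEY}` placeholder is resolved from the environment at load time. How OpenClaw performs that substitution is not described here; the sketch below only illustrates the general pattern with Python's standard `string.Template`, and `expand_env` is a hypothetical helper.

```python
import os
from string import Template
from typing import Mapping, Optional


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} placeholders with values from the environment."""
    source = os.environ if env is None else env
    # substitute() raises KeyError for a missing variable, which is safer
    # than silently sending an empty API key to a provider.
    return Template(value).substitute(source)
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a misconfigured key surfaces at startup instead of as an authentication error mid-session.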
6. Performance Tuning
Gateway Concurrency
If you run multiple agents, increase the Gateway worker pool:
```yaml
gateway:
  workers: 4
  max_concurrent: 8
```
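A concurrency cap like max_concurrent is commonly implemented with a semaphore. The sketch below shows that pattern with asyncio. It is an illustration of the concept, not the Gateway's implementation, and `run_limited` is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def run_limited(
    jobs: Sequence[Callable[[], Awaitable[object]]],
    max_concurrent: int,
) -> list:
    """Run all jobs, allowing at most max_concurrent to execute at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[object]]) -> object:
        # Each job waits here until a slot is free.
        async with semaphore:
            return await job()

    # gather() preserves the order of the input jobs in its results.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Raising the cap helps only while the bottleneck is waiting on providers; past that point, extra concurrency mostly runs into provider rate limits.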
Response Cache
Enable response caching to avoid redundant LLM calls for identical prompts:
```yaml
cache:
  enabled: true
  provider: sqlite
  ttl_seconds: 3600
```
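To show what a TTL-based response cache does, here is a minimal in-memory sketch. It uses a dictionary rather than SQLite, keys entries by a hash of model and prompt, and injects the clock so expiry can be tested; the `ResponseCache` class is hypothetical and not OpenClaw's cache.

```python
import hashlib
import time
from typing import Callable, Optional


class ResponseCache:
    """Cache responses for identical (model, prompt) pairs until they expire."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash so that long prompts do not become long dictionary keys.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> Optional[str]:
        key = self._key(model, prompt)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        return response

    def put(self, model: str, prompt: str, response: str) -> None:
        self._entries[self._key(model, prompt)] = (self._clock(), response)
```

Only byte-identical prompts hit such a cache, so it pays off mainly for scheduled or templated queries rather than free-form conversation.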
Batch Processing
For scheduled tasks (cron jobs, CI/CD integration), batch operations to reduce per-task overhead. OpenClaw's concurrent sub-agent spawning handles parallel execution cleanly.
Plugin Bloat
Each enabled plugin adds overhead to startup and tool resolution. Disable plugins you don't use:
```yaml
plugins:
  entries:
    discord: true
    telegram: true
    # browser: false   # disabled
    # whatsapp: false  # disabled
```
Quick Reference: Optimization Checklist
- [ ] Set a task-appropriate default model (not Opus unless you need it)
- [ ] Configure fallback models for reliability
- [ ] Enable compaction with a cheap compaction model
- [ ] Set explicit memory_flush_model
- [ ] Review skills directory and remove unused skills
- [ ] Audit agent skill allowlists
- [ ] Verify you're using token caching where available
- [ ] Set tool-level permissions per agent
- [ ] Check plugin list and disable what you don't use
- [ ] Enable response caching for repeated queries
Frequently Asked Questions
What's the best default model for OpenClaw in 2026?
For personal use: DeepSeek V4 Flash (roughly $0.10/M input tokens). For production: Claude Sonnet 4.6 ($3-15/M tokens) with DeepSeek V4 Flash as fallback. Reserve Opus 4.7 or GPT-5.5 for complex architectural work.
How do I reduce OpenClaw costs without losing quality?
Set up task-aware routing so easy tasks (formatting, simple edits) go to cheap models and hard tasks (architecture, debugging) go to frontier models. Enable compaction with a cheap model. Most teams save 70-90% without noticeable quality loss.
My agent forgets things. What's wrong?
Check that the active memory plugin is enabled and an embedding provider is configured. Semantic search requires an API key for at least one supported embedding provider (OpenAI, Gemini, Voyage, Mistral). Without embeddings, memory_search falls back to keyword matching.
Can too many skills slow down OpenClaw?
Yes. Each skill adds its SKILL.md content to the system prompt. With 50+ verbose skills, you can easily add 5,000+ tokens of overhead to every single request, increasing both latency and cost. Keep your skills directory lean.
I can't find the /models add command โ where did it go?
The /models add command was deprecated in the 2026.4.24 update. Model configuration is now provider-catalog driven. Configure models through config.yaml under agents.defaults.model instead.
How do I set up different models for different agents?
Use the agents.list configuration block. Each agent entry can specify its own model, fallbacks, skills, and tools, fully independent from defaults and other agents.