OpenClaw Best Practices & Optimization Guide 2026

You installed OpenClaw. Your agent runs. But is it running well? Most OpenClaw setups ship with default configuration — and that means you're likely overpaying by 70-90%, using too much context, and leaving performance on the table.

OpenClaw is configurable down to the model provider, memory backend, skill allowlist, and compaction schedule. The defaults are safe, but they're not optimal for anything. Here's how to tune OpenClaw for speed, cost, and reliability in 2026.

1. Model Selection and Routing

The single biggest lever for both cost and quality is choosing the right model for each task. OpenClaw defaults to a single model for everything — but not every task needs Claude Opus 4.7.

Task-Aware Routing

Route each prompt to the cheapest model that can handle it:

Task Type	Recommended Model	Cost per 1M Tokens	Saving vs Opus
Trivial edits, formatting, simple Q&A	Gemini 3 Flash, DeepSeek V4 Flash, GPT-5 Mini	$0.10–0.30	~99%
Mid-complexity refactors, test writing, bug analysis	Claude Sonnet 4.6, GPT-5.4, DeepSeek V4 Pro	$3–15	~80%
Architecture, complex debugging, multi-file refactors	Claude Opus 4.7, GPT-5.5	$30–75	Baseline

For hobbyist and personal use, DeepSeek V4 Flash offers the best performance-to-cost ratio at roughly $0.10/M input tokens. For production teams, a router like ClawRouters or OpenRouter can auto-classify prompts and dispatch to the cheapest capable model with under 50ms overhead.
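
To make the routing idea concrete, here is a minimal sketch that assumes a crude keyword classifier. The tier names, keyword lists, and the `route` function are hypothetical illustrations, not part of OpenClaw or any router's API, and the frontier model ID is an assumed placeholder. A production router would use a proper classifier instead of substring matching.

```python
# Hypothetical tiers; model IDs follow the provider/model pattern used in this guide.
TIERS = {
    "trivial": "deepseek/deepseek-v4-flash",
    "mid": "anthropic/claude-sonnet-4.6",
    "frontier": "anthropic/claude-opus-4.7",  # assumed ID, for illustration only
}

# Illustrative keyword lists, checked from most to least demanding tier.
KEYWORDS = [
    ("frontier", ("architecture", "multi-file", "debug")),
    ("mid", ("refactor", "test", "bug")),
]


def route(prompt: str) -> str:
    """Return the model for the most demanding tier the prompt matches.

    Prompts that match nothing fall through to the cheapest tier.
    """
    text = prompt.lower()
    for tier, words in KEYWORDS:
        if any(word in text for word in words):
            return TIERS[tier]
    return TIERS["trivial"]
```

Substring matching is deliberately naive here (a prompt containing "latest" would match "test"), which is why real routers classify prompts with a model, not a word list.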

Model Configuration (2026.4.24+)

As of the April 24, 2026 update, model configuration is provider-catalog driven. The old /models add command is deprecated; set your models in config.yaml instead:

# ~/.openclaw/config.yaml
agents:
  defaults:
    model: deepseek/deepseek-v4-flash
    fallbacks:
      - anthropic/claude-sonnet-4.6
      - openai/gpt-5.4

This sets a primary model with automatic fallback if the primary is unavailable or rate-limited.
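
Conceptually, a fallback chain behaves like the sketch below. This is an illustration of the idea, not OpenClaw's implementation: `complete_with_fallback`, `ModelUnavailable`, and the `flaky_call` stub are all hypothetical names invented for the example.

```python
from __future__ import annotations

from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by the (hypothetical) provider call when a model is down or rate-limited."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model_used, response) from the first success."""
    last_error: Exception | None = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as err:
            last_error = err  # remember why this model failed, then try the next one
    raise RuntimeError("all configured models failed") from last_error


def flaky_call(model: str, prompt: str) -> str:
    """Stand-in for a real provider call: the primary model is always rate-limited."""
    if model == "deepseek/deepseek-v4-flash":
        raise ModelUnavailable("rate limited")
    return f"[{model}] reply to: {prompt}"
```

With the stub above, a call with the three models from the config skips the rate-limited primary and returns the response from the first fallback.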

2. Memory Configuration

OpenClaw's memory system is Markdown files on disk. There's no hidden state — your agent remembers only what's written to MEMORY.md and memory/YYYY-MM-DD.md.

File Structure

File	Purpose	When Loaded
`MEMORY.md`	Long-term durable facts	Every DM session start
`memory/YYYY-MM-DD.md`	Today's running context	Today + yesterday auto-loaded
`memory/YYYY-MM-DD.md` (older)	Historical context	Loaded via `memory_search` on demand
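
The loading rules in the table can be sketched as a small helper. The function name is hypothetical, and it assumes the memory files sit at the root of the workspace directory you pass in; adjust the paths to your own layout.

```python
from datetime import date, timedelta
from pathlib import Path


def files_loaded_at_start(workspace: Path, today: date) -> list[Path]:
    """List the memory files the table says are loaded automatically."""
    yesterday = today - timedelta(days=1)
    return [
        workspace / "MEMORY.md",                           # long-term durable facts
        workspace / "memory" / f"{today.isoformat()}.md",  # today's running context
        workspace / "memory" / f"{yesterday.isoformat()}.md",
    ]
```

Anything older than yesterday is absent from this list, which matches the table: older daily files only reach the agent through `memory_search`.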

Memory Backend Options

Backend	Best For	Setup
Builtin (SQLite)	Default, works out of the box	No setup needed
QMD	Local-first, indexes external dirs	Plugin install
Honcho	Cross-session user modeling	Plugin install
LanceDB	Ollama embeddings, local-first	Plugin install

For most users, the builtin SQLite backend with hybrid search (vector + keyword) is sufficient. Switch to QMD if you need to index directories outside the workspace, or Honcho for multi-agent memory sharing.

Compaction for Long Sessions

Long sessions accumulate context. OpenClaw handles this with compaction — summarizing older conversation turns so the context window doesn't balloon. To configure:

# ~/.openclaw/config.yaml
context:
  compaction:
    enabled: true
    model: deepseek/deepseek-v4-flash  # use cheap model for compaction
    target_tokens: 32000

Use a cheap model like DeepSeek V4 Flash or GPT-5 Mini for compaction. There's no reason to pay Opus rates for summarizing old chat history.
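
For intuition, compaction boils down to something like the following sketch. It is not OpenClaw's actual algorithm: the 4-characters-per-token estimate, the `keep_recent` parameter, and the `summarize` callback (which would be your cheap model) are all assumptions made for illustration.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); real tokenizers differ."""
    return max(1, len(text) // 4)


def compact(
    turns: list[str],
    target_tokens: int,
    summarize: Callable[[list[str]], str],
    keep_recent: int = 4,
) -> list[str]:
    """Replace older turns with one summary when the history exceeds target_tokens."""
    total = sum(estimate_tokens(turn) for turn in turns)
    if total <= target_tokens or len(turns) <= keep_recent:
        return turns  # under budget, or nothing old enough to summarize
    split = len(turns) - keep_recent
    older, recent = turns[:split], turns[split:]
    return [summarize(older)] + recent
```

The most recent turns stay verbatim so the agent keeps its immediate working context; only the older history is collapsed into a summary.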

Memory Flush Model

Set an explicit memory-flush model override to keep housekeeping costs low:

# ~/.openclaw/config.yaml
agents:
  defaults:
    memory_flush_model: deepseek/deepseek-v4-flash

3. Skill Management

Skills define what your agent can do. But more isn't always better — each skill adds tokens to the system prompt.

Allowlists Over Unrestricted Access

In multi-agent setups, control which skills each agent can use:

agents:
  defaults:
    skills: ["github", "weather", "search"]
  list:
    - id: writer
      # inherits github, weather, search
    - id: locked-down
      skills: []  # no skills at all
    - id: custom
      skills: ["docs-search", "code-review"]  # replaces defaults
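
The inheritance rule in that config is easy to get wrong, so here it is spelled out as a sketch. `effective_skills` is a hypothetical helper that mirrors the comments in the config above; it is not an OpenClaw function.

```python
DEFAULT_SKILLS = ["github", "weather", "search"]

AGENTS = [
    {"id": "writer"},                                            # no skills key
    {"id": "locked-down", "skills": []},                         # explicit empty list
    {"id": "custom", "skills": ["docs-search", "code-review"]},  # explicit list
]


def effective_skills(agent: dict, defaults: list[str]) -> list[str]:
    """An absent 'skills' key inherits the defaults; any list, even empty, replaces them."""
    skills = agent.get("skills")
    return list(defaults) if skills is None else list(skills)
```

The key distinction is between a missing key and an empty list: the first inherits everything, the second grants nothing.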

Keep Skill Descriptions Concise

Verbose SKILL.md files bloat every prompt. Aim for:

Description: 1-2 sentences. Specific trigger words.
Instructions: Numbered steps, not paragraphs.
Examples: 1-2 short dialogues, not full transcripts.
Config: Documented, but don't include defaults in the agent-facing body.

A well-written 200-word skill is more effective than a 2,000-word novel.

Audit Installed Skills

Run this monthly to review what's loaded:

ls -la ~/.openclaw/workspace/skills/

Remove skills you don't use. Each unused skill is dead weight in every prompt.
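
If you want numbers to go with the audit, a short script can rank skills by how wordy they are. This sketch assumes each skill lives in its own subdirectory containing a SKILL.md file; `skill_word_counts` is a hypothetical helper, and word count is only a rough proxy for tokens.

```python
from pathlib import Path


def skill_word_counts(skills_dir: Path) -> list[tuple[str, int]]:
    """Return (skill_name, word_count) for each SKILL.md, largest first."""
    counts = []
    for skill_file in skills_dir.glob("*/SKILL.md"):
        words = len(skill_file.read_text(encoding="utf-8").split())
        counts.append((skill_file.parent.name, words))
    return sorted(counts, key=lambda item: item[1], reverse=True)
```

Point it at `Path.home() / ".openclaw/workspace/skills"` and start trimming from the top of the list; anything far above the 200-word mark is a candidate for a rewrite.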

4. Cost Optimization

The Real Cost Breakdown

OpenClaw sessions are expensive because of chain amplification — each step re-reads the growing conversation context. A typical heavy session:

Metric	Value
Steps per session	40-60
Context at step 40	~80K tokens
Cost on Opus 4.7	~$5/session
Sessions per engineer/day	~20
Monthly cost per engineer	~$3,000
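
To see where chain amplification comes from, here is a back-of-the-envelope sketch. It assumes context grows linearly to ~80K tokens by step 40, which is a simplification (real sessions grow unevenly), and both function names are illustrative only.

```python
def cumulative_input_tokens(steps: int, final_context: int) -> int:
    """Total tokens re-read over a session whose context grows linearly to final_context."""
    per_step = final_context / steps
    return round(sum(per_step * n for n in range(1, steps + 1)))


def monthly_cost(cost_per_session: float, sessions_per_day: int, days: int) -> float:
    """Multiply out the per-session figure from the table."""
    return cost_per_session * sessions_per_day * days
```

Under that assumption, `cumulative_input_tokens(40, 80_000)` comes to 1,640,000 tokens, roughly 20x the final context size. The ~$3,000 monthly figure corresponds to `monthly_cost(5, 20, 30)`, that is, 30 days of use; counting only 22 working days gives about $2,200.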

Four Ways to Cut Costs

Use cheaper models for routine tasks. Formatting, simple edits, commit messages, and documentation Q&A run just as well on models costing 1-3% of Opus rates. Route these away from frontier models.

Enable compaction. Without compaction, every tool call appends to the context window indefinitely. With compaction, older turns are summarized into a compact representation.

Set a memory_flush_model. Housekeeping tasks (memory writes, compaction, search) run on your primary model unless you override them. Point them at a cheap model explicitly.

Use token caching. Providers like Anthropic and DeepSeek offer prefix caching at dramatically reduced rates. DeepSeek V4 cached input is $0.0036/M — roughly 1/120th of the cache-miss rate.

Realistic Savings

Usage Level	Direct to Opus 4.7	Optimized	Savings
Light (~500K tokens/month)	~$37.50	~$10-15	60-73%
Medium (~5M tokens/month)	~$375	~$50-100	73-87%
Heavy (~20M tokens/month)	~$1,500	~$200-400	73-87%
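
The savings column can be reproduced from the other columns. The sketch below uses the $75 per 1M tokens Opus rate implied by the table ($37.50 for 500K tokens); the names are illustrative.

```python
OPUS_RATE_PER_M = 75.0  # implied by the table: $37.50 for 0.5M tokens

USAGE = {
    # level: (million tokens per month, optimized cost low, optimized cost high)
    "light": (0.5, 10, 15),
    "medium": (5, 50, 100),
    "heavy": (20, 200, 400),
}


def savings_range(m_tokens: float, low: float, high: float) -> tuple[int, int]:
    """Return (min, max) percentage saved versus sending everything to Opus."""
    baseline = m_tokens * OPUS_RATE_PER_M
    return (round(100 * (1 - high / baseline)), round(100 * (1 - low / baseline)))
```

Plug in your own monthly token volume and optimized spend to see where you land relative to these ranges.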

5. Security and Permissions

Agent Allowlists

Lock down what each agent can access:

agents:
  defaults:
    tools:
      - read
      - write
      - search
  list:
    - id: public-bot
      tools: ["search", "read"]  # no write access
    - id: admin-bot
      tools: ["shell", "write", "read"]  # full access

Skills as Security Boundaries

Skills execute with the same permissions as the OpenClaw process. A malicious skill can read, write, and execute shell commands. Review community skills before installing, especially ones that request API keys or shell access.

Follow the principle of least privilege: give each agent only the skills it needs, and use allowlists to prevent unauthorized access.

Secrets Management

Store API keys in environment variables or a secrets manager, not in SKILL.md files:

# ~/.openclaw/config.yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}  # from environment
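
If you write your own tooling around the config, the `${VAR}` convention is simple to honor. The sketch below is an assumption about how such placeholders could be expanded, not a description of OpenClaw's loader; `expand_env` is a hypothetical helper.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=os.environ) -> str:
    """Replace ${VAR} placeholders with environment values; fail loudly if one is unset."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(lookup, value)
```

Failing on a missing variable is a deliberate choice: a silently empty API key tends to surface later as a confusing authentication error.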

6. Performance Tuning

Gateway Concurrency

If you run multiple agents, increase the Gateway worker pool:

gateway:
  workers: 4
  max_concurrent: 8

Response Cache

Enable response caching to avoid redundant LLM calls for identical prompts:

cache:
  enabled: true
  provider: sqlite
  ttl_seconds: 3600
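
To show what a TTL-based response cache does, here is a minimal sketch built on Python's standard sqlite3 module. The `ResponseCache` class, its schema, and the choice to key on a hash of model plus prompt are assumptions for illustration; OpenClaw's own cache may work differently.

```python
import hashlib
import sqlite3
import time
from typing import Optional


class ResponseCache:
    """Tiny TTL cache keyed by a hash of (model, prompt)."""

    def __init__(self, path: str = ":memory:", ttl_seconds: int = 3600):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, response TEXT, stored_at REAL)"
        )

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached response, or None if missing or older than the TTL."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT response, stored_at FROM cache WHERE key = ?",
            (self._key(model, prompt),),
        ).fetchone()
        if row is None or now - row[1] > self.ttl:
            return None
        return row[0]

    def put(self, model: str, prompt: str, response: str, now: Optional[float] = None) -> None:
        """Store or overwrite the response for this (model, prompt) pair."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (self._key(model, prompt), response, now),
        )
        self.db.commit()
```

Only byte-identical prompts hit a cache like this, so it pays off mainly for scheduled or templated queries, not free-form chat.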

Batch Processing

For scheduled tasks (cron jobs, CI/CD integration), batch operations to reduce per-task overhead. OpenClaw's concurrent sub-agent spawning handles parallel execution cleanly.

Plugin Bloat

Each enabled plugin adds overhead to startup and tool resolution. Disable plugins you don't use:

plugins:
  entries:
    discord: true
    telegram: true
    # browser: false # disabled
    # whatsapp: false # disabled

Quick Reference: Optimization Checklist

[ ] Set a task-appropriate default model (not Opus unless you need it)
[ ] Configure fallback models for reliability
[ ] Enable compaction with a cheap compaction model
[ ] Set explicit memory_flush_model
[ ] Review skills directory — remove unused skills
[ ] Audit agent skill allowlists
[ ] Verify you're using token caching where available
[ ] Set tool-level permissions per agent
[ ] Check plugin list — disable what you don't use
[ ] Enable response caching for repeated queries

Frequently Asked Questions

What's the best default model for OpenClaw in 2026?

For personal use: DeepSeek V4 Flash (roughly $0.10/M input tokens). For production: Claude Sonnet 4.6 ($3-15 per 1M tokens) with DeepSeek V4 Flash as fallback. Reserve Opus 4.7 or GPT-5.5 for complex architectural work.

How do I reduce OpenClaw costs without losing quality?

Set up task-aware routing so easy tasks (formatting, simple edits) go to cheap models and hard tasks (architecture, debugging) go to frontier models. Enable compaction with a cheap model. Most teams save 70-90% without noticeable quality loss.

My agent forgets things — what's wrong?

First confirm the fact was actually written to MEMORY.md or a daily memory file, since the agent remembers nothing else. Then check that the active memory plugin is enabled and an embedding provider is configured. Semantic search needs an API key for at least one supported embedding provider (OpenAI, Gemini, Voyage, Mistral). Without embeddings, memory_search falls back to keyword matching.

Can too many skills slow down OpenClaw?

Yes. Each skill adds its SKILL.md content to the system prompt. With 50+ verbose skills, you can easily add 5,000+ tokens of overhead to every single request — increasing both latency and cost. Keep your skills directory lean.

I can't find the /models add command — where did it go?

The /models add command was deprecated in the 2026.4.24 update. Model configuration is now provider-catalog driven. Configure models through config.yaml under agents.defaults.model instead.

How do I set up different models for different agents?

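Use the agents.list configuration block. Each agent entry can specify its own model, fallbacks, skills, and tools. A value set on an agent entry replaces the default instead of merging with it, and anything the entry omits is inherited from agents.defaults, as the skills example in section 3 shows.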
