Lesson 05 — Multi-Model & Failover: Get the Most Out of AI on a Budget
Goal: Configure multiple model providers, implement automatic failover, use affordable models for everyday tasks, and automatically upgrade for complex ones.
Why Multiple Models?
| Scenario | Recommended Model |
|---|---|
| Everyday Q&A, translation | MiniMax M2.1 (affordable) |
| Complex reasoning, code architecture | Claude Opus or MiniMax M2.5 (pricier but more powerful) |
| Primary model down / rate-limited | Automatically switch to backup model (no interruption) |
Scenario 1: MiniMax Primary + Claude Fallback
Edit `~/.openclaw/openclaw.json`:

```json
{
  "gateway": { "mode": "local" },
  "env": {
    "MINIMAX_API_KEY": "${MINIMAX_API_KEY}",
    "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}"
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/MiniMax-M2.1",
        "fallbacks": ["anthropic/claude-sonnet-4-6"]
      }
    }
  },
  "models": {
    "mode": "merge",
    "providers": {
      "minimax": {
        "baseUrl": "https://api.minimax.io/anthropic",
        "apiKey": "${MINIMAX_API_KEY}",
        "api": "anthropic-messages",
        "models": [
          {
            "id": "MiniMax-M2.1",
            "name": "MiniMax M2.1",
            "reasoning": false,
            "input": ["text"],
            "cost": { "input": 15, "output": 60, "cacheRead": 2, "cacheWrite": 10 },
            "contextWindow": 200000,
            "maxTokens": 8192
          }
        ]
      }
    }
  }
}
```

When MiniMax returns an error or times out, OpenClaw automatically switches to Claude Sonnet — completely transparent to the user.
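The failover behavior described above can be sketched in a few lines of plain Python. This is an illustrative pattern only, not OpenClaw's actual implementation; `call_model` is a hypothetical stand-in for a provider request:

```python
def call_with_failover(prompt, primary, fallbacks, call_model):
    """Try the primary model first; on any error, walk the fallback list in order."""
    for model in [primary, *fallbacks]:
        try:
            return model, call_model(model, prompt)
        except Exception:
            continue  # provider error or timeout: move on to the next model
    raise RuntimeError("all models in the failover chain failed")

# Simulated providers: the primary always times out, the fallback answers.
def fake_call(model, prompt):
    if model == "minimax/MiniMax-M2.1":
        raise TimeoutError("primary timed out")
    return f"answer from {model}"

used, reply = call_with_failover(
    "hello",
    primary="minimax/MiniMax-M2.1",
    fallbacks=["anthropic/claude-sonnet-4-6"],
    call_model=fake_call,
)
print(used)  # anthropic/claude-sonnet-4-6
```

The key property, which the config above gives you for free, is that the caller never sees the primary's failure — only the first successful answer.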
Scenario 2: Use Different Models for Different Tasks
Switch models manually during a conversation:

```shell
# Switch to a reasoning model for complex tasks
pnpm openclaw models set minimax/MiniMax-M2.5

# Switch back to the cheaper one when done
pnpm openclaw models set minimax/MiniMax-M2.1
```

Or switch via slash command (if a model-switching Skill is enabled):

```
/model M2.5
Help me design the architecture for a distributed caching system
```
Scenario 3: Claude Opus Primary + MiniMax Fallback (Economy Mode)
```json
{
  "agents": {
    "defaults": {
      "models": {
        "anthropic/claude-opus-4-6": { "alias": "opus" },
        "minimax/MiniMax-M2.1": { "alias": "minimax" }
      },
      "model": {
        "primary": "anthropic/claude-opus-4-6",
        "fallbacks": ["minimax/MiniMax-M2.1"]
      }
    }
  }
}
```

Scenario 4: Full Multi-Model Config (Triple Redundancy)
```json
{
  "env": {
    "MINIMAX_API_KEY": "${MINIMAX_API_KEY}",
    "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}",
    "OPENAI_API_KEY": "${OPENAI_API_KEY}"
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/MiniMax-M2.1",
        "fallbacks": [
          "anthropic/claude-sonnet-4-6",
          "openai/gpt-4o"
        ]
      }
    }
  }
}
```

If any one provider goes down, OpenClaw automatically moves to the next — no interruptions, ever.
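As a quick sanity check, the failover order in a config like the one above can be read back with standard-library Python. The snippet embeds a trimmed copy of the config; to check your real setup, load `~/.openclaw/openclaw.json` instead:

```python
import json

# Trimmed copy of the triple-redundancy config above.
config_text = """
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/MiniMax-M2.1",
        "fallbacks": ["anthropic/claude-sonnet-4-6", "openai/gpt-4o"]
      }
    }
  }
}
"""

model_cfg = json.loads(config_text)["agents"]["defaults"]["model"]
chain = [model_cfg["primary"], *model_cfg["fallbacks"]]
print(" -> ".join(chain))
# minimax/MiniMax-M2.1 -> anthropic/claude-sonnet-4-6 -> openai/gpt-4o
```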
Check Which Model You're Currently Using
In a conversation, send:

```
/status
```

It returns something like:

```
Model: minimax/MiniMax-M2.1
Context: 4,821 / 200,000 tokens
Session: main
```
View Token Usage and Cost
```
/usage
```
Returns the token consumption and estimated cost for the current session, helping you keep costs under control.
Thinking Level
MiniMax M2.5 and Claude Opus support "deep thinking" mode, which consumes more tokens but produces more accurate answers:
```
/think high
Analyze the time complexity of this code and suggest optimizations:
[paste code]
```
Thinking levels: `off` / `minimal` / `low` / `medium` / `high` / `xhigh`
Use `off` for everyday Q&A; use `high` for complex tasks.
Model Selection Guide
| Task Type | Recommended Model | Thinking Level |
|---|---|---|
| Everyday Q&A | MiniMax M2.1 | off |
| Code generation | MiniMax M2.1 Lightning | low |
| Code review | MiniMax M2.5 | medium |
| System design | MiniMax M2.5 / Claude Opus | high |
| Math reasoning | MiniMax M2.5 | xhigh |
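The guide above can also be encoded as a small lookup table, e.g. for scripting which model and thinking level to request. This is purely illustrative — the task names and the mapping mirror the table, not any OpenClaw API:

```python
# Mirrors the Model Selection Guide table: task type -> (model, thinking level).
SELECTION_GUIDE = {
    "everyday-qa": ("MiniMax M2.1", "off"),
    "code-generation": ("MiniMax M2.1 Lightning", "low"),
    "code-review": ("MiniMax M2.5", "medium"),
    "system-design": ("MiniMax M2.5 / Claude Opus", "high"),
    "math-reasoning": ("MiniMax M2.5", "xhigh"),
}

def pick(task_type):
    """Return (model, thinking_level), defaulting to the cheap everyday setting."""
    return SELECTION_GUIDE.get(task_type, ("MiniMax M2.1", "off"))

print(pick("code-review"))   # ('MiniMax M2.5', 'medium')
print(pick("unknown-task"))  # ('MiniMax M2.1', 'off')
```

Defaulting to the cheapest model keeps unknown or trivial tasks from silently burning budget on a reasoning model.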
FAQ
What triggers a failover?
A failover is automatically triggered when the primary model returns: HTTP 5xx errors, request timeouts, rate limiting (HTTP 429), or model service unavailability. OpenClaw tries each entry in the `fallbacks` array in order, completely transparent to the user.
How do I know which model is actually being used?
Send `/status` in a conversation — the response will show the full ID of the currently active model (e.g., `minimax/MiniMax-M2.1`). If a failover occurred, it will be logged: `tail -f /tmp/openclaw/openclaw-$(date +%Y-%m-%d).log | grep fallback`.
How is token cost calculated for different models?
Cost is calculated according to each provider's official pricing, configured in the `models[].cost` field in `openclaw.json` (unit: cents per million tokens). Run `/usage` to see token consumption and estimated cost for the current session.
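Using the `cost` fields from the MiniMax provider entry earlier in this lesson, and interpreting them as cents per million tokens as the FAQ states (verify against your provider's actual price sheet), a session estimate works out like this:

```python
# "cost" fields from the MiniMax provider entry above (cents per million tokens)
COST = {"input": 15, "output": 60}

def estimate_cents(input_tokens, output_tokens, cost=COST):
    """Estimated session cost in cents, ignoring cache reads/writes for simplicity."""
    return (input_tokens * cost["input"] + output_tokens * cost["output"]) / 1_000_000

# e.g. a session with 100,000 input tokens and 8,192 output tokens
print(round(estimate_cents(100_000, 8_192), 4))  # 1.9915
```

Cache reads and writes (the `cacheRead` / `cacheWrite` fields) would be added as two more terms of the same form.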
Can I manually force-switch models mid-conversation?
Yes. Use `pnpm openclaw models set <model-ID>` to switch instantly without restarting the gateway. Or configure a "model switch" Skill (`SKILL.md`) so the AI understands natural language commands like "switch to M2.5."
Do I need to configure API Keys for all providers?
Only configure the providers you actually plan to use. If you're using MiniMax as primary with Claude as fallback, you only need those two keys. Unconfigured providers won't be called and won't cause errors.
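To confirm you have set exactly the keys your setup needs, a quick check like the following works. The helper is hypothetical; the environment-variable names match the configs in this lesson:

```python
import os

# Keys required by a MiniMax-primary / Claude-fallback setup (from the configs above)
REQUIRED = ["MINIMAX_API_KEY", "ANTHROPIC_API_KEY"]

def missing_keys(env, required=REQUIRED):
    """Return the required key names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]

# Check the real environment:
print(missing_keys(os.environ) or "all required keys are set")
```

Add `OPENAI_API_KEY` to the list only if you adopt the triple-redundancy config.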