Routing Configuration
Configure how requests are routed to different models.
Default Routing
Set the default model for all requests:
{
  "Router": {
    "default": "deepseek,deepseek-chat"
  }
}
Built-in Scenarios
Background Tasks
Route background tasks to a lightweight model:
{
  "Router": {
    "background": "groq,llama-3.3-70b-versatile"
  }
}
Thinking Mode (Plan Mode)
Route thinking-intensive tasks to a more capable model:
{
  "Router": {
    "think": "deepseek,deepseek-chat"
  }
}
Long Context
Route requests whose context exceeds a token threshold:
{
  "Router": {
    "longContextThreshold": 100000,
    "longContext": "gemini,gemini-1.5-pro"
  }
}
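The long-context decision is a simple threshold check. As a rough sketch (the `pickRoute` helper and its shape are illustrative, not claude-code-router's internals, and the default threshold shown is an assumption):

```javascript
// Illustrative sketch: route to the long-context model when the estimated
// token count exceeds longContextThreshold; otherwise use the default route.
// pickRoute is a hypothetical helper, not part of claude-code-router.
function pickRoute(tokenCount, router) {
  const threshold = router.longContextThreshold ?? 60000; // assumed default
  if (router.longContext && tokenCount > threshold) {
    return router.longContext;
  }
  return router.default;
}

const router = {
  default: "deepseek,deepseek-chat",
  longContextThreshold: 100000,
  longContext: "gemini,gemini-1.5-pro",
};

console.log(pickRoute(150000, router)); // "gemini,gemini-1.5-pro"
console.log(pickRoute(2000, router));   // "deepseek,deepseek-chat"
```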
Web Search
Route web search tasks:
{
  "Router": {
    "webSearch": "deepseek,deepseek-chat"
  }
}
Image Tasks
Route image-related tasks:
{
  "Router": {
    "image": "gemini,gemini-1.5-pro"
  }
}
Fallback
You can configure a list of backup models for each routing scenario. When a request fails, the system tries each backup model in sequence until one succeeds:
Basic Configuration
{
  "Router": {
    "default": "deepseek,deepseek-chat",
    "background": "ollama,qwen2.5-coder:latest",
    "think": "deepseek,deepseek-reasoner",
    "longContext": "openrouter,google/gemini-2.5-pro-preview",
    "longContextThreshold": 60000,
    "webSearch": "gemini,gemini-2.5-flash"
  },
  "fallback": {
    "default": [
      "aihubmix,Z/glm-4.5",
      "openrouter,anthropic/claude-sonnet-4"
    ],
    "background": [
      "ollama,qwen2.5-coder:latest"
    ],
    "think": [
      "openrouter,anthropic/claude-3.7-sonnet:thinking"
    ],
    "longContext": [
      "modelscope,Qwen/Qwen3-Coder-480B-A35B-Instruct"
    ],
    "webSearch": [
      "openrouter,anthropic/claude-sonnet-4"
    ]
  }
}
How It Works
- Trigger: When a model request fails for a routing scenario (HTTP error response)
- Auto-switch: The system automatically checks the fallback configuration for that scenario
- Sequential retry: Tries each backup model in order
- Success: Once a model responds successfully, returns immediately
- All failed: If all backup models fail, returns the original error
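The steps above can be sketched as a simple retry loop. This is an illustration of the described behavior, not CCR's actual implementation; `tryWithFallback` and `callModel` are hypothetical names:

```javascript
// Illustrative sketch of the fallback flow: try the primary model, then each
// backup model for the scenario, returning the first success. tryWithFallback
// and callModel are hypothetical helpers, not claude-code-router's code.
async function tryWithFallback(scenario, primary, fallback, callModel) {
  const candidates = [primary, ...(fallback[scenario] ?? [])];
  let originalError;
  for (const model of candidates) {
    try {
      // Success: return immediately with the first model that responds.
      return await callModel(model);
    } catch (err) {
      // Remember the primary model's error; keep trying backups in order.
      originalError ??= err;
    }
  }
  // All failed: surface the original error.
  throw originalError;
}
```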
Configuration Details
- Format: each backup model is specified as `provider,model`
- Validation: backup models must exist in the `Providers` configuration
- Flexibility: different scenarios can have different fallback lists
- Optional: if a scenario doesn't need fallback, omit it or use an empty array
Use Cases
Scenario 1: Primary Model Quota Exhausted
{
  "Router": {
    "default": "openrouter,anthropic/claude-sonnet-4"
  },
  "fallback": {
    "default": [
      "deepseek,deepseek-chat",
      "aihubmix,Z/glm-4.5"
    ]
  }
}
Automatically switches to backup models when the primary model quota is exhausted.
Scenario 2: Service Reliability
{
  "Router": {
    "background": "volcengine,deepseek-v3-250324"
  },
  "fallback": {
    "background": [
      "modelscope,Qwen/Qwen3-Coder-480B-A35B-Instruct",
      "dashscope,qwen3-coder-plus"
    ]
  }
}
Automatically switches to other providers when the primary service fails.
Log Monitoring
The system logs the fallback process in detail:
[warn] Request failed for default, trying 2 fallback models
[info] Trying fallback model: aihubmix,Z/glm-4.5
[warn] Fallback model aihubmix,Z/glm-4.5 failed: API rate limit exceeded
[info] Trying fallback model: openrouter,anthropic/claude-sonnet-4
[info] Fallback model openrouter,anthropic/claude-sonnet-4 succeeded
Important Notes
- Cost consideration: Backup models may incur different costs; configure them accordingly
- Performance differences: Different models may have varying response speeds and quality
- Quota management: Ensure backup models have sufficient quotas
- Testing: Regularly test the availability of backup models
Project-Level Routing
Configure routing per project in `~/.claude/projects/<project-id>/claude-code-router.json`:
{
  "Router": {
    "default": "groq,llama-3.3-70b-versatile"
  }
}
Project-level configuration takes precedence over global configuration.
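As a sketch of this precedence rule (the `resolveRouter` helper is hypothetical, and whether CCR merges per-key or replaces the whole Router block is an assumption here):

```javascript
// Illustrative sketch: project-level Router settings override global ones
// key by key. resolveRouter is hypothetical, not claude-code-router's code.
function resolveRouter(globalConfig, projectConfig) {
  return { ...(globalConfig.Router ?? {}), ...(projectConfig?.Router ?? {}) };
}

const globalConfig = {
  Router: { default: "deepseek,deepseek-chat", think: "deepseek,deepseek-reasoner" },
};
const projectConfig = {
  Router: { default: "groq,llama-3.3-70b-versatile" },
};

console.log(resolveRouter(globalConfig, projectConfig));
// default comes from the project config; think falls back to the global value
```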
Custom Router
Create a custom JavaScript router function:
- Create a router file (e.g., `custom-router.js`):
module.exports = function (config, context) {
  // Analyze the request context
  const { scenario, projectId, tokenCount } = context;

  // Route background tasks to a lightweight model
  if (scenario === 'background') {
    return 'groq,llama-3.3-70b-versatile';
  }

  // Route very long requests to a long-context model
  if (tokenCount > 100000) {
    return 'gemini,gemini-1.5-pro';
  }

  // Default
  return 'deepseek,deepseek-chat';
};
- Set the `CUSTOM_ROUTER_PATH` environment variable:
export CUSTOM_ROUTER_PATH="/path/to/custom-router.js"
Token Counting
The router uses tiktoken (cl100k_base) to estimate request token count. This is used for:
- Determining whether a request exceeds `longContextThreshold`
- Custom routing logic based on token count
Subagent Routing
Specify models for subagents using special tags:
<CCR-SUBAGENT-MODEL>provider,model</CCR-SUBAGENT-MODEL>
Please help me analyze this code...
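A sketch of how such a tag could be extracted from a prompt (the `parseSubagentModel` helper is hypothetical, not CCR's actual parser):

```javascript
// Illustrative sketch: pull a <CCR-SUBAGENT-MODEL>provider,model</CCR-SUBAGENT-MODEL>
// tag out of a prompt. parseSubagentModel is hypothetical, not CCR's code.
function parseSubagentModel(prompt) {
  const match = prompt.match(/<CCR-SUBAGENT-MODEL>([\s\S]*?)<\/CCR-SUBAGENT-MODEL>/);
  if (!match) return { model: null, prompt };
  const parts = match[1].split(",");
  return {
    // Split on the first comma only, since model names may contain slashes etc.
    model: { provider: parts[0].trim(), model: parts.slice(1).join(",").trim() },
    // Strip the tag so the remaining prompt text is passed through unchanged.
    prompt: prompt.replace(match[0], "").trim(),
  };
}

const input =
  "<CCR-SUBAGENT-MODEL>openrouter,anthropic/claude-sonnet-4</CCR-SUBAGENT-MODEL>\n" +
  "Please help me analyze this code...";
console.log(parseSubagentModel(input).model);
// { provider: "openrouter", model: "anthropic/claude-sonnet-4" }
```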
Next Steps
- Transformers - Apply transformations to requests
- Custom Router - Advanced custom routing