Routing Configuration

Configure how requests are routed to different models.

Default Routing

Set the default model for all requests:

{
  "Router": {
    "default": "deepseek,deepseek-chat"
  }
}

Built-in Scenarios

Background Tasks

Route background tasks to a lightweight model:

{
  "Router": {
    "background": "groq,llama-3.3-70b-versatile"
  }
}

Thinking Mode (Plan Mode)

Route thinking-intensive tasks to a more capable model:

{
  "Router": {
    "think": "deepseek,deepseek-chat"
  }
}

Long Context

Route requests with long context:

{
  "Router": {
    "longContextThreshold": 100000,
    "longContext": "gemini,gemini-1.5-pro"
  }
}

Web Search

Route web search tasks:

{
  "Router": {
    "webSearch": "deepseek,deepseek-chat"
  }
}

Image Tasks

Route image-related tasks:

{
  "Router": {
    "image": "gemini,gemini-1.5-pro"
  }
}

Fallback

When a request fails, you can configure a list of backup models. The system will try each model in sequence until one succeeds:

Basic Configuration

{
  "Router": {
    "default": "deepseek,deepseek-chat",
    "background": "ollama,qwen2.5-coder:latest",
    "think": "deepseek,deepseek-reasoner",
    "longContext": "openrouter,google/gemini-2.5-pro-preview",
    "longContextThreshold": 60000,
    "webSearch": "gemini,gemini-2.5-flash"
  },
  "fallback": {
    "default": [
      "aihubmix,Z/glm-4.5",
      "openrouter,anthropic/claude-sonnet-4"
    ],
    "background": [
      "ollama,qwen2.5-coder:latest"
    ],
    "think": [
      "openrouter,anthropic/claude-3.7-sonnet:thinking"
    ],
    "longContext": [
      "modelscope,Qwen/Qwen3-Coder-480B-A35B-Instruct"
    ],
    "webSearch": [
      "openrouter,anthropic/claude-sonnet-4"
    ]
  }
}

How It Works

Trigger: When a model request fails for a routing scenario (HTTP error response)
Auto-switch: The system automatically checks the fallback configuration for that scenario
Sequential retry: Tries each backup model in order
Success: Once a model responds successfully, returns immediately
All failed: If all backup models fail, returns the original error

Configuration Details

Format: Each backup model format is provider,model
Validation: Backup models must exist in the Providers configuration
Flexibility: Different scenarios can have different fallback lists
Optional: If a scenario doesn't need fallback, omit it or use an empty array

Use Cases

Scenario 1: Primary Model Quota Exhausted

{
  "Router": {
    "default": "openrouter,anthropic/claude-sonnet-4"
  },
  "fallback": {
    "default": [
      "deepseek,deepseek-chat",
      "aihubmix,Z/glm-4.5"
    ]
  }
}

Automatically switches to backup models when the primary model quota is exhausted.

Scenario 2: Service Reliability

{
  "Router": {
    "background": "volcengine,deepseek-v3-250324"
  },
  "fallback": {
    "background": [
      "modelscope,Qwen/Qwen3-Coder-480B-A35B-Instruct",
      "dashscope,qwen3-coder-plus"
    ]
  }
}

Automatically switches to other providers when the primary service fails.

Log Monitoring

The system logs detailed fallback process:

[warn] Request failed for default, trying 2 fallback models
[info] Trying fallback model: aihubmix,Z/glm-4.5
[warn] Fallback model aihubmix,Z/glm-4.5 failed: API rate limit exceeded
[info] Trying fallback model: openrouter,anthropic/claude-sonnet-4
[info] Fallback model openrouter,anthropic/claude-sonnet-4 succeeded

Important Notes

Cost consideration: Backup models may incur different costs, configure appropriately
Performance differences: Different models may have varying response speeds and quality
Quota management: Ensure backup models have sufficient quotas
Testing: Regularly test the availability of backup models

Project-Level Routing

Configure routing per project in ~/.claude/projects/<project-id>/claude-code-router.json:

{
  "Router": {
    "default": "groq,llama-3.3-70b-versatile"
  }
}

Project-level configuration takes precedence over global configuration.

Custom Router

Create a custom JavaScript router function:

Create a router file (e.g., custom-router.js):

module.exports = function(config, context) {
  // Analyze the request context
  const { scenario, projectId, tokenCount } = context;

  // Custom routing logic
  if (scenario === 'background') {
    return 'groq,llama-3.3-70b-versatile';
  }

  if (tokenCount > 100000) {
    return 'gemini,gemini-1.5-pro';
  }

  // Default
  return 'deepseek,deepseek-chat';
};

Set the CUSTOM_ROUTER_PATH environment variable:

export CUSTOM_ROUTER_PATH="/path/to/custom-router.js"

Token Counting

The router uses tiktoken (cl100k_base) to estimate request token count. This is used for:

Determining if a request exceeds longContextThreshold
Custom routing logic based on token count

Subagent Routing

Specify models for subagents using special tags:

<CCR-SUBAGENT-MODEL>provider,model</CCR-SUBAGENT-MODEL>
Please help me analyze this code...

Next Steps

Transformers - Apply transformations to requests
Custom Router - Advanced custom routing

Default Routing​

Built-in Scenarios​

Background Tasks​

Thinking Mode (Plan Mode)​

Long Context​

Web Search​

Image Tasks​

Fallback​

Basic Configuration​

How It Works​

Configuration Details​

Use Cases​

Scenario 1: Primary Model Quota Exhausted​

Scenario 2: Service Reliability​

Log Monitoring​

Important Notes​

Project-Level Routing​

Custom Router​

Token Counting​

Subagent Routing​

Next Steps​