
Smart Routing

Auto-select the best model per request across OpenAI, Anthropic, and Google — no model ID required.

Smart Routing picks the optimal model for every request. When you enable it, RoutePlex inspects the prompt and selects the best model for the task — you don't pick a model ID.

Enable It

Native API — set mode: "routeplex-ai":

```json
{
  "mode": "routeplex-ai",
  "messages": [{"role": "user", "content": "Your prompt"}]
}
```
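As a minimal sketch, the request above can be assembled with Python's standard library. The `/v1/chat` endpoint path is an assumption for illustration; only the `api.routeplex.com/v1` base URL appears elsewhere in these docs, so confirm the native route in your API reference.

```python
import json
import urllib.request

# Assumed endpoint path -- confirm the native chat route in your API reference.
NATIVE_CHAT_URL = "https://api.routeplex.com/v1/chat"

def build_smart_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a native-API request that lets Smart Routing pick the model."""
    payload = {
        "mode": "routeplex-ai",  # no model ID required
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        NATIVE_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_smart_request("Your prompt", "YOUR_API_KEY")
# Send with urllib.request.urlopen(req) when ready.
```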

OpenAI-compatible API — set model: "routeplex-ai":

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeplex.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="routeplex-ai",
    messages=[{"role": "user", "content": "Explain DNS"}],
)
```

Official SDKs — omit the model parameter:

```python
response = client.chat("Your prompt")
```

```javascript
const res = await client.chat("Your prompt");
```

Strategies

Smart Routing supports five strategies that steer the selection toward different priorities:

| Strategy | Optimizes for |
|---|---|
| auto (default) | Adaptive: selection tuned to the specific prompt |
| balanced | Well-rounded trade-off across cost, quality, and speed |
| cost | Minimum cost per request |
| quality | Best available model for the task |
| speed | Lowest response latency |

auto is the default and recommended for most workloads — it adapts to each request rather than applying a fixed bias. Use the other strategies when you want consistent behavior for a specific use case (e.g. cost for high-volume classification, speed for chat UIs).
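For instance, a caller might centralize that choice in a small helper. The workload names below are hypothetical, not part of the RoutePlex API; only the strategy strings come from the table above.

```python
# Illustrative mapping from workload type to Smart Routing strategy.
# Workload names are hypothetical; strategy strings match the docs.
STRATEGY_BY_WORKLOAD = {
    "classification": "cost",   # high-volume, simple prompts
    "chat_ui": "speed",         # latency-sensitive interfaces
    "code_review": "quality",   # correctness matters most
}

def pick_strategy(workload: str) -> str:
    """Default to the adaptive strategy when no fixed bias applies."""
    return STRATEGY_BY_WORKLOAD.get(workload, "auto")
```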

Passing a Strategy

Native API — top-level strategy field:

```json
{
  "mode": "routeplex-ai",
  "strategy": "quality",
  "messages": [{"role": "user", "content": "Review this code for bugs"}]
}
```

OpenAI-compatible API — X-RoutePlex-Strategy header:

```python
response = client.chat.completions.create(
    model="routeplex-ai",
    messages=[{"role": "user", "content": "Review this code"}],
    extra_headers={"X-RoutePlex-Strategy": "quality"},
)
```

Official SDKs — strategy parameter:

```python
response = client.chat("Analyze this dataset", strategy="quality")
```

```javascript
await client.chat("Classify sentiment", { strategy: "cost" });
```

Fallback Behavior

When the selected model fails (5xx, timeout, or provider unavailable), RoutePlex automatically tries the next-best model from the candidate pool. Fallback is transparent — your application receives one response either way.

Non-retriable errors (invalid API key, invalid model, content policy) fail fast without fallback.
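The policy described above can be sketched as a predicate. The specific status codes and error-code strings here are assumptions about typical provider failures, not the gateway's documented internals:

```python
# Provider-side failures that trigger fallback (per the docs: 5xx,
# timeout, or provider unavailable). Exact codes are illustrative.
RETRIABLE_STATUSES = {500, 502, 503, 504, 408}

# Non-retriable errors fail fast with no fallback. Names are assumed.
NON_RETRIABLE_ERRORS = {"invalid_api_key", "invalid_model", "content_policy"}

def should_fall_back(status: int, error_code: str = "") -> bool:
    """True when RoutePlex would try the next-best candidate model."""
    if error_code in NON_RETRIABLE_ERRORS:
        return False  # fail fast, no fallback
    return status in RETRIABLE_STATUSES
```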

Response Fields

Every chat response includes which model handled the request so you can audit decisions in your logs:

```json
{
  "success": true,
  "data": {
    "id": "req_abc123",
    "output": "Hello! How can I help?",
    "model_used": "gpt-4.1-nano",
    "usage": {
      "input_tokens": 10,
      "output_tokens": 8,
      "total_tokens": 18,
      "cost_usd": 0.000027
    }
  },
  "meta": {
    "request_id": "req_abc123",
    "timestamp": "2026-04-24T10:00:00Z"
  }
}
```
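For example, a logging hook might pull the audit fields out of the response body. The field names below match the payload shown above:

```python
import json

def audit_fields(body: str) -> tuple:
    """Return (model_used, cost_usd) for logging routing decisions."""
    data = json.loads(body)["data"]
    return data["model_used"], data["usage"]["cost_usd"]

# Sample body using the fields documented above.
sample = """{
  "success": true,
  "data": {
    "id": "req_abc123",
    "output": "Hello! How can I help?",
    "model_used": "gpt-4.1-nano",
    "usage": {"input_tokens": 10, "output_tokens": 8,
              "total_tokens": 18, "cost_usd": 0.000027}
  }
}"""

model, cost = audit_fields(sample)
```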

Opt Out Per-Request

Pin a specific model for a request by using manual mode. In the native API, set mode: "manual" with a model:

```json
{
  "mode": "manual",
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Hello!"}]
}
```

With the SDKs, pass the model parameter directly:

```python
response = client.chat("Hello!", model="gpt-4o")
```

See Routing Modes for the auto-vs-manual comparison and Self-Learning Routing for how auto-selection adapts to your workload over time.
