Real-world use cases and comprehensive code examples for common scenarios.
Official SDKs
RoutePlex provides official SDKs for Python and Node.js. Both are zero-dependency and support all API features.
pip install routeplex # Python
npm install @routeplex/node # Node.js
Auto-Routing (Default)
The default behavior — RoutePlex analyzes your prompt to pick the optimal model. A simple question gets a fast, cheap model. A complex reasoning task gets a capable one.
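As a rough mental model only (RoutePlex's actual classifier runs server-side and is not public), you can think of auto-routing as a complexity score over the prompt. Every keyword and model name below is a placeholder:

```python
# Illustrative toy heuristic — NOT RoutePlex's real routing logic.
REASONING_HINTS = ("prove", "derive", "implement", "analyze", "step by step")

def pick_model(prompt: str) -> str:
    """Route long or reasoning-heavy prompts to a stronger model."""
    lowered = prompt.lower()
    complex_prompt = (
        len(prompt.split()) > 40
        or any(hint in lowered for hint in REASONING_HINTS)
    )
    return "powerful-model" if complex_prompt else "fast-cheap-model"

print(pick_model("What is JavaScript?"))                              # fast-cheap-model
print(pick_model("Prove that sqrt(2) is irrational, step by step."))  # powerful-model
```

The real router weighs far more signals, but the shape of the decision is the same: cheap by default, escalate on complexity.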
Python SDK
from routeplex import RoutePlex
client = RoutePlex(api_key="YOUR_API_KEY")
# Simple prompt → fast/cheap model
response = client.chat("What is JavaScript?")
print(f"{response.output} (model: {response.model_used})")
# Complex prompt → powerful model
response = client.chat("Outline the main approaches to proving the Riemann hypothesis")
print(f"{response.output} (model: {response.model_used})")
Node.js SDK
import { RoutePlex } from "@routeplex/node";
const client = new RoutePlex({ apiKey: "YOUR_API_KEY" });
const response = await client.chat("What is JavaScript?");
console.log(response.output);
console.log(`Model: ${response.modelUsed}, Cost: $${response.usage.costUsd.toFixed(6)}`);
Strategy Routing
Override auto-routing with a specific priority when you know what you want.
Cost Strategy — Minimize Expenses
Perfect for high-volume, simple tasks like data extraction, basic Q&A, or content moderation.
Python SDK
# Process thousands of simple queries cost-effectively
response = client.chat(
    "Is this content appropriate? The weather is nice today.",
    strategy="cost",
    max_output_tokens=50,
)
print(response.output)
Speed Strategy — Minimize Latency
Ideal for real-time applications like chatbots, live support, or interactive tools.
Node.js SDK
const response = await client.chat("Quick summary of quantum computing", {
  strategy: "speed",
  maxOutputTokens: 256,
});
console.log(response.output);
Quality Strategy — Best Output
Use for complex reasoning, code generation, detailed analysis, or creative writing.
Python SDK
response = client.chat(
    "Implement a red-black tree in Python with insert, delete, and search operations.",
    strategy="quality",
    max_output_tokens=2048,
    temperature=0.3,
)
print(response.output)
Balanced Strategy — Fixed Weights
Use balanced when you want a fixed trade-off between cost, speed, and quality instead of prompt-based auto-routing.
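A fixed-weight trade-off can be pictured as a weighted score over per-model cost, speed, and quality ratings. The weights and model stats below are invented for illustration; RoutePlex's internal scoring is not public:

```python
# Hypothetical fixed-weight scoring sketch — all numbers are made up.
WEIGHTS = {"cost": 1 / 3, "speed": 1 / 3, "quality": 1 / 3}  # "balanced"

# Normalized 0-1 scores per candidate model (higher is better on every axis).
CANDIDATES = {
    "cheap-model": {"cost": 0.9, "speed": 0.8, "quality": 0.4},
    "strong-model": {"cost": 0.2, "speed": 0.4, "quality": 0.95},
}

def best_model(weights):
    """Pick the candidate with the highest weighted score."""
    return max(
        CANDIDATES,
        key=lambda m: sum(weights[k] * CANDIDATES[m][k] for k in weights),
    )

print(best_model(WEIGHTS))                                      # cheap-model
print(best_model({"cost": 0.1, "speed": 0.1, "quality": 0.8}))  # strong-model
```

Under equal weights the cheap model's cost and speed advantages win; shift the weight toward quality and the stronger model takes over.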
Python SDK
response = client.chat("General task", strategy="balanced")
Manual Mode — Specific Model
Choose a specific model. RoutePlex automatically handles fallbacks if the primary model fails.
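The fallback chain is handled server-side, so client code never sees it; purely as an illustration of the idea, here is a sketch with a stubbed provider call and an invented fallback order:

```python
# Illustrative fallback sketch — the provider call is a stub and the
# fallback order is invented, not RoutePlex's real chain.
def call_model(model: str, prompt: str) -> str:
    if model == "gpt-4o-mini":  # pretend the primary provider is down
        raise RuntimeError("provider unavailable")
    return f"{model}: answer to {prompt!r}"

def chat_with_fallback(prompt, models=("gpt-4o-mini", "fallback-model")):
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except RuntimeError as err:
            last_error = err  # remember the failure, try the next model
    raise last_error

print(chat_with_fallback("Explain recursion"))  # served by fallback-model
```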
Python SDK
response = client.chat(
    "Explain recursion",
    model="gpt-4o-mini",
    max_output_tokens=1024,
    temperature=0.5,
)
print(f"Used model: {response.model_used}")
print(f"Cost: ${response.usage.cost_usd:.6f}")
Node.js SDK
const response = await client.chat("Explain recursion", {
  model: "gpt-4o-mini",
  maxOutputTokens: 1024,
  temperature: 0.5,
});
console.log(`Used model: ${response.modelUsed}`);
Streaming
Stream responses in real time as tokens are generated. Two modes available: buffered (default, smooth sentence-aware chunks) and realtime (minimal-latency character delivery).
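To build intuition for the difference between the two modes, here is a toy sketch of sentence-aware buffering (the SDK's real chunking logic is internal and may differ): raw deltas accumulate until a sentence boundary, then flush as one smooth chunk.

```python
# Toy sketch of the "buffered" streaming idea — not the SDK's actual code.
def buffered_chunks(token_iter):
    """Accumulate raw deltas and emit them at sentence boundaries."""
    buffer = ""
    for token in token_iter:
        buffer += token
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer
            buffer = ""
    if buffer:
        yield buffer  # flush whatever remains at end of stream

tokens = ["Stream", "ing ", "is fun", ". ", "Real", "ly!"]
print(list(buffered_chunks(tokens)))  # ['Streaming is fun. ', 'Really!']
```

Realtime mode skips this buffering entirely and forwards each delta as it arrives.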
Python SDK
for event in client.chat_stream("Explain how streaming works"):
    if event.type == "delta":
        print(event.content, end="", flush=True)
    elif event.type == "done":
        print(f"\nModel: {event.model_used}, Tokens: {event.usage['total_tokens']}")
Node.js SDK
for await (const event of client.chatStream("Explain how streaming works")) {
  if (event.type === "delta") process.stdout.write(event.content);
  if (event.type === "done") console.log(`\nModel: ${event.modelUsed}`);
}
Realtime mode (minimal buffering):
for event in client.chat_stream("Hello!", stream_mode="realtime"):
    if event.type == "delta":
        print(event.content, end="", flush=True)
OpenAI SDK streaming:
from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.routeplex.com/v1")
stream = client.chat.completions.create(
    model="routeplex-ai",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Multi-Turn Conversations
Python SDK
response = client.chat([
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "What is recursion?"},
    {"role": "assistant", "content": "Recursion is when a function calls itself..."},
    {"role": "user", "content": "Can you give me a Python example?"},
])
print(response.output)
Node.js SDK
const response = await client.chat([
  { role: "system", content: "You are a helpful tutor." },
  { role: "user", content: "What is recursion?" },
  { role: "assistant", content: "Recursion is when a function calls itself..." },
  { role: "user", content: "Can you give me a JS example?" },
]);
console.log(response.output);
Free Endpoints
These endpoints are free and don't require an API key.
Cost Estimation
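An estimate is essentially token counts multiplied by per-token prices. A back-of-envelope sketch with invented prices (RoutePlex's real pricing tables are not reproduced here):

```python
# Back-of-envelope cost arithmetic — these per-token prices are invented.
PRICE_PER_1M = {"input": 0.15, "output": 0.60}  # USD per 1M tokens

def estimate_cost_usd(input_tokens: int, expected_output_tokens: int) -> float:
    """Estimated request cost: tokens times price, scaled from per-million."""
    return (
        input_tokens * PRICE_PER_1M["input"]
        + expected_output_tokens * PRICE_PER_1M["output"]
    ) / 1_000_000

print(f"${estimate_cost_usd(200, 800):.6f}")  # $0.000510
```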
estimate = client.estimate("Write a blog post about AI")
print(f"Model: {estimate.model}")
print(f"Estimated cost: ${estimate.estimated_cost_usd:.6f}")
print(f"Confidence: {estimate.confidence}")
const estimate = await client.estimate("Write a blog post about AI");
console.log(`Model: ${estimate.model}, Cost: $${estimate.estimatedCostUsd.toFixed(6)}`);
Prompt Enhancement
result = client.enhance("tell me about kubernetes")
if result.changed:
    print(f"Enhanced: {result.enhanced_prompt}")
    print(f"Type: {result.query_type}")
const result = await client.enhance("tell me about kubernetes");
if (result.changed) {
  console.log(`Enhanced: ${result.enhancedPrompt}`);
  console.log(`Type: ${result.queryType}`);
}
List Models
models = client.list_models()
for m in models:
    print(f"{m.id} ({m.provider}) — {m.tier}")
const models = await client.listModels();
models.forEach((m) => console.log(`${m.id} (${m.provider}) — ${m.tier}`));
Error Handling
Python SDK
from routeplex import RoutePlex, AuthenticationError, RateLimitError
client = RoutePlex(api_key="rp_live_YOUR_KEY")
try:
    response = client.chat("Hello!")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
Node.js SDK
import { RoutePlex, AuthenticationError, RateLimitError } from "@routeplex/node";
try {
  const response = await client.chat("Hello!");
} catch (err) {
  if (err instanceof AuthenticationError) {
    console.log("Invalid API key");
  } else if (err instanceof RateLimitError) {
    console.log(`Rate limited: ${err.message}`);
  }
}
Test Mode
Use test_mode during development and CI to prevent auto-routing from selecting premium models.
Python SDK
# Safe for CI pipelines — will never route to premium models
response = client.chat("Write a unit test for this function.", test_mode=True)
Node.js SDK
const response = await client.chat("Write a unit test", { testMode: true });
OpenAI SDK Compatible
Use the OpenAI SDK with RoutePlex as a drop-in replacement.
Python
from openai import OpenAI
client = OpenAI(
    api_key="YOUR_ROUTEPLEX_API_KEY",
    base_url="https://api.routeplex.com/v1",
)
response = client.chat.completions.create(
    model="routeplex-ai",  # Auto-routing
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Node.js
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "YOUR_ROUTEPLEX_API_KEY",
  baseURL: "https://api.routeplex.com/v1",
});
const response = await client.chat.completions.create({
  model: "routeplex-ai",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
Prompt Enhancement
RoutePlex can automatically rewrite your prompt before it reaches the model. Add enhance_prompt=True to any request — it's stateless, free, and adds no latency overhead.
Python SDK
response = client.chat("fix my code", enhance_prompt=True)
print(response.output)
Standalone Enhance Endpoint (No API Key)
curl -X POST https://api.routeplex.com/api/v1/chat/enhance \
  -H "Content-Type: application/json" \
  -d '{"prompt": "compare react and vue"}'
Self-Learning: Submitting Feedback
Rate individual responses to accelerate routing personalization. User ratings are blended with the automatic quality signals and immediately update routing preferences for that query type — user feedback carries extra weight, so a few ratings go a long way.
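The exact blend is internal to RoutePlex; as a hypothetical illustration of "user feedback carries extra weight", a weighted average with the user side above 0.5 behaves like this:

```python
# Hypothetical blending illustration — the real weighting is not public.
def blended_score(auto_score: float, user_score: float,
                  user_weight: float = 0.7) -> float:
    """Blend an automatic quality signal with a user rating.

    user_weight > 0.5 means user feedback dominates the blend.
    """
    return user_weight * user_score + (1 - user_weight) * auto_score

# A single 5/5 rating pulls a mediocre automatic score up sharply.
print(blended_score(auto_score=3.0, user_score=5.0))  # 4.4
```

This is why a few ratings go a long way: each one moves the blended score much further than another automatic signal would.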
Rate a Response (curl)
curl -X POST https://api.routeplex.com/api/v1/insights/feedback \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "request_id": "req_7f8a9b2c3d4e5f6g",
    "score": 5,
    "is_helpful": true
  }'
Built-in Web Search & URL Fetching
RoutePlex automatically detects when your query needs real-time data and fetches it before sending to the LLM. No extra setup required — just ask naturally.
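RoutePlex's detector runs server-side and is not public; a naive sketch of the idea (the keyword list is entirely made up) is "does the prompt contain a URL or a freshness cue?":

```python
# Naive illustration of live-data detection — keywords are invented,
# not RoutePlex's actual trigger list.
import re

FRESHNESS_HINTS = ("current", "today", "latest", "price of", "news")

def needs_live_data(prompt: str) -> bool:
    """True if the prompt references a URL or asks about fresh data."""
    lowered = prompt.lower()
    has_url = re.search(r"https?://\S+", prompt) is not None
    return has_url or any(hint in lowered for hint in FRESHNESS_HINTS)

print(needs_live_data("What is the current price of Bitcoin today?"))  # True
print(needs_live_data("What is JavaScript?"))                          # False
```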
Python SDK
# Ask about current events — web search happens automatically
response = client.chat("What is the current price of Bitcoin today?")
print(response.output)
Node.js SDK
// Summarize a webpage — content is fetched automatically
const response = await client.chat(
"Summarize this article: https://example.com/blog/ai-trends-2026"
);
console.log(response.output);
Note: Web search and URL fetching costs are included in the standard per-request billing. There are no additional charges: the total cost_usd in the response reflects everything, including search costs.