Real-world use cases and comprehensive code examples for common scenarios.
Official SDKs
RoutePlex provides official SDKs for Python and Node.js. Both are zero-dependency and support all API features.
pip install routeplex # Python
npm install @routeplex/node # Node.js
Auto-Routing (Default)
The default behavior — RoutePlex analyzes your prompt to pick the optimal model. A simple question gets a fast, cheap model. A complex reasoning task gets a capable one.
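As a rough mental model only (RoutePlex's actual classifier runs server-side and is not public), you can think of auto-routing as a complexity score over the prompt. Every keyword and model name below is a placeholder:

```python
# Illustrative toy heuristic — NOT RoutePlex's real routing logic.
REASONING_HINTS = ("prove", "derive", "implement", "analyze", "step by step")

def pick_model(prompt: str) -> str:
    """Route long or reasoning-heavy prompts to a stronger model."""
    lowered = prompt.lower()
    complex_prompt = (
        len(prompt.split()) > 40
        or any(hint in lowered for hint in REASONING_HINTS)
    )
    return "powerful-model" if complex_prompt else "fast-cheap-model"

print(pick_model("What is JavaScript?"))                              # fast-cheap-model
print(pick_model("Prove that sqrt(2) is irrational, step by step."))  # powerful-model
```

The real router weighs far more signals, but the shape of the decision is the same: cheap by default, escalate on complexity.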
Python SDK
from routeplex import RoutePlex
client = RoutePlex(api_key="YOUR_API_KEY")
# Simple prompt → fast/cheap model
response = client.chat("What is JavaScript?")
print(f"{response.output} (model: {response.model_used})")
# Complex prompt → powerful model
response = client.chat("Outline the main approaches to proving the Riemann hypothesis")
print(f"{response.output} (model: {response.model_used})")
Node.js SDK
import { RoutePlex } from "@routeplex/node";
const client = new RoutePlex({ apiKey: "YOUR_API_KEY" });
const response = await client.chat("What is JavaScript?");
console.log(response.output);
console.log(`Model: ${response.modelUsed}, Cost: $${response.usage.costUsd.toFixed(6)}`);
Strategy Routing
Override auto-routing with a specific priority when you know what you want.
Cost Strategy — Minimize Expenses
Perfect for high-volume, simple tasks like data extraction, basic Q&A, or content moderation.
Python SDK
# Process thousands of simple queries cost-effectively
response = client.chat(
    "Is this content appropriate? The weather is nice today.",
    strategy="cost",
    max_output_tokens=50,
)
print(response.output)
Speed Strategy — Minimize Latency
Ideal for real-time applications like chatbots, live support, or interactive tools.
Node.js SDK
const response = await client.chat("Quick summary of quantum computing", {
  strategy: "speed",
  maxOutputTokens: 256,
});
console.log(response.output);
Quality Strategy — Best Output
Use for complex reasoning, code generation, detailed analysis, or creative writing.
Python SDK
response = client.chat(
    "Implement a red-black tree in Python with insert, delete, and search operations.",
    strategy="quality",
    max_output_tokens=2048,
    temperature=0.3,
)
print(response.output)
Balanced Strategy — Fixed Weights
Use balanced when you want a fixed trade-off between cost, speed, and quality instead of prompt-based auto-routing.
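A fixed-weight trade-off can be pictured as a weighted score over per-model cost, speed, and quality ratings. The weights and model stats below are invented for illustration; RoutePlex's internal scoring is not public:

```python
# Hypothetical fixed-weight scoring sketch — all numbers are made up.
WEIGHTS = {"cost": 1 / 3, "speed": 1 / 3, "quality": 1 / 3}  # "balanced"

# Normalized 0-1 scores per candidate model (higher is better on every axis).
CANDIDATES = {
    "cheap-model": {"cost": 0.9, "speed": 0.8, "quality": 0.4},
    "strong-model": {"cost": 0.2, "speed": 0.4, "quality": 0.95},
}

def best_model(weights):
    """Pick the candidate with the highest weighted score."""
    return max(
        CANDIDATES,
        key=lambda m: sum(weights[k] * CANDIDATES[m][k] for k in weights),
    )

print(best_model(WEIGHTS))                                      # cheap-model
print(best_model({"cost": 0.1, "speed": 0.1, "quality": 0.8}))  # strong-model
```

Under equal weights the cheap model's cost and speed advantages win; shift the weight toward quality and the stronger model takes over.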
Python SDK
response = client.chat("General task", strategy="balanced")
Manual Mode — Specific Model
Choose a specific model. RoutePlex automatically handles fallbacks if the primary model fails.
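The fallback chain is handled server-side, so client code never sees it; purely as an illustration of the idea, here is a sketch with a stubbed provider call and an invented fallback order:

```python
# Illustrative fallback sketch — the provider call is a stub and the
# fallback order is invented, not RoutePlex's real chain.
def call_model(model: str, prompt: str) -> str:
    if model == "gpt-4o-mini":  # pretend the primary provider is down
        raise RuntimeError("provider unavailable")
    return f"{model}: answer to {prompt!r}"

def chat_with_fallback(prompt, models=("gpt-4o-mini", "fallback-model")):
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except RuntimeError as err:
            last_error = err  # remember the failure, try the next model
    raise last_error

print(chat_with_fallback("Explain recursion"))  # served by fallback-model
```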
Python SDK
response = client.chat(
    "Explain recursion",
    model="gpt-4o-mini",
    max_output_tokens=1024,
    temperature=0.5,
)
print(f"Used model: {response.model_used}")
print(f"Cost: ${response.usage.cost_usd:.6f}")
Node.js SDK
const response = await client.chat("Explain recursion", {
  model: "gpt-4o-mini",
  maxOutputTokens: 1024,
  temperature: 0.5,
});
console.log(`Used model: ${response.modelUsed}`);
Streaming
Stream responses in real time as tokens are generated. Two modes available: buffered (default, smooth sentence-aware chunks) and realtime (minimal-latency character delivery).
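To build intuition for the difference between the two modes, here is a toy sketch of sentence-aware buffering (the SDK's real chunking logic is internal and may differ): raw deltas accumulate until a sentence boundary, then flush as one smooth chunk.

```python
# Toy sketch of the "buffered" streaming idea — not the SDK's actual code.
def buffered_chunks(token_iter):
    """Accumulate raw deltas and emit them at sentence boundaries."""
    buffer = ""
    for token in token_iter:
        buffer += token
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer
            buffer = ""
    if buffer:
        yield buffer  # flush whatever remains at end of stream

tokens = ["Stream", "ing ", "is fun", ". ", "Real", "ly!"]
print(list(buffered_chunks(tokens)))  # ['Streaming is fun. ', 'Really!']
```

Realtime mode skips this buffering entirely and forwards each delta as it arrives.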
Python SDK
for event in client.chat_stream("Explain how streaming works"):
    if event.type == "delta":
        print(event.content, end="", flush=True)
    elif event.type == "done":
        print(f"\nModel: {event.model_used}, Tokens: {event.usage['total_tokens']}")
Node.js SDK
for await (const event of client.chatStream("Explain how streaming works")) {
  if (event.type === "delta") process.stdout.write(event.content);
  if (event.type === "done") console.log(`\nModel: ${event.modelUsed}`);
}
Realtime mode (minimal buffering):
for event in client.chat_stream("Hello!", stream_mode="realtime"):
    if event.type == "delta":
        print(event.content, end="", flush=True)
OpenAI SDK streaming:
from openai import OpenAI
client = OpenAI(api_key="YOUR_KEY", base_url="https://api.routeplex.com/v1")
stream = client.chat.completions.create(
    model="routeplex-ai",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Multi-Turn Conversations
Python SDK
response = client.chat([
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "What is recursion?"},
    {"role": "assistant", "content": "Recursion is when a function calls itself..."},
    {"role": "user", "content": "Can you give me a Python example?"},
])
print(response.output)
Node.js SDK
const response = await client.chat([
  { role: "system", content: "You are a helpful tutor." },
  { role: "user", content: "What is recursion?" },
  { role: "assistant", content: "Recursion is when a function calls itself..." },
  { role: "user", content: "Can you give me a JS example?" },
]);
console.log(response.output);
Free Endpoints
These endpoints are free and don't require an API key.
Cost Estimation
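An estimate is essentially token counts multiplied by per-token prices. A back-of-envelope sketch with invented prices (RoutePlex's real pricing tables are not reproduced here):

```python
# Back-of-envelope cost arithmetic — these per-token prices are invented.
PRICE_PER_1M = {"input": 0.15, "output": 0.60}  # USD per 1M tokens

def estimate_cost_usd(input_tokens: int, expected_output_tokens: int) -> float:
    """Estimated request cost: tokens times price, scaled from per-million."""
    return (
        input_tokens * PRICE_PER_1M["input"]
        + expected_output_tokens * PRICE_PER_1M["output"]
    ) / 1_000_000

print(f"${estimate_cost_usd(200, 800):.6f}")  # $0.000510
```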
estimate = client.estimate("Write a blog post about AI")
print(f"Model: {estimate.model}")
print(f"Estimated cost: ${estimate.estimated_cost_usd:.6f}")
print(f"Confidence: {estimate.confidence}")
const estimate = await client.estimate("Write a blog post about AI");
console.log(`Model: ${estimate.model}, Cost: $${estimate.estimatedCostUsd.toFixed(6)}`);
Prompt Enhancement
result = client.enhance("tell me about kubernetes")
if result.changed:
    print(f"Enhanced: {result.enhanced_prompt}")
    print(f"Type: {result.query_type}")
const result = await client.enhance("tell me about kubernetes");
if (result.changed) {
  console.log(`Enhanced: ${result.enhancedPrompt}`);
  console.log(`Type: ${result.queryType}`);
}
List Models
models = client.list_models()
for m in models:
    print(f"{m.id} ({m.provider}) — {m.tier}")
const models = await client.listModels();
models.forEach((m) => console.log(`${m.id} (${m.provider}) — ${m.tier}`));
Error Handling
Python SDK
from routeplex import RoutePlex, AuthenticationError, RateLimitError
client = RoutePlex(api_key="rp_live_YOUR_KEY")
try:
    response = client.chat("Hello!")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
Node.js SDK
import { RoutePlex, AuthenticationError, RateLimitError } from "@routeplex/node";
try {
  const response = await client.chat("Hello!");
} catch (err) {
  if (err instanceof AuthenticationError) {
    console.log("Invalid API key");
  } else if (err instanceof RateLimitError) {
    console.log(`Rate limited: ${err.message}`);
  }
}
Test Mode
Use test_mode during development and CI to prevent auto-routing from selecting premium models.
Python SDK
# Safe for CI pipelines — will never route to premium models
response = client.chat("Write a unit test for this function.", test_mode=True)
Node.js SDK
const response = await client.chat("Write a unit test", { testMode: true });
OpenAI SDK Compatible
Use the OpenAI SDK with RoutePlex as a drop-in replacement.
Python
from openai import OpenAI
client = OpenAI(
    api_key="YOUR_ROUTEPLEX_API_KEY",
    base_url="https://api.routeplex.com/v1",
)
response = client.chat.completions.create(
    model="routeplex-ai",  # Auto-routing
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Node.js
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "YOUR_ROUTEPLEX_API_KEY",
  baseURL: "https://api.routeplex.com/v1",
});
const response = await client.chat.completions.create({
  model: "routeplex-ai",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
Prompt Enhancement
RoutePlex can automatically rewrite your prompt before it reaches the model. Add enhance_prompt=True to any request — it's stateless, free, and adds no latency overhead.
Python SDK
response = client.chat("fix my code", enhance_prompt=True)
print(response.output)
Standalone Enhance Endpoint (No API Key)
curl -X POST https://api.routeplex.com/api/v1/chat/enhance \
  -H "Content-Type: application/json" \
  -d '{"prompt": "compare react and vue"}'
Self-Learning: Submitting Feedback
Rate individual responses to accelerate routing personalization. User ratings are blended with the automatic quality signals and immediately update routing preferences for that query type — user feedback carries extra weight, so a few ratings go a long way.
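The exact blend is internal to RoutePlex; as a hypothetical illustration of "user feedback carries extra weight", a weighted average with the user side above 0.5 behaves like this:

```python
# Hypothetical blending illustration — the real weighting is not public.
def blended_score(auto_score: float, user_score: float,
                  user_weight: float = 0.7) -> float:
    """Blend an automatic quality signal with a user rating.

    user_weight > 0.5 means user feedback dominates the blend.
    """
    return user_weight * user_score + (1 - user_weight) * auto_score

# A single 5/5 rating pulls a mediocre automatic score up sharply.
print(blended_score(auto_score=3.0, user_score=5.0))  # 4.4
```

This is why a few ratings go a long way: each one moves the blended score much further than another automatic signal would.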
Rate a Response (curl)
curl -X POST https://api.routeplex.com/api/v1/insights/feedback \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "request_id": "req_7f8a9b2c3d4e5f6g",
    "score": 5,
    "is_helpful": true
  }'
Built-in Web Search & URL Fetching
RoutePlex automatically detects when your query needs real-time data and fetches it before sending to the LLM. No extra setup required — just ask naturally.
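RoutePlex's detector runs server-side and is not public; a naive sketch of the idea (the keyword list is entirely made up) is "does the prompt contain a URL or a freshness cue?":

```python
# Naive illustration of live-data detection — keywords are invented,
# not RoutePlex's actual trigger list.
import re

FRESHNESS_HINTS = ("current", "today", "latest", "price of", "news")

def needs_live_data(prompt: str) -> bool:
    """True if the prompt references a URL or asks about fresh data."""
    lowered = prompt.lower()
    has_url = re.search(r"https?://\S+", prompt) is not None
    return has_url or any(hint in lowered for hint in FRESHNESS_HINTS)

print(needs_live_data("What is the current price of Bitcoin today?"))  # True
print(needs_live_data("What is JavaScript?"))                          # False
```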
Python SDK
# Ask about current events — web search happens automatically
response = client.chat("What is the current price of Bitcoin today?")
print(response.output)
Node.js SDK
// Summarize a webpage — content is fetched automatically
const response = await client.chat(
"Summarize this article: https://example.com/blog/ai-trends-2026"
);
console.log(response.output);
Note: Web search and URL fetching costs are included in the standard per-request billing. There are no additional charges: the total cost_usd in the response reflects everything, including search costs.