
FAQ

Frequently asked questions about RoutePlex.

Do you have official SDKs?

Yes! We have official SDKs for Python and Node.js — both are zero-dependency and support all API features.

```bash
pip install routeplex          # Python 3.8+
npm install @routeplex/node    # Node.js 18+
```

```python
from routeplex import RoutePlex

client = RoutePlex(api_key="rp_live_YOUR_KEY")
response = client.chat("Explain quantum computing")
print(response.output)
```

You can also use the OpenAI SDK by pointing it to our base URL, but the RoutePlex SDKs give you typed responses, error classes, and access to features like prompt enhancement, cost estimation, and test mode.

How does auto-routing work?

When you omit both model and strategy, RoutePlex analyzes your prompt — detecting intent, complexity, and domain — to automatically pick the best model. A simple question gets a fast, cheap model. A complex reasoning task gets a powerful one. No configuration needed.

You can optionally override with a strategy (cost, speed, quality, or balanced) to force a fixed priority instead of prompt-based selection.

What's the difference between auto-routing and strategy routing?

  • Auto-routing (default): RoutePlex reads your prompt and decides the best model — optimizing for the right balance of cost, quality, and speed based on what you're asking.
  • Strategy routing: You tell RoutePlex what to prioritize. strategy="cost" always picks the cheapest model, strategy="quality" always picks the most capable, etc.

Both are part of mode: "routeplex-ai". You can also use mode: "manual" to pick a specific model yourself.
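As a sketch, the difference shows up in the request body. The field names below follow this FAQ's terminology, but treat them as an assumption rather than the canonical schema:

```python
# Illustrative chat request bodies (field names inferred from this FAQ;
# check the API reference for the exact schema).
auto_routed = {
    "input": "Plan a 3-day trip to Kyoto",  # no model, no strategy: auto-routing decides
}
strategy_routed = {
    "input": "Plan a 3-day trip to Kyoto",
    "strategy": "cost",                     # fixed priority: always the cheapest model
}
```

Omitting both fields, as in the first body, is all it takes to get prompt-based selection.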

What is prompt enhancement?

Prompt enhancement automatically rewrites your prompt before sending it to the model — adding structure, specificity, and context for better results. It's stateless (nothing stored), free, and adds no latency overhead. Enable it with enhance_prompt=True in your request.
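A minimal sketch with the Python SDK, assuming chat accepts enhance_prompt as a keyword argument (mirroring the test_mode example elsewhere in this FAQ):

```python
from routeplex import RoutePlex  # illustrative usage; requires a valid API key

client = RoutePlex(api_key="rp_live_YOUR_KEY")

# enhance_prompt=True rewrites the prompt server-side before model dispatch
response = client.chat("write blog post about rust", enhance_prompt=True)
print(response.output)
```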

What is test mode?

Test mode forces auto-routing to use only default-tier models, regardless of your plan or premium settings. Use it during development and in CI pipelines for predictable costs — you'll never accidentally route to a premium model.

```python
response = client.chat("Write a test", test_mode=True)
```

test_mode only affects auto-routing. In manual mode you pick the model explicitly, so it has no effect.

Do you support streaming?

Yes. Pass stream=true and responses are delivered as Server-Sent Events in real time. Two modes are available:

  • Buffered (default) — smooth ~100ms paced chunks for polished UX
  • Realtime — minimal ~10ms buffering for lowest-latency delivery

```python
# Python SDK
for event in client.chat_stream("Count to 5", stream_mode="realtime"):
    if event.type == "delta":
        print(event.content, end="", flush=True)
```

```javascript
// Node.js SDK
const stream = await client.chatStream("Count to 5", { streamMode: "realtime" });
for await (const event of stream) {
  if (event.type === "delta") process.stdout.write(event.content);
}
```

Streaming works with both the native API and the OpenAI SDK-compatible endpoint.

What happens if a model fails?

RoutePlex automatically retries with fallback models. In auto-routing mode, fallbacks are selected based on your strategy. Having multiple providers behind one endpoint significantly improves reliability compared to depending on a single provider.

Can I see which model was used?

Yes. Every response includes model_used with the actual model that served your request (e.g. gpt-4.1-nano, claude-sonnet-4-20250514), along with the provider name.
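For illustration, inspecting those fields might look like this. The JSON shape is an assumption based on the field names this FAQ mentions:

```python
import json

# Hypothetical response body; field names follow this FAQ's description
raw = '{"output": "Hello!", "model_used": "gpt-4.1-nano", "provider": "openai"}'
response = json.loads(raw)

print(f"served by {response['model_used']} via {response['provider']}")
```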

How is pricing calculated?

You pay per token at each model's standard rate. Every response includes a cost breakdown so there are no surprises. Costs are tracked with micro-cent precision. You can see detailed usage and set spending caps in your dashboard.
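As a back-of-the-envelope sketch of per-token billing (the rates below are made up for illustration; each model's real rate applies):

```python
from decimal import Decimal

def token_cost(prompt_tokens: int, completion_tokens: int,
               in_rate: Decimal, out_rate: Decimal) -> Decimal:
    """Cost in dollars; Decimal keeps micro-cent precision exact."""
    return Decimal(prompt_tokens) * in_rate + Decimal(completion_tokens) * out_rate

# e.g. $0.15 per 1M input tokens, $0.60 per 1M output tokens (illustrative only)
cost = token_cost(1_000, 500, Decimal("0.00000015"), Decimal("0.00000060"))
print(cost)
```

Using Decimal rather than floats is what makes micro-cent amounts add up without rounding drift.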

What is the evaluation plan?

New accounts start on a free evaluation plan with conservative daily limits. This lets you try the API with real models at no cost. Upgrade to a usage plan when you're ready.

Why are the default limits so low?

Default limits are intentionally conservative to prevent abusive usage and protect the platform. If you need higher limits, contact support@routeplex.com and we'll increase them — we typically respond within 24 hours.

Does RoutePlex support web search?

Yes! RoutePlex automatically detects when your query needs real-time data (current events, live prices, recent news) and performs a web search before sending to the LLM. Just ask naturally — no extra parameters needed.

Can RoutePlex read URLs?

Yes. Include any URL in your message and RoutePlex will automatically fetch its content and provide it to the LLM for analysis, summarization, or Q&A.

Does RoutePlex store my prompts or responses?

No. RoutePlex is a fully stateless gateway — we do not store, log, or retain the content of your prompts, model responses, web search queries, or fetched URLs. Everything is processed in-memory and discarded immediately. See our Privacy Policy for full details.

What content is blocked by moderation?

RoutePlex blocks truly dangerous content: CSAM, weapons/explosives instructions, drug manufacturing guides, and terrorism. We intentionally do not block profanity, political discussion, adult content (non-illegal), educational content, or creative fiction. See our Acceptable Use Policy for full details.

Is the cost estimate endpoint free?

Yes. The /api/v1/chat/estimate endpoint is completely free and does not require authentication. Both SDKs expose it via client.estimate().
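A sketch of the SDK call (constructing the client without a key here is an assumption, based on the endpoint requiring no authentication):

```python
from routeplex import RoutePlex  # illustrative usage

client = RoutePlex()  # no API key needed for estimates
estimate = client.estimate("Explain quantum computing")
print(estimate)
```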

Is the enhance endpoint free?

Yes. The /api/v1/chat/enhance endpoint is completely free and does not require authentication. Both SDKs expose it via client.enhance().

Can I use premium models?

Premium models (like GPT-4o, Claude Sonnet) require a usage plan. You can enable them in your account settings and whitelist specific models you want to allow.

Is the API OpenAI-compatible?

Yes. You can use the OpenAI SDK by pointing it to our base URL. However, we recommend the official RoutePlex SDKs for the best experience — they support all features and provide typed error handling.
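With the official OpenAI Python SDK, that looks roughly like this. The base URL and model alias below are placeholders, not confirmed values; use the ones from your dashboard:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routeplex.example/v1",  # placeholder base URL
    api_key="rp_live_YOUR_KEY",                   # your RoutePlex key
)
completion = client.chat.completions.create(
    model="routeplex-auto",                       # hypothetical model alias
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```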

Will the underlying model change in auto-routing mode?

RoutePlex may update, replace, or rotate the models behind auto-routing at any time to improve quality, cost, or availability. If you need a specific model, use mode: "manual" with an explicit model ID.
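Sketched as a request body, with mode and model fields as this FAQ names them (the exact schema may vary):

```python
# Pin a specific model instead of auto-routing
# (model ID taken from this FAQ's examples)
manual_request = {
    "input": "Explain quantum computing",
    "mode": "manual",
    "model": "claude-sonnet-4-20250514",
}
```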

How is billing handled?

Billing is account-level. All API requests are charged to the same account. Usage is tracked per-request and billed monthly through Stripe. You can set daily spending limits in the dashboard to prevent unexpected costs.
