Changelog

Product updates, new models, and platform changes

v1.3.0LatestMarch 16, 2026

Security Hardening & Auto Routing

Security Hardening & Auto Routing

Platform-wide security improvements and a new deep-dive on auto routing.

Security

  • Registration protection — Automated sign-up and bot abuse prevention.
  • Login hardening — Account lockout after repeated failed attempts with unified error responses.
  • Session management — Active session limits with automatic cleanup.
  • Rate limit improvements — Stricter limits on authentication endpoints with abuse prevention on the frontend.
  • Database optimization — Improved query performance for high-volume accounts.
  • Webhook reliability — Duplicate event detection for payment processing.

Auto Routing

  • 3-stage pipeline — Every routeplex-ai request flows through Analyze → Score → Route with minimal overhead.
  • 4 routing strategiesbalanced (default), quality, cost, and speed — pass your preference via extra_body or the X-RoutePlex-Strategy header.
  • Automatic fallback — If the selected model fails, the router transparently retries with the next-best option.
  • Works with Self-Learning — Auto routing gets smarter over time when paired with Self-Learning Routing.
  • New blog postAuto Routing: One Model String, Every Provider — deep dive into the pipeline, strategies, and code examples.
v1.2.0March 1, 2026

Prompt Enhancement & Self-Learning Routing

Prompt Enhancement & Self-Learning Routing

Two intelligence features that make every request better — automatically.

Prompt Enhancement (Phase 4.1)

  • Automatic prompt rewriting — RoutePlex detects your query type and rewrites your prompt before it reaches the model. No extra API calls, no stored data, no configuration required.
  • 17 query sub-types — Specialized rewriting for code, debugging, analysis, writing, math, creative, planning, comparison, and more. Each type gets targeted improvements (language context, step-by-step framing, structure hints, etc.).
  • Complexity gating — Detailed prompts (30+ words with multiple specificity signals) are sent unchanged. Enhancement only fires when it would meaningfully help.
  • 3 ways to enable — Per request via enhance_prompt: true, via the OpenAI SDK with the X-RoutePlex-Enhance: true header, or via the standalone /api/v1/chat/enhance endpoint (no API key required, free).
  • Stateless & private — Your prompt is processed in memory and immediately discarded. Nothing is stored. See the Prompt Enhancement docs.

Self-Learning Routing (Phase 4.4)

  • Per-account performance profiles — After each successful request, RoutePlex records lightweight metadata (model used, query type, response quality signals, latency, cost). Over time this builds a personalized routing profile for your account.
  • Learned routing bias — The router applies a per-model bias (±15 points) based on historical performance for each query type. Models that consistently produce better results for your workload get priority. No configuration needed.
  • Confidence gating — Learning bias only influences routing when there's enough data to trust it: 20% weight at 10–50 requests, 50% at 50–100, 80% beyond 100 requests per model. Below the threshold, global aggregated patterns apply.
  • Explicit feedback — Rate individual responses (1–5 stars) via the Insights dashboard or the POST /api/v1/insights/feedback API. Ratings are blended 60/40 with automatic quality signals.
  • Insights dashboard — New Insights tab in the dashboard shows per-model performance by query type, prompt enhancement effectiveness, routing influence stats, and cost optimization recommendations.
  • Full data control — All learning data belongs to you. Delete it at any time from dashboard settings or via DELETE /api/v1/insights/data. Routing immediately reverts to global defaults. See the Self-Learning docs.

Privacy

Both features are built with privacy as a constraint, not an afterthought:

  • Prompts and responses are never stored
  • Quality is inferred from response metadata (length, structure, token ratio) — not content
  • Message pattern detection uses one-way SHA-256 hashing
  • Learning data deletion is immediate and permanent
v1.0.0February 10, 2026

Launch: RoutePlex Public Beta

RoutePlex Public Beta

We are excited to launch RoutePlex into public beta! Here is everything included in this release:

Core Platform

  • Unified API Gateway — Single endpoint for 23 AI models across OpenAI, Anthropic, and Google Gemini
  • Intelligent Routing — Automatic model selection with routeplex-ai mode and 4 strategies (balanced, quality, cost, speed)
  • OpenAI SDK Compatibility — Use the official OpenAI SDK with any model — just change the base_url
  • Automatic Failover — Multi-provider retry logic for 99.9%+ effective uptime
  • Content Moderation — Built-in 3-layer safety pipeline (pattern detection → AI classification → URL blocklist) on all requests
  • Web Search — Automatic web search, auto-detected from prompt content — no extra parameters needed
  • URL Fetching — Automatic URL content extraction when URLs are detected in prompts
  • Stateless Architecture — Zero prompts stored, zero responses logged. Complete data privacy by design

Models

  • OpenAI — GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, GPT-4o, GPT-4o Mini, o3, o3 Mini, o4 Mini
  • Anthropic — Claude Opus 4.6, Claude Opus 4, Claude Sonnet 4.5, Claude Sonnet 4, Claude Haiku 4.5, Claude 3 Haiku
  • Google — Gemini 3 Pro (Preview), Gemini 3 Flash (Preview), Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite
  • RoutePlexrouteplex-ai intelligent routing with strategy selection

Billing & Usage

  • Micro-cent Billing — Usage tracked to the micro-cent level with real-time cost display
  • Evaluation Plan — Free tier with daily limits for testing and prototyping
  • Pay-as-you-go — Upgrade instantly with Stripe-powered billing
  • Cost Estimation — Free /chat/estimate endpoint for pre-request cost checks
  • Budget Controls — Daily token caps, cost limits, and monthly soft caps
  • Grace Period — 1-hour grace period when payment methods are removed
  • Invoice History — Detailed billing history with downloadable invoices
  • 30-day Billing Cycles — Automatic billing with usage-based invoicing

Dashboard

  • Real-time Analytics — Live request, token, and cost monitoring with hourly/daily/monthly views
  • API Key Management — Create, rotate, and revoke keys with per-key usage tracking
  • Model Settings — Enable/disable models and control premium model access
  • Model Health — Real-time health status, latency, and deprecation warnings for all models
  • Billing Management — Manage payment methods, view invoices, and track spending
  • Account Settings — Theme preferences, notification settings, and security controls
  • Team Management — Invite members with role-based access (owner, admin, member)
  • Responsive Design — Full mobile and desktop dashboard support

Developer Experience

  • Interactive Playground — Test all models and routing modes in-browser with cost estimates
  • Comprehensive Docs — Quick start guides, API reference, and integration examples
  • Error Codes — Standardized error responses with actionable messages
  • Rate Limiting — Per-account RPM and concurrency controls with clear headers
v0.9.0February 5, 2026

Model Health, Security & Performance

Model Health, Security & Performance

New Features

  • Model Health Dashboard — Real-time health status, latency percentiles, and deprecation warnings for all models
  • Grace Period Billing — 1-hour grace period when removing payment methods to prevent accidental lockouts
  • Content Moderation Pipeline — 3-layer safety system: pattern detection, AI classification, and URL domain blocklist
  • Web Search Integration — Automatic web search, triggered by prompt content (no extra params needed)
  • URL Content Fetching — Auto-extracts content from URLs detected in prompts

Improvements

  • 40% faster routing — Model selection engine optimized with in-memory caching, reducing routing overhead significantly
  • API key security — All API keys now stored with SHA-256 hashing. Keys are never stored in plaintext
  • Enhanced rate limiting — Per-account rate limiting with configurable RPM (requests per minute) and concurrency limits
  • Better error messages — Structured error responses with provider-specific error mapping and actionable suggestions

Bug Fixes

  • Fixed token counting accuracy for multi-byte Unicode characters
  • Resolved billing period boundary calculation for UTC edge cases
  • Fixed model health status not updating when provider returns 503
v0.8.0January 20, 2026

Alpha Release & Core Infrastructure

Alpha Release & Core Infrastructure

The foundational release of RoutePlex, establishing the core AI gateway infrastructure.

Initial Platform

  • Chat Completions API — OpenAI-compatible endpoint for multi-model routing with full request/response compatibility
  • Direct Mode — Specify exact models with provider prefix (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514)
  • RoutePlex AI Mode — Automatic model selection based on request characteristics and selected strategy
  • Evaluation Plan — Free tier with daily quotas for testing and prototyping
  • Dashboard — Usage monitoring, API key management, and basic analytics

Supported Models at Launch

  • OpenAI — GPT-4o, GPT-4o Mini, o3, o3 Mini
  • Anthropic — Claude Sonnet 4, Claude 3 Haiku
  • Google — Gemini 2.5 Pro, Gemini 2.5 Flash

Note: The model catalog has since been significantly expanded. See the current model list for up-to-date availability and pricing.

Infrastructure

  • Multi-region deployment for low-latency global access
  • Persistent data storage with automatic backups
  • In-memory rate limiting, caching, and session management
  • End-to-end TLS encryption on all API traffic
  • Zero-downtime database migrations