What problem does this solve?
Routerly has no metrics export. Teams running standard observability stacks (Grafana, Datadog, New Relic, Prometheus Alertmanager) have no way to scrape Routerly's operational state — request throughput, latency percentiles, error rates, token consumption, budget headroom — without manually querying the dashboard API and building their own adapters. This makes Routerly difficult to integrate into an existing monitoring setup.
Proposed solution
1. Prometheus-compatible /metrics endpoint
Expose a /metrics endpoint (scrapeable by Prometheus or any compatible agent) with the following gauge/counter/histogram families:
| Metric | Type | Labels |
|---|---|---|
| routerly_requests_total | counter | project, model, provider, status |
| routerly_request_duration_seconds | histogram | project, model, provider |
| routerly_tokens_total | counter | project, model, type (input/output) |
| routerly_cost_usd_total | counter | project, model |
| routerly_budget_used_ratio | gauge | project, limit_id |
| routerly_provider_errors_total | counter | provider, model, error_type |
| routerly_cache_hits_total | counter | project, cache_type |
| routerly_routing_policy_used_total | counter | policy |
The endpoint should support optional bearer token auth (configurable) so it can be kept private in production.
2. OpenTelemetry (OTEL) trace export
For each request, emit an OTEL span carrying: project, model, provider, routing policy used, latency, token counts, cost, cache hit/miss. Export to a configurable OTLP endpoint (gRPC or HTTP).
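To make the span contents concrete, the following standard-library Python sketch maps one request record to a flat attribute dictionary of the kind that could be attached to an OTEL span. The `RequestRecord` type and the `routerly.*` attribute keys are assumptions made for illustration; the issue does not define attribute names, and no OpenTelemetry SDK calls are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """Hypothetical per-request data, mirroring the fields listed above."""
    project: str
    model: str
    provider: str
    routing_policy: str
    latency_seconds: float
    input_tokens: int
    output_tokens: int
    cost_usd: float
    cache_hit: bool


def span_attributes(record: RequestRecord) -> dict:
    # OTEL attribute values must be primitives (str, bool, int, float)
    # or homogeneous arrays of them, so everything is kept flat.
    return {
        "routerly.project": record.project,
        "routerly.model": record.model,
        "routerly.provider": record.provider,
        "routerly.routing_policy": record.routing_policy,
        "routerly.latency_seconds": record.latency_seconds,
        "routerly.tokens.input": record.input_tokens,
        "routerly.tokens.output": record.output_tokens,
        "routerly.cost_usd": record.cost_usd,
        "routerly.cache_hit": record.cache_hit,
    }


record = RequestRecord(
    project="demo", model="m1", provider="p1", routing_policy="cheapest",
    latency_seconds=0.42, input_tokens=120, output_tokens=30,
    cost_usd=0.0015, cache_hit=False,
)
attrs = span_attributes(record)
```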
Example configuration:

    {
      "telemetry": {
        "prometheus": { "enabled": true, "path": "/metrics", "authToken": "..." },
        "otel": { "enabled": true, "endpoint": "http://otel-collector:4317" }
      }
    }
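One way the `telemetry` block might be loaded and validated is sketched below in standard-library Python. The defaults (both exporters disabled, path `/metrics`) and the validation rules are assumptions chosen for this sketch, not behavior specified by the proposal; `load_telemetry` and the two config classes are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PrometheusConfig:
    enabled: bool = False          # assumed default: off
    path: str = "/metrics"
    auth_token: Optional[str] = None  # None means no auth required


@dataclass(frozen=True)
class OtelConfig:
    enabled: bool = False          # assumed default: off
    endpoint: Optional[str] = None


def load_telemetry(raw: str) -> Tuple[PrometheusConfig, OtelConfig]:
    doc = json.loads(raw).get("telemetry", {})
    prom = doc.get("prometheus", {})
    otel = doc.get("otel", {})

    prom_cfg = PrometheusConfig(
        enabled=bool(prom.get("enabled", False)),
        path=prom.get("path", "/metrics"),
        auth_token=prom.get("authToken"),
    )
    otel_cfg = OtelConfig(
        enabled=bool(otel.get("enabled", False)),
        endpoint=otel.get("endpoint"),
    )

    if not prom_cfg.path.startswith("/"):
        raise ValueError("telemetry.prometheus.path must start with '/'")
    if otel_cfg.enabled and not otel_cfg.endpoint:
        raise ValueError("telemetry.otel.endpoint is required when enabled")
    return prom_cfg, otel_cfg
```

Failing fast on an enabled exporter with no endpoint keeps a misconfiguration from silently dropping traces.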
Alternatives you've considered
Polling the existing /api/usage endpoint from a custom exporter. Works but requires maintaining external glue code and does not expose real-time per-request latency.
Who would benefit from this?
Any team running Routerly in production with an existing Prometheus/Grafana or Datadog stack. This is a standard requirement for infrastructure components in enterprise environments.
Additional context
LiteLLM, Bifrost, and Kong all list Prometheus and OpenTelemetry support as first-class features. Bifrost specifically promotes sub-millisecond overhead with full OTEL tracing. A /metrics endpoint is also the most common ask from developers evaluating self-hosted gateways.