Hybrid Router - RouterArena Submission #148
|
/evaluate |
Router Evaluation Results
Router:
RouterArena Metrics
Optimality Metrics
Evaluation completed by RouterArena automated workflow |
|
Thanks @mikemao27 — the submission itself looks clean and reproduces on our side, so we're glad to add it. One request before we merge: please slim the PR down to the files a leaderboard submission actually needs. Per the README ("Submitting to the leaderboard"), a submission only requires:
Could you drop the rest from this PR?
For reference, a clean submission was just 5 files. Once trimmed, re-post |
|
I think the mypy pre-commit errors are in llm_inference.pipeline.py, a base repository file that this PR doesn't modify, so they shouldn't be related to my code and I'll ignore them for now. Pre-commit checks passed on earlier iterations of this PR, so I'll continue with evaluation. Let me know if this is an error that needs checking (I believe other similar PRs also hit pre-commit errors of some sort, potentially different from mine). |
|
/evaluate |
Relevant Files
hybrid-router.json - regular predictions
hybrid-router-robustness.json - robustness predictions
hybrid-router.json (config) - Router configuration
Submission Steps
router_inference/predictions/hybrid-router.json
router_inference/predictions/hybrid-router-robustness.json
router_inference/config/hybrid-router.json
/evaluate
Expected Performance
qwen3-235b-a22b, ~3.4% to ministral-3b, the remainder to qwen3-30b-a3b.
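Before commenting /evaluate, it may help to confirm locally that the three paths listed under Submission Steps exist and parse as JSON. The following is a minimal sketch, not part of RouterArena itself: the `check_submission` helper and its `repo_root` parameter are hypothetical names introduced here, and the check only covers presence and JSON validity, not the schema the evaluator expects.

```python
import json
from pathlib import Path

# Paths copied from the Submission Steps in this PR description.
REQUIRED_FILES = [
    "router_inference/predictions/hybrid-router.json",
    "router_inference/predictions/hybrid-router-robustness.json",
    "router_inference/config/hybrid-router.json",
]


def check_submission(repo_root: str = ".") -> list[str]:
    """Return a list of problems; an empty list means every file was found and parsed.

    Hypothetical helper: it does not validate the evaluator's expected schema.
    """
    problems = []
    for rel in REQUIRED_FILES:
        path = Path(repo_root) / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON: {rel} ({exc.msg})")
    return problems


if __name__ == "__main__":
    issues = check_submission()
    print("\n".join(issues) if issues else "all submission files present and parseable")
```

Run from the repository root, the script prints one line per missing or unparseable file, which makes it easier to spot a problem before triggering the automated workflow.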