GatewayBench v1
Moda Labs · 2025 · MIT
A synthetic benchmark dataset for evaluating LLM gateway systems and routing decisions. It provides 2,000 test cases with ground-truth labels across four task types: tool selection, retrieval, chat, and stress testing.
2,000 Examples · 4 Task Types · 10+ Domains · 40 Avg Tools
LLM · Benchmark · Routing · Tool-Calling · Synthetic · Evaluation