MODA

Research

Datasets & Publications

Open resources for advancing LLM infrastructure research.

GatewayBench v1

Moda Labs · 2025 · MIT

A synthetic benchmark dataset for evaluating LLM gateway systems and their routing decisions. It provides 2,000 test cases with ground-truth labels across four task types: tool selection, retrieval, chat, and stress testing.

2,000 Examples · 4 Task Types · 10+ Domains · 40 Avg Tools
LLM · Benchmark · Routing · Tool-Calling · Synthetic · Evaluation
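
As a rough illustration of how ground-truth labels like these might be used, the sketch below scores a routing function per task type. The record layout (`task_type`, `prompt`, `expected`), the tool names, and the toy examples are all hypothetical and are not taken from the GatewayBench schema, which may differ.

```python
from collections import defaultdict


def accuracy_by_task_type(examples, predict):
    """Return {task_type: fraction of examples where predict(ex) matches the label}.

    `examples` is a list of dicts using a hypothetical layout
    (task_type, prompt, expected); it is an assumption for illustration only.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["task_type"]] += 1
        if predict(ex) == ex["expected"]:
            correct[ex["task_type"]] += 1
    return {task: correct[task] / total[task] for task in total}


# Toy records invented for this sketch; not real GatewayBench data.
examples = [
    {"task_type": "tool_selection", "prompt": "What's the weather in Paris?", "expected": "get_weather"},
    {"task_type": "tool_selection", "prompt": "Convert 10 USD to EUR", "expected": "convert_currency"},
    {"task_type": "chat", "prompt": "Tell me a joke", "expected": "no_tool"},
    {"task_type": "retrieval", "prompt": "Summarize the refund policy", "expected": "search_docs"},
]


def always_no_tool(example):
    """A deliberately naive baseline router that never selects a tool."""
    return "no_tool"


scores = accuracy_by_task_type(examples, always_no_tool)
for task, score in sorted(scores.items()):
    print(f"{task}: {score:.2f}")
```

Reporting accuracy per task type rather than as a single number keeps a strong result on one category (here, chat) from hiding failures on another.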

Collaborate with us

Interested in contributing to GatewayBench or collaborating on LLM infrastructure research? We welcome feedback, benchmark results, and contributions.

Get in Touch