Configuration
Write eval.yaml — servers, agents, scenarios, assertions, and auth.
Structure Overview
An eval file has three required top-level keys: servers, agents, and scenarios.
servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
scenarios:
  - id: basic-test
    servers: [my-server]
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]
Servers
Each server entry needs an id, a transport (http for HTTP/SSE), and the URL of the MCP endpoint.
If the endpoint requires a bearer token, add a token field. Use a literal string for a hardcoded value or a $ENV_VAR reference to read from the environment.
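The two token forms can be illustrated with a short Python sketch. This is not mcplab's implementation: the helper names `resolve_token` and `auth_headers` are hypothetical, and raising an error for an unset variable is an assumption, since the behaviour in that case is not documented here.

```python
import os


def resolve_token(raw, env=None):
    """Return the bearer token for a server entry's `token` field.

    Assumption: a value starting with "$" names an environment variable;
    any other value is used as a literal string.
    """
    env = os.environ if env is None else env
    if raw is None:
        return None  # no token field on this server entry
    if raw.startswith("$"):
        name = raw[1:]
        if name not in env:
            # Assumed behaviour: fail loudly rather than send an empty token.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    return raw


def auth_headers(server, env=None):
    """Build HTTP headers for a server entry (hypothetical helper)."""
    token = resolve_token(server.get("token"), env)
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Passing `env` explicitly keeps the sketch easy to test without touching the real environment.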
servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp
    token: "my-static-token" # literal value
  - id: prod-server
    transport: http
    url: https://api.example.com/mcp
    token: $SERVER_API_TOKEN # reads from env
Agents
Each agent entry needs an id, a provider, and a model. Supported providers are anthropic, openai, and azure.
temperature defaults to 0. Lower values produce more deterministic results, which generally makes eval runs more consistent.
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
  - id: gpt4o
    provider: openai
    model: gpt-4o
    temperature: 0
  - id: azure-gpt
    provider: azure
    model: gpt-4o
    temperature: 0
Scenarios
Each scenario has an id, a list of servers to give the agent access to, a prompt describing the task, and an eval block with assertions.
The agent field is optional. When it is omitted, all agents in the config run the scenario.
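The agent selection rule can be sketched in Python as follows. The function name `agents_for_scenario` is hypothetical, the config is assumed to be already parsed into dictionaries, and treating an unknown agent id as an error is an assumption rather than documented behaviour.

```python
def agents_for_scenario(config, scenario):
    """Return the agent entries that should run `scenario`.

    With no `agent` field, every agent in the config is selected;
    otherwise only the agent whose id matches.
    """
    agents = config.get("agents", [])
    wanted = scenario.get("agent")
    if wanted is None:
        return list(agents)
    matches = [a for a in agents if a["id"] == wanted]
    if not matches:
        # Assumed behaviour: an unknown id is a configuration error.
        raise ValueError(f"scenario {scenario['id']!r}: unknown agent {wanted!r}")
    return matches
```

For a config with agents claude and gpt4o, a scenario without an agent field would run twice, once per agent.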
scenarios:
  - id: weather-lookup
    servers: [weather-server]
    prompt: What is the current weather in Amsterdam?
    eval:
      tool_constraints:
        required_tools: [get_weather]
        forbidden_tools: [send_email]
      response_assertions:
        - type: regex
          pattern: "Amsterdam"
        - type: contains
          value: "temperature"
Assertions
Two kinds of assertions are available in the eval block, tool constraints and response assertions, each with two forms:
- tool_constraints.required_tools — list of tool names the agent MUST call.
- tool_constraints.forbidden_tools — list of tool names the agent MUST NOT call.
- response_assertions type: regex — the agent response must match the regular expression in pattern.
- response_assertions type: contains — the agent response must contain the exact string in value.
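The four checks above can be sketched as one Python function. This is an illustration of the documented semantics, not mcplab's evaluator: `check_scenario` and its arguments are hypothetical, and the regex is assumed to be an unanchored search, as the "Amsterdam" pattern in the scenario example suggests.

```python
import re


def check_scenario(eval_block, called_tools, response):
    """Return a list of failure messages; an empty list means a pass."""
    failures = []
    called = set(called_tools)
    constraints = eval_block.get("tool_constraints", {})

    # Every required tool must appear among the tools the agent called.
    for tool in constraints.get("required_tools", []):
        if tool not in called:
            failures.append(f"required tool not called: {tool}")

    # No forbidden tool may appear among them.
    for tool in constraints.get("forbidden_tools", []):
        if tool in called:
            failures.append(f"forbidden tool called: {tool}")

    for assertion in eval_block.get("response_assertions", []):
        kind = assertion["type"]
        if kind == "regex":
            # Assumption: the pattern may match anywhere in the response.
            if re.search(assertion["pattern"], response) is None:
                failures.append(f"regex did not match: {assertion['pattern']}")
        elif kind == "contains":
            # Exact, case-sensitive substring test.
            if assertion["value"] not in response:
                failures.append(f"response lacks: {assertion['value']}")
        else:
            failures.append(f"unknown assertion type: {kind}")
    return failures
```

Collecting every failure instead of stopping at the first one makes a failed run easier to diagnose.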
Reusable Refs
Use $ref to reference a server or agent definition from a separate file instead of repeating it across configs.
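A reference string pairs a file name with an id, separated by #, for example servers.yaml#my-server. The Python sketch below shows one way such references could be resolved. The names `split_ref` and `resolve_entries` are hypothetical, the referenced files are assumed to be already parsed into dictionaries, and the error handling is an assumption.

```python
def split_ref(ref):
    """Split a "<file>#<id>" reference into its file name and id."""
    path, sep, item_id = ref.partition("#")
    if not sep or not path or not item_id:
        raise ValueError(f"expected '<file>#<id>', got {ref!r}")
    return path, item_id


def resolve_entries(entries, kind, files):
    """Replace {"$ref": ...} entries with the definitions they point to.

    `kind` is the top-level key to search ("servers" or "agents").
    `files` maps a file name to its parsed content, for example
    {"servers.yaml": {"servers": [...]}}. Reading YAML is left out.
    """
    resolved = []
    for entry in entries:
        if "$ref" not in entry:
            resolved.append(entry)  # inline definition, keep as written
            continue
        path, item_id = split_ref(entry["$ref"])
        candidates = files[path][kind]
        match = next((c for c in candidates if c["id"] == item_id), None)
        if match is None:
            raise KeyError(f"{item_id!r} not found under {kind!r} in {path}")
        resolved.append(match)
    return resolved
```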
# servers.yaml
servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp

# eval.yaml
servers:
  - $ref: servers.yaml#my-server
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
scenarios:
  - id: basic-test
    servers: [my-server]
    prompt: Complete the task.
Library Files
A library is a directory of shared agents.yaml and servers.yaml files that mcplab loads at startup. Library items are available to all eval configs without an explicit $ref: you reference them by id.
Pass --libraries-dir when starting mcplab app to point it at a library directory. See the App / Library docs for managing library content through the UI.
# agents.yaml in the library directory
agents:
  - id: claude-haiku
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
  - id: gpt4o-mini
    provider: openai
    model: gpt-4o-mini
    temperature: 0

# No agents block needed: claude-haiku comes from the library
scenarios:
  - id: basic-test
    agent: claude-haiku
    servers: [my-server]
    prompt: Complete the task.
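Lookup by id across an eval config and a library can be sketched in Python as below. The function name `find_agent` is hypothetical, both inputs are assumed to be already parsed into dictionaries, and letting a config definition take precedence over a library item with the same id is an assumption, since the precedence is not stated here.

```python
def find_agent(agent_id, config, library):
    """Find an agent definition by id in the config, then in the library.

    Assumption: an agent defined in the eval config shadows a library
    agent that has the same id.
    """
    for source in (config.get("agents", []), library.get("agents", [])):
        for agent in source:
            if agent["id"] == agent_id:
                return agent
    raise KeyError(f"agent {agent_id!r} not found in config or library")


# Library content as it would look after parsing agents.yaml.
LIBRARY = {
    "agents": [
        {"id": "claude-haiku", "provider": "anthropic",
         "model": "claude-haiku-4-5-20251001", "temperature": 0},
        {"id": "gpt4o-mini", "provider": "openai",
         "model": "gpt-4o-mini", "temperature": 0},
    ]
}

# Eval config with no agents block; the scenario names a library agent.
CONFIG = {
    "scenarios": [
        {"id": "basic-test", "agent": "claude-haiku",
         "servers": ["my-server"], "prompt": "Complete the task."}
    ]
}

agent = find_agent(CONFIG["scenarios"][0]["agent"], CONFIG, LIBRARY)
```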