Configuration
Write eval.yaml — servers, agents, scenarios, assertions, and auth.
Structure Overview
An eval file requires `agents` and `scenarios`.
Top-level `servers` exists for backward compatibility but is deprecated. Prefer scenario-owned `mcp_servers`.
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
scenarios:
  - id: basic-test
    mcp_servers:
      - id: my-server
        transport: http
        url: http://localhost:3000/mcp
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]

Servers
Each server entry needs an id, a transport (http for HTTP/SSE), and the URL of the MCP endpoint.
If the endpoint requires a bearer token, add a token field. Use a literal string for a hardcoded value or a $ENV_VAR reference to read from the environment.
servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp
    token: "my-static-token" # literal value
  - id: prod-server
    transport: http
    url: https://api.example.com/mcp
    token: $SERVER_API_TOKEN # reads from env

Agents
Each agent entry needs an id, a provider, and a model. Supported providers are anthropic, openai, and azure.
temperature defaults to 0. Lower values produce more deterministic results, which generally improves eval consistency.
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
  - id: gpt4o
    provider: openai
    model: gpt-4o
    temperature: 0
  - id: azure-gpt
    provider: azure
    model: gpt-4o
    temperature: 0

Scenarios
Inline scenarios define their own `prompt` and `eval` and may include `mcp_servers`.
Referenced scenarios can override only the MCP target, via `mcp_servers`, while keeping the prompt and eval of the referenced test case.
scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-stage

Assertions
Two types of assertions are available in the eval block: tool constraints and response assertions.
- tool_constraints.required_tools — list of tool names the agent MUST call.
- tool_constraints.forbidden_tools — list of tool names the agent MUST NOT call.
- response_assertions type: regex — the agent response must match the regular expression in pattern.
- response_assertions type: jsonpath — evaluate JSON output at path and optionally match equals.
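The four assertion forms above could be combined in a single scenario roughly as follows. This is a minimal sketch, not a confirmed schema: it assumes `response_assertions` is a list nested under `eval` next to `tool_constraints`, and that `pattern`, `path`, and `equals` are keys on each list item, as the descriptions above suggest. The scenario id, the `other_tool` name, the regular expression, and the JSONPath expression are all hypothetical placeholders.

```yaml
scenarios:
  - id: assertions-sketch                  # hypothetical scenario id
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]        # the agent must call this tool
        forbidden_tools: [other_tool]      # hypothetical name; the agent must not call it
      response_assertions:                 # assumed to be a list of typed assertions
        - type: regex
          pattern: 'total: \d+'            # hypothetical pattern the response must match
        - type: jsonpath
          path: "$.result"                 # hypothetical path into the JSON output
          equals: 42                       # optional expected value at that path
```

Single-quoting the regex keeps YAML from interpreting the backslash, and quoting the JSONPath avoids any ambiguity around the leading `$`.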
Reusable Refs
Use `ref` to reference library items from `agents.yaml`, `servers.yaml`, and `test-cases/`.
agents:
  - ref: claude-sonnet-46
scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-prod

Library Files
A library is a directory of shared agents.yaml and servers.yaml files loaded by mcplab at startup. Library items are available to all eval configs without explicit $ref — you reference them by id.
Pass --libraries-dir when starting mcplab app to point it at a library directory. See the App / Library docs for managing library content through the UI.
agents:
  - id: claude-haiku
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0
  - id: gpt4o-mini
    provider: openai
    model: gpt-4o-mini
    temperature: 0

# No agents block needed: claude-haiku comes from the library
scenarios:
  - id: basic-test
    agent: claude-haiku
    servers: [my-server]
    prompt: Complete the task.
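The scenario above also refers to `my-server` by id, so that server would need to come from the library as well. A matching library `servers.yaml` might look like the sketch below. It assumes library server files use the same `servers:` list shape and fields shown in the Servers section, which is not confirmed here; the URL is the same local placeholder used in the earlier examples.

```yaml
# servers.yaml in the library directory (assumed layout, mirrors the Servers section)
servers:
  - id: my-server                    # id referenced by the scenario's servers list
    transport: http
    url: http://localhost:3000/mcp   # placeholder endpoint from the earlier examples
```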