CLI

Configuration

Write eval.yaml — servers, agents, scenarios, assertions, and auth.

Structure Overview

An eval file requires agents and scenarios.

The top-level servers key still works for backward compatibility but is deprecated. Prefer scenario-owned mcp_servers.

eval.yaml skeleton

agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

scenarios:
  - id: basic-test
    mcp_servers:
      - id: my-server
        transport: http
        url: http://localhost:3000/mcp
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]

Servers

Each server entry needs an id, a transport (http for HTTP/SSE), and the URL of the MCP endpoint.

If the endpoint requires a bearer token, add a token field. Use a literal string for a hardcoded value or a $ENV_VAR reference to read from the environment.

server with bearer token

servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp
    token: "my-static-token"          # literal value

  - id: prod-server
    transport: http
    url: https://api.example.com/mcp
    token: $SERVER_API_TOKEN           # reads from env
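
Because top-level servers is deprecated, the same token setup can be sketched in the scenario-owned form. This is only a sketch: it assumes mcp_servers entries accept the same id, transport, url, and token fields as top-level server entries. The scenario id, prompt, and tool name are placeholders.

scenario-owned server with bearer token (sketch)

scenarios:
  - id: token-test                       # hypothetical scenario id
    mcp_servers:
      - id: prod-server
        transport: http
        url: https://api.example.com/mcp
        token: $SERVER_API_TOKEN         # assumed: token is accepted here as well
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]      # placeholder tool name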

Agents

Each agent entry needs an id, a provider, and a model. Supported providers are anthropic, openai, and azure.

temperature defaults to 0. Lower values produce more deterministic results, which is generally better for eval consistency.

agents

agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

  - id: gpt4o
    provider: openai
    model: gpt-4o
    temperature: 0

  - id: azure-gpt
    provider: azure
    model: gpt-4o
    temperature: 0

Scenarios

Inline scenarios define their own prompt and eval blocks and may include mcp_servers.

Referenced scenarios can override only the MCP target, using mcp_servers, while keeping the prompt and eval from the referenced test case.

referenced scenario with mcp override

scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-stage

Assertions

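Two types of assertions are available in the eval block: tool_constraints, which check the tools the agent calls, and response_assertions, which check the agent's response.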

tool_constraints.required_tools — list of tool names the agent MUST call.
tool_constraints.forbidden_tools — list of tool names the agent MUST NOT call.
response_assertions type: regex — the agent response must match the regular expression in pattern.
response_assertions type: jsonpath — evaluate JSON output at path and optionally match equals.
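
As a sketch, the four assertion forms might be combined in a single eval block as follows. The field names come from the list above; the list-of-entries layout under response_assertions, the tool names, the regular expression, and the JSONPath expression are illustrative assumptions.

eval block with all four assertion forms (sketch)

# nested under a scenario entry
eval:
  tool_constraints:
    required_tools: [get_kpi]            # hypothetical tool the agent must call
    forbidden_tools: [delete_kpi]        # hypothetical tool the agent must not call
  response_assertions:                   # assumed layout: one list entry per assertion
    - type: regex
      pattern: 'total: \d+'              # response must match this expression
    - type: jsonpath
      path: "$.status"                   # where to look in the JSON output
      equals: ok                         # optional expected value at that path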

Reusable Refs

Use ref to reference library items from agents.yaml, servers.yaml, and test-cases/.

library refs in eval

agents:
  - ref: claude-sonnet-46

scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-prod

Library Files

A library is a directory of shared agents.yaml and servers.yaml files loaded by mcplab at startup. Library items are available to all eval configs without explicit $ref — you reference them by id.

Pass --libraries-dir when starting mcplab app to point it at a library directory. See the App / Library docs for managing library content through the UI.

agents.yaml (library file)

agents:
  - id: claude-haiku
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

  - id: gpt4o-mini
    provider: openai
    model: gpt-4o-mini
    temperature: 0
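
A servers.yaml library file presumably follows the same pattern. The sketch below assumes it uses a top-level servers list with the same fields as server entries in eval.yaml. The ids match the refs used earlier on this page; the URLs and environment variable names are placeholders.

servers.yaml (library file, sketch)

servers:
  - id: kpi-api-stage
    transport: http
    url: https://stage.example.com/mcp   # placeholder URL
    token: $KPI_STAGE_TOKEN              # placeholder env var

  - id: kpi-api-prod
    transport: http
    url: https://api.example.com/mcp     # placeholder URL
    token: $KPI_PROD_TOKEN               # placeholder env var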

using a library agent in eval.yaml

# No agents block needed — claude-haiku comes from the library
scenarios:
  - id: basic-test
    agent: claude-haiku
    servers: [my-server]
    prompt: Complete the task.