Configuration

Write eval.yaml — servers, agents, scenarios, assertions, and auth.

Structure Overview

An eval file requires `agents` and `scenarios`.

Top-level `servers` exists for backward compatibility but is deprecated. Prefer scenario-owned `mcp_servers`.

eval.yaml skeleton
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

scenarios:
  - id: basic-test
    mcp_servers:
      - id: my-server
        transport: http
        url: http://localhost:3000/mcp
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]

Servers

Each server entry needs an id, a transport (http for HTTP/SSE), and the URL of the MCP endpoint.

If the endpoint requires a bearer token, add a token field. Use a literal string for a hardcoded value or a $ENV_VAR reference to read from the environment.

server with bearer token
servers:
  - id: my-server
    transport: http
    url: http://localhost:3000/mcp
    token: "my-static-token"          # literal value

  - id: prod-server
    transport: http
    url: https://api.example.com/mcp
    token: $SERVER_API_TOKEN           # reads from env
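
Because top-level `servers` is deprecated, the same token setting can be sketched on a scenario-owned `mcp_servers` entry. This assumes the token field is accepted there just as on a top-level server entry, which the text above does not state explicitly. The scenario id is hypothetical; the prompt and eval block are copied from the skeleton.

```yaml
scenarios:
  - id: token-test                       # hypothetical scenario id
    mcp_servers:
      - id: prod-server
        transport: http
        url: https://api.example.com/mcp
        token: $SERVER_API_TOKEN         # assumed to be read from the environment, as above
    prompt: Describe what you want the agent to do.
    eval:
      tool_constraints:
        required_tools: [tool_name]
```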

Agents

Each agent entry needs an id, a provider, and a model. Supported providers are anthropic, openai, and azure.

temperature defaults to 0. Lower values produce more deterministic results, which generally makes eval runs more consistent.

agents
agents:
  - id: claude
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

  - id: gpt4o
    provider: openai
    model: gpt-4o
    temperature: 0

  - id: azure-gpt
    provider: azure
    model: gpt-4o
    temperature: 0

Scenarios

Inline scenarios define their own `prompt` and `eval` and may include `mcp_servers`.

Referenced scenarios keep the test case's `prompt` and `eval` and can override only the MCP target, using `mcp_servers`.

referenced scenario with mcp override
scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-stage

Assertions

Two types of assertions are available in the eval block: tool constraints and response assertions. Each has two forms.

  • tool_constraints.required_tools — list of tool names the agent MUST call.
  • tool_constraints.forbidden_tools — list of tool names the agent MUST NOT call.
  • response_assertions type: regex — the agent response must match the regular expression in pattern.
  • response_assertions type: jsonpath — evaluate JSON output at path and optionally match equals.
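
A sketch of an eval block that combines all four forms. The field names come from the list above. The list layout of response_assertions, the tool names, the pattern, and the JSONPath expression are illustrative assumptions.

```yaml
eval:
  tool_constraints:
    required_tools: [search_orders]      # hypothetical tool name
    forbidden_tools: [delete_order]      # hypothetical tool name
  response_assertions:
    # The response text must match this regular expression.
    - type: regex
      pattern: 'order \d+'
    # The value at this path in the JSON output must equal "shipped".
    - type: jsonpath
      path: "$.status"
      equals: shipped
```

Leaving out equals on the jsonpath entry would, per the description above, only evaluate the path without comparing the result.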

Reusable Refs

Use `ref` to reference library items from `agents.yaml`, `servers.yaml`, and `test-cases/`.

library refs in eval
agents:
  - ref: claude-sonnet-46

scenarios:
  - ref: add-calculations
    mcp_servers:
      - ref: kpi-api-prod

Library Files

A library is a directory of shared agents.yaml and servers.yaml files loaded by mcplab at startup. Library items are available to all eval configs without explicit $ref — you reference them by id.

Pass --libraries-dir when starting mcplab app to point it at a library directory. See the App / Library docs for managing library content through the UI.

agents.yaml (library file)
agents:
  - id: claude-haiku
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0

  - id: gpt4o-mini
    provider: openai
    model: gpt-4o-mini
    temperature: 0

using a library agent in eval.yaml
# No agents block needed — claude-haiku comes from the library
scenarios:
  - id: basic-test
    agent: claude-haiku
    servers: [my-server]
    prompt: Complete the task.
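
The library can also hold a servers.yaml, whose schema is not shown here. The sketch below assumes it mirrors agents.yaml, with a top-level servers list built from the fields in the Servers section. The server ids match the refs used earlier; the URLs and environment variable names are placeholders.

```yaml
servers:
  - id: kpi-api-stage
    transport: http
    url: https://stage.example.com/mcp   # placeholder URL
    token: $KPI_STAGE_TOKEN              # hypothetical env var

  - id: kpi-api-prod
    transport: http
    url: https://prod.example.com/mcp    # placeholder URL
    token: $KPI_PROD_TOKEN               # hypothetical env var
```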