WebTest AI Config and API Reference

This document is the single reference for WebTest AI configuration, CLI commands, Markdown spec fields, step language, reports, artifacts, and programmatic APIs.

The short rule:

Specs describe user intent and expected behavior.
Config describes environment, runtime policy, driver selection, reporting, healing, memory, models, and auth defaults.
CLI flags select a run and override operational choices for one invocation.

Configuration Loading

WebTest AI loads JSON config from webtest-ai.config.json by default. Pass --config <path> to use another file. A missing config file is not an error; built-in defaults are used instead.

Config is deep-merged over defaults. Arrays replace default arrays instead of merging element-by-element.
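The merge rule above can be illustrated with a minimal sketch. The helper names here (isPlainObject, deepMerge) are hypothetical and are not the actual exports of src/config/loadConfig.js, which may be implemented differently; the sketch only shows the stated behavior: nested objects merge, arrays and scalars replace.

```javascript
// Hypothetical helpers for illustration; the real loader may differ.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    if (isPlainObject(value) && isPlainObject(defaults[key])) {
      // Nested objects are merged key by key.
      result[key] = deepMerge(defaults[key], value);
    } else {
      // Arrays and scalars replace the default wholesale.
      result[key] = value;
    }
  }
  return result;
}

const defaultsSample = {
  execution: { retries: 0, skip: { tags: [] } },
  models: { writePolicy: { roots: ["specs", "artifacts", ".webtest-ai"] } },
};
const userConfigSample = {
  execution: { retries: 2 },
  models: { writePolicy: { roots: ["specs"] } },
};
const mergedSample = deepMerge(defaultsSample, userConfigSample);
// mergedSample.execution.skip.tags is inherited from the defaults.
// mergedSample.models.writePolicy.roots is ["specs"]: the array was replaced, not merged.
```

A practical consequence: overriding reporting.redact.headers with a shorter list drops the default header names, so a custom list should repeat any defaults that are still wanted.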

Implemented loader:

src/config/loadConfig.js
public helpers: loadConfig(configPath), getDefaultConfig()

Default shape:

{
  "browser": null,
  "execution": {
    "timeouts": {
      "testTimeout": 30000,
      "navigationTimeout": 30000,
      "actionTimeout": 10000,
      "expectTimeout": 10000,
      "requestTimeout": 15000
    },
    "retries": 0,
    "skip": {
      "tags": []
    },
    "only": {
      "enabled": false,
      "tags": []
    },
    "artifacts": {
      "trace": "retain-on-failure",
      "screenshot": "only-on-failure"
    },
    "journey": {
      "enabled": false,
      "capture": "navigation",
      "maxSnapshots": 20
    },
    "quality": {
      "a11y": {
        "mode": "warn",
        "failOnSerious": true,
        "maxSerious": 0,
        "captureArtifactsOnWarning": true
      },
      "vitals": {
        "mode": "warn",
        "lcpMaxMs": 2500,
        "clsMax": 0.1,
        "percentiles": ["p75"],
        "captureArtifactsOnWarning": true
      }
    }
  },
  "reporting": {
    "redact": {
      "enabled": true,
      "headers": ["authorization", "cookie", "set-cookie", "x-api-key"],
      "queryParams": ["token", "session", "email", "password"],
      "patterns": [
        {
          "name": "email",
          "regex": "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}",
          "flags": "gi",
          "replacement": "[REDACTED_EMAIL]"
        },
        {
          "name": "bearer",
          "regex": "Bearer\\s+[A-Za-z0-9._-]+",
          "flags": "g",
          "replacement": "Bearer [REDACTED_TOKEN]"
        }
      ]
    },
    "network": {
      "excludeUrls": []
    }
  },
  "driver": {
    "name": "playwright",
    "package": null,
    "path": null,
    "require": [],
    "options": {}
  },
  "targets": {},
  "intent": {
    "outcomeConfidenceThreshold": 0.8,
    "memoryPath": ".webtest-ai/intent-plans.json",
    "memoryConfidenceThreshold": 0.9,
    "memoryStaleAfterDays": null,
    "visionEvidence": false
  },
  "healing": {
    "enabled": false,
    "mode": "bounded",
    "confidenceThreshold": 0.8,
    "maxAttemptsPerStep": 1,
    "snapshot": {
      "maxCandidates": 20,
      "maxTextLength": 120,
      "maxNearbyTextLength": 280
    }
  },
  "memory": {
    "enabled": true,
    "path": ".webtest-ai/ui-inventory.json",
    "mode": "propose",
    "maxCandidatesPerIntent": 5,
    "staleAfterDays": 60
  },
  "visual": {
    "baselineDir": "visual-baselines",
    "updateBaselines": false,
    "mode": "fail"
  },
  "models": {
    "activeProfile": null,
    "profiles": {},
    "writePolicy": {
      "roots": ["specs", "artifacts", ".webtest-ai"],
      "extensions": [".md", ".json", ".js"]
    }
  }
}

Browser Config

browser: Optional default browser override. Supported aliases are chromium, chrome, google-chrome, edge, msedge, firefox, webkit, and safari. Resolution order is CLI --browser, config browser, suite/test browsers, then chromium.
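The resolution order can be sketched as a simple fallback chain. The function and parameter names are hypothetical, alias normalization (for example mapping msedge to edge) is deliberately left out because the mapping is not described here, and the sketch assumes the first suite browser is the one used.

```javascript
// Hypothetical sketch of the documented order:
// CLI --browser, then config browser, then suite browsers, then chromium.
function resolveBrowserSketch({ cliBrowser, configBrowser, suiteBrowsers } = {}) {
  if (cliBrowser) return cliBrowser;
  if (configBrowser) return configBrowser;
  if (Array.isArray(suiteBrowsers) && suiteBrowsers.length > 0) {
    // Assumption: only the first listed suite browser is considered.
    return suiteBrowsers[0];
  }
  return "chromium";
}
```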

Target Matrix Config

targets: Optional named target matrix for whitelabel, environment, or locale variants. A target can define baseUrl, locale, brand, optional app, and optional intentAliases.

targets.<name>.baseUrl: Base URL used for relative Open steps when that target is selected.

targets.<name>.intentAliases: Optional resolver hints. Keys are canonical intents from specs or generated plans, and values are target-local labels or translated phrases. These hints help bounded intent resolution; they are not exact assertion requirements.

Select targets with suite front matter targets: [brand-a], per-test metadata targets: [brand-a], or CLI --target brand-a,brand-b. CLI target selection overrides suite/test target lists for that run.
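As a rough illustration of that precedence, the sketch below picks target names from the CLI value first, then per-test metadata, then suite front matter, and looks each one up in config.targets. All function names are hypothetical, and throwing on an unknown target name is an assumption rather than documented behavior.

```javascript
// Hypothetical sketch; not the actual WebTest AI selection code.
function parseTargetListSketch(value) {
  return value ? value.split(",").map((name) => name.trim()).filter(Boolean) : [];
}

function selectTargetsSketch({ cliTarget, testTargets = [], suiteTargets = [], configTargets = {} }) {
  const cliNames = parseTargetListSketch(cliTarget);
  // CLI selection overrides test metadata, which overrides suite front matter.
  const names = cliNames.length > 0 ? cliNames : testTargets.length > 0 ? testTargets : suiteTargets;
  return names.map((name) => {
    const target = configTargets[name];
    // Assumption: an unknown name is treated as an error.
    if (!target) throw new Error(`Unknown target: ${name}`);
    return { name, ...target };
  });
}

const sampleTargets = {
  "brand-a": { baseUrl: "https://a.example", locale: "en" },
  "brand-b": { baseUrl: "https://b.example", locale: "fr" },
};
const selectedSample = selectTargetsSketch({
  cliTarget: "brand-b",
  suiteTargets: ["brand-a"],
  configTargets: sampleTargets,
});
```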

Intent Config

intent.outcomeConfidenceThreshold: Minimum model confidence for Assert outcome "<intent>". Defaults to 0.8.

intent.memoryPath: Reviewed bounded-plan memory path. Defaults to .webtest-ai/intent-plans.json.

intent.memoryConfidenceThreshold: Minimum confidence for reusing a reviewed plan before calling the model. Defaults to 0.9.

intent.memoryStaleAfterDays: Optional review window for approved intent-plan memory. When set, webtest-ai intent-memory stale and prune-stale use lastSeenAt to list or remove reviewed plans older than this many days. Defaults to null, which disables stale-plan pruning.
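A minimal sketch of the staleness check, assuming each reviewed plan carries an ISO-8601 lastSeenAt timestamp. The function name and the intent field on the sample plans are hypothetical; only lastSeenAt and the null-disables rule come from the description above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns the plans older than the review window.
function findStalePlansSketch(plans, staleAfterDays, now = new Date()) {
  // null (the default) disables stale-plan detection entirely.
  if (staleAfterDays == null) return [];
  const cutoff = now.getTime() - staleAfterDays * DAY_MS;
  return plans.filter((plan) => new Date(plan.lastSeenAt).getTime() < cutoff);
}

const samplePlans = [
  { intent: "log in", lastSeenAt: "2024-01-01T00:00:00Z" },
  { intent: "open cart", lastSeenAt: "2024-06-20T00:00:00Z" },
];
const staleSample = findStalePlansSketch(samplePlans, 30, new Date("2024-06-30T00:00:00Z"));
```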

intent.visionEvidence: When true, Assert outcome captures a bounded screenshot artifact and includes it in semantic outcome evidence only if the active model profile declares capabilities.vision: true. This requires the active driver to advertise screenshots. Defaults to false; text, URL, network, alias, and a11y evidence remain the default path.

Intent mode is enabled per suite or test with modelMode: intent. In that mode, ## Goal is executable. WebTest AI:

collects bounded page evidence such as URL, page fingerprint, visible text, accessibility candidates, and DOM action candidates
asks the configured model profile for a bounded plan
validates that plan
converts it to supported WebTest AI actions
executes those actions through the normal driver
runs ## Expected assertions

The model can only choose from allowed action types and approved ## Data/environment-backed values.

Execution Selection And Timeout Config

execution.timeouts.testTimeout: Reserved per-test timeout policy. Defaults to 30000.

execution.timeouts.navigationTimeout: Timeout for Open, Assert url, and Wait for url actions. Defaults to 30000.

execution.timeouts.actionTimeout: Timeout for click, fill, submit, screenshot, frame, and text assertion actions. Defaults to 10000.

execution.timeouts.expectTimeout: Reserved assertion timeout policy. Defaults to 10000.

execution.timeouts.requestTimeout: Timeout for Wait for network actions. Defaults to 15000.

execution.retries: Default retry count for failed tests. Defaults to 0. Test retryPolicy overrides suite defaults.retryPolicy, and both override this config value.
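The override chain can be sketched with nullish coalescing. This assumes retryPolicy is a plain retry count, which is not confirmed here; the function and parameter names are hypothetical.

```javascript
// Hypothetical sketch: test retryPolicy, then suite defaults.retryPolicy,
// then execution.retries from config, then the built-in default of 0.
function resolveRetriesSketch({ testRetryPolicy, suiteDefaults, config } = {}) {
  return testRetryPolicy ?? suiteDefaults?.retryPolicy ?? config?.execution?.retries ?? 0;
}
```

Using ?? rather than || keeps an explicit test-level 0 from falling through to a higher suite or config value.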

execution.skip.tags: Tags skipped in addition to the built-in skip tag.

execution.only.enabled: Forces only-mode for the selected run. In only-mode, a test runs only if it carries the only tag or a tag listed in execution.only.tags.

execution.only.tags: Additional tags treated as only-mode selectors.
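The skip and only settings can be sketched as a filter over normalized tests. The function name is hypothetical, and two points are assumptions: that a skip tag wins when a test carries both a skip tag and an only tag, and that only-mode is applied solely when execution.only.enabled is true (other triggers are not covered by this sketch).

```javascript
// Hypothetical sketch of tag-based run selection.
function selectRunnableTestsSketch(tests, execution) {
  const skipTags = new Set(["skip", ...execution.skip.tags]);
  const onlyTags = new Set(["only", ...execution.only.tags]);
  return tests.filter((test) => {
    // Assumption: skip takes precedence over only.
    if (test.tags.some((tag) => skipTags.has(tag))) return false;
    if (execution.only.enabled) return test.tags.some((tag) => onlyTags.has(tag));
    return true;
  });
}

const sampleExecution = {
  skip: { tags: ["flaky"] },
  only: { enabled: true, tags: ["smoke"] },
};
const sampleTests = [
  { name: "login", tags: ["smoke"] },
  { name: "checkout", tags: ["smoke", "flaky"] },
  { name: "profile", tags: ["regression"] },
];
const runnableSample = selectRunnableTestsSketch(sampleTests, sampleExecution);
```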

Reporting Config

reporting.redact.enabled: Enables redaction in WebTest AI-owned JSON reports, HTML reports, network summaries, console errors, step output, healing traces, healing snapshots, and generated memory proposals. Defaults to true.

reporting.redact.headers: Header names whose values are replaced with [REDACTED] when header-like blocks are rendered.

reporting.redact.queryParams: Query parameter names whose values are replaced with [REDACTED] in URLs.

reporting.redact.patterns: Regex replacements applied to text fields. Each entry supports name, regex, flags, and replacement.

reporting.network.excludeUrls: URL substrings omitted from WebTest AI-owned network summaries and HTML trace preview tables.
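The pattern and query-parameter rules can be sketched as follows. The function names are hypothetical, the case-insensitive parameter match is an assumption, and the real redactor may apply rules in a different order. The two patterns are the defaults shown in the default config shape.

```javascript
// Hypothetical sketch: apply each configured regex replacement in order.
function redactTextSketch(text, patterns) {
  return patterns.reduce(
    (out, p) => out.replace(new RegExp(p.regex, p.flags), p.replacement),
    text
  );
}

// Hypothetical sketch: replace values of listed query parameters.
function redactUrlSketch(rawUrl, queryParams) {
  const url = new URL(rawUrl);
  const blocked = new Set(queryParams.map((name) => name.toLowerCase()));
  for (const name of [...url.searchParams.keys()]) {
    // Assumption: parameter names are matched case-insensitively.
    if (blocked.has(name.toLowerCase())) url.searchParams.set(name, "[REDACTED]");
  }
  return url.toString();
}

const defaultPatternsSample = [
  { name: "email", regex: "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}", flags: "gi", replacement: "[REDACTED_EMAIL]" },
  { name: "bearer", regex: "Bearer\\s+[A-Za-z0-9._-]+", flags: "g", replacement: "Bearer [REDACTED_TOKEN]" },
];
const redactedTextSample = redactTextSketch("mail qa@example.com with Bearer abc.def-123", defaultPatternsSample);
const redactedUrlSample = redactUrlSketch("https://app.example/login?token=abc&page=2", ["token"]);
```

Note that URL serialization percent-encodes the square brackets, so the redacted value appears as %5BREDACTED%5D in the serialized string and decodes back to [REDACTED].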

Privacy boundary:

Redaction applies to WebTest AI-owned outputs.
Raw Playwright trace archives may still contain original captured data.
If raw artifact privacy is required, restrict trace retention, for example by setting execution.artifacts.trace to off.

Execution Quality Config

execution.quality.a11y.mode: off, warn, or fail. Defaults to warn.

execution.quality.a11y.failOnSerious: When true, serious or critical axe violations create a quality finding. Defaults to true.

execution.quality.a11y.maxSerious: Allowed serious/critical axe violations before a finding is emitted. Defaults to 0.

execution.quality.a11y.captureArtifactsOnWarning: Reserved policy field for warning-time evidence capture. Defaults to true.

execution.quality.vitals.mode: off, warn, or fail. Defaults to warn.

execution.quality.vitals.lcpMaxMs: Largest Contentful Paint threshold in milliseconds. Defaults to 2500.

execution.quality.vitals.clsMax: Cumulative Layout Shift threshold. Defaults to 0.1.

execution.quality.vitals.percentiles: Reserved reporting field. Defaults to ["p75"].

execution.quality.vitals.captureArtifactsOnWarning: Reserved policy field for warning-time evidence capture. Defaults to true.

Quality findings are collected through driver session capabilities. When a quality check is enabled, the active driver must advertise the matching a11y or vitals capability or the test is skipped with a clear missing-capability reason. warn records findings without changing functional status; fail turns an otherwise-passing run into a failed result with qualityFailures.
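A minimal sketch of how the a11y threshold and the warn/fail modes could combine. The function names, the finding shape, and the "passed"/"failed" status strings are assumptions; qualityFailures is the field named above, and impact is the severity field reported on axe violations.

```javascript
// Hypothetical sketch: emit a finding when serious/critical violations exceed maxSerious.
function a11yFindingsSketch(violations, policy) {
  if (policy.mode === "off" || !policy.failOnSerious) return [];
  const serious = violations.filter((v) => v.impact === "serious" || v.impact === "critical");
  return serious.length > policy.maxSerious
    ? [{ kind: "a11y", count: serious.length, allowed: policy.maxSerious }]
    : [];
}

// Hypothetical sketch: warn keeps the functional status, fail flips a passing result.
function applyQualityModeSketch(functionalStatus, findings, mode) {
  if (mode === "fail" && functionalStatus === "passed" && findings.length > 0) {
    return { status: "failed", qualityFailures: findings };
  }
  return { status: functionalStatus, qualityFindings: mode === "off" ? [] : findings };
}

const sampleViolations = [{ id: "color-contrast", impact: "serious" }, { id: "region", impact: "moderate" }];
const samplePolicy = { mode: "warn", failOnSerious: true, maxSerious: 0 };
const sampleFindings = a11yFindingsSketch(sampleViolations, samplePolicy);
```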

Execution Artifact And Profile Config

execution.artifacts.trace: off, always, or retain-on-failure. Defaults to retain-on-failure.

execution.artifacts.screenshot: off, always, or only-on-failure. Defaults to only-on-failure.

execution.journey.enabled: Enables journey snapshots. Defaults to false.

execution.journey.capture: navigation captures after URL-changing passed actions. step captures after every passed action. off disables capture.

execution.journey.maxSnapshots: Maximum journey screenshots per test. Defaults to 20.

Runtime profiles are shortcut overlays:

fast: disables traces, journey, a11y, and vitals; keeps screenshots only on failure.
ci: keeps traces and screenshots only on failure; leaves quality policy from config.
debug: keeps trace and screenshot always, enables step-level journey snapshots, and keeps quality in warn mode.
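One way to picture the three profiles is as partial config objects layered over the loaded config. The overlay table below is inferred from the descriptions above, not copied from the implementation, and the names PROFILE_OVERLAYS_SKETCH and overlayProfileSketch are hypothetical.

```javascript
// Hypothetical overlay table inferred from the profile descriptions.
const PROFILE_OVERLAYS_SKETCH = {
  fast: {
    execution: {
      artifacts: { trace: "off", screenshot: "only-on-failure" },
      journey: { enabled: false },
      quality: { a11y: { mode: "off" }, vitals: { mode: "off" } },
    },
  },
  ci: {
    // Quality policy is intentionally absent so the config value is kept.
    execution: { artifacts: { trace: "retain-on-failure", screenshot: "only-on-failure" } },
  },
  debug: {
    execution: {
      artifacts: { trace: "always", screenshot: "always" },
      journey: { enabled: true, capture: "step" },
      quality: { a11y: { mode: "warn" }, vitals: { mode: "warn" } },
    },
  },
};

function overlayProfileSketch(config, profileName) {
  const overlay = PROFILE_OVERLAYS_SKETCH[profileName];
  if (!overlay) throw new Error(`Unknown profile: ${profileName}`);
  const apply = (base, patch) => {
    const out = { ...base };
    for (const [key, value] of Object.entries(patch)) {
      const nested = value !== null && typeof value === "object" && !Array.isArray(value);
      out[key] = nested ? apply(base?.[key] ?? {}, value) : value;
    }
    return out;
  };
  return apply(config, overlay);
}
```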

Driver Config

driver.name: Execution driver name. Built-in values are cdp, chrome-cdp, playwright, and fake for tests. The current compatibility default is playwright; intent-oriented v2 configs should select cdp explicitly when Chromium/CDP is the desired primary path.

driver.path: Optional local module path for a custom driver.

driver.package: Optional package name for a custom driver.

driver.require: Capability names that must be supported by the active driver. A run fails early if the driver does not advertise a required capability.

driver.options: Driver-specific options passed through to built-in or custom driver factories.

The built-in CDP driver talks directly to the Chrome DevTools Protocol. Selecting driver.name: "cdp" or "chrome-cdp" means WebTest AI execution calls the CDP driver directly; it does not route actions through Playwright. The CDP driver supports the same Chromium WebTest AI workflow surface as the Playwright driver: open, click, fill, submit, text/URL assertions, semantic outcome assertions, network waits, screenshots, visual checkpoints, CDP JSON traces, axe scans, Web Vitals, popup target switching, cross-origin frame execution contexts, bounded healing snapshots, and reusable browser storage state. It also exposes a text-first accessibility/action surface for intent resolution and semantic outcome evidence. CDP auth state consumes the same saved-state file used by Playwright-compatible auth flows: cookies plus per-origin localStorage, with sessionStorage preserved when present.

The built-in Playwright driver remains available for multi-engine coverage, Playwright trace artifacts, richer debugger workflows, and helper UI-login bootstrapping. Non-Chromium browsers remain Playwright-only because CDP is a Chromium protocol. Custom drivers should export a function, createDriver, or launchDriver, and return a driver object with capability flags and a createSession() method.
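A custom driver module might look roughly like the sketch below. Only the createDriver export name and the createSession() method come from the description above; the capabilities object layout, the name field, and the session's close() method are assumptions made for illustration, so check the driver contract in the source before relying on them.

```javascript
// my-driver.js: hypothetical module that driver.path could point to.
function createDriver(options = {}) {
  return {
    name: "my-driver", // assumption: drivers expose a name
    // Assumption: capability flags are booleans keyed by capability name.
    capabilities: { actions: true, assertions: true, screenshots: false },
    async createSession() {
      // Assumption: a session object with its own lifecycle; methods are placeholders.
      return {
        options,
        async close() {},
      };
    },
  };
}

module.exports = { createDriver };
```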

Example built-in CDP config:

{
  "driver": {
    "name": "cdp",
    "require": ["actions", "assertions", "screenshots"],
    "options": {
      "executablePath": "/path/to/chrome"
    }
  }
}

The CDP driver may use the Playwright package only to locate the bundled Chromium executable when no system Chrome path is configured; execution still goes through CDP, not Playwright APIs.

Existing browser endpoint route:

If an IDE, agent environment, or manually launched Chrome exposes a Chrome DevTools Protocol WebSocket endpoint, WebTest AI can connect to it through the CDP driver instead of launching a new browser:

{
  "driver": {
    "name": "cdp",
    "options": {
      "wsEndpoint": "ws://127.0.0.1:9222/devtools/browser/<browser-id>"
    }
  }
}

Endpoint resolution order is driver.options.wsEndpoint, driver.options.endpoint, WEBTEST_AI_CDP_WS_ENDPOINT, then CURSOR_BROWSER_WS_ENDPOINT. Cursor's public browser-tool docs do not currently document a CDP WebSocket endpoint; WebTest AI can drive the Cursor browser only if such an endpoint is exposed.
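The endpoint resolution order can be sketched as a fallback chain. The function name is hypothetical, and returning null to mean "no endpoint, launch a browser instead" is an assumption.

```javascript
// Hypothetical sketch of the documented endpoint resolution order.
function resolveCdpEndpointSketch(options = {}, env = process.env) {
  return (
    options.wsEndpoint ||
    options.endpoint ||
    env.WEBTEST_AI_CDP_WS_ENDPOINT ||
    env.CURSOR_BROWSER_WS_ENDPOINT ||
    null // assumption: null means no existing endpoint was found
  );
}
```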

Capability-aware execution:

actions: open, click, fill, and submit steps.
assertions: text and URL assertions.
screenshots: explicit Capture screenshot steps.
network: Wait for network steps.
frames: frame-targeted steps. CDP advertises cross-origin when it can execute through frame-specific runtime contexts.
popups: page switching.
authState: tests with auth metadata and reusable browser storage state.
healingSnapshots: opt-in bounded healing for Click action, Click element, Click text, and Click role.
a11ySurface: accessibility-tree action surface used before DOM locator fallback when the driver provides it.
a11y: axe-based accessibility scanning.
vitals: LCP/CLS collection.

When a test needs a capability that the active driver does not advertise, WebTest AI records the test as skipped with a clear reason.
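The two capability checks (fail early for driver.require, skip per test otherwise) can be sketched as below. The function names, the boolean capabilities map on the driver, and the needs list on the test are assumptions; how WebTest AI derives a test's needed capabilities is not shown here.

```javascript
// Hypothetical sketch: which requested capabilities the driver does not advertise.
function missingCapabilitiesSketch(driver, needed) {
  return needed.filter((name) => !driver.capabilities?.[name]);
}

// driver.require: the run fails early when a required capability is missing.
function assertRequiredCapabilitiesSketch(driver, required) {
  const missing = missingCapabilitiesSketch(driver, required);
  if (missing.length > 0) {
    throw new Error(`Driver "${driver.name}" is missing required capabilities: ${missing.join(", ")}`);
  }
}

// Per-test needs: the test is recorded as skipped with a reason.
function planTestSketch(driver, test) {
  const missing = missingCapabilitiesSketch(driver, test.needs);
  return missing.length > 0
    ? { status: "skipped", reason: `driver does not advertise: ${missing.join(", ")}` }
    : { status: "runnable" };
}

const smallDriverSample = { name: "fake", capabilities: { actions: true, assertions: true } };
```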

Healing Config

healing.enabled: Enables bounded runtime healing for supported actions. Defaults to false.

healing.mode: bounded is currently the only implemented mode. The field is reserved for future expansion.

healing.confidenceThreshold: Minimum candidate confidence required for a healed candidate. Defaults to 0.8.

healing.maxAttemptsPerStep: Reserved policy field. Defaults to 1.

healing.snapshot.maxCandidates: Maximum visible interactive candidates captured in one healing snapshot.

healing.snapshot.maxTextLength: Maximum length for candidate text-like fields.

healing.snapshot.maxNearbyTextLength: Maximum nearby context text length per candidate.

Implemented healing applies to Click action "<intent>" and bounded fallback recovery for Click element, Click text, and Click role after deterministic click failure. When a11ySurface is available, click resolution and healing prefer bounded accessibility-tree candidates before DOM fallback; if deterministic ranking is tied or too low-confidence and a model profile is enabled, WebTest AI asks the sidecar to rank only the collected candidates. This lets canonical steps such as Click action "open modal" resolve to branded controls such as Launch preview without writing the exact label into the spec. Click action requires healingSnapshots when healing is enabled; explicit click steps run deterministically on smaller drivers and use healing fallback only when snapshots are available. Healing ranks only candidates collected by WebTest AI or the active driver, refuses unsafe or ambiguous candidates, ignores non-actionable context nodes, and emits reviewed memory/spec patch artifacts after verified recovery.
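The threshold and ambiguity rules can be sketched as a small selection function. The candidate shape (label, confidence), the function name, and the reason strings are hypothetical; the sketch simply refuses when the best candidate is below healing.confidenceThreshold or tied with the runner-up, and leaves the optional model ranking step out.

```javascript
// Hypothetical sketch of bounded candidate selection.
function pickHealingCandidateSketch(candidates, confidenceThreshold) {
  const ranked = [...candidates].sort((a, b) => b.confidence - a.confidence);
  const [best, runnerUp] = ranked;
  if (!best || best.confidence < confidenceThreshold) {
    return { healed: false, reason: "low-confidence" };
  }
  if (runnerUp && runnerUp.confidence === best.confidence) {
    // A tie is treated as ambiguous rather than guessed.
    return { healed: false, reason: "ambiguous" };
  }
  return { healed: true, candidate: best };
}

const healingPickSample = pickHealingCandidateSketch(
  [
    { label: "Cancel", confidence: 0.4 },
    { label: "Launch preview", confidence: 0.92 },
  ],
  0.8
);
```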

Memory Config

memory.enabled: Enables generated UI inventory lookup. Defaults to true.

memory.path: Filesystem JSON inventory path. Defaults to .webtest-ai/ui-inventory.json.

memory.mode: Inventory write policy.

Supported policy values:

propose: write proposal artifacts after verified healing.
auto: write verified healing directly to memory.path.
read: intended read-only mode; current behavior reads memory and still writes proposal artifacts for verified healing.
review-required: intended review workflow using webtest-ai heal queue, pending, approve-pending, and reject-pending.
off: intended off mode; today use memory.enabled: false for equivalent behavior.

memory.maxCandidatesPerIntent: Maximum locator candidates retained per app/page/intent.

memory.staleAfterDays: Ignores memory candidates whose lastSeenAt is older than this many days.

Intent memory uses the global memory.enabled and memory.mode gates, but stores reviewed bounded plans separately from selector inventory.

intent.memoryPath: Filesystem JSON path for reviewed intent-plan memory. Defaults to .webtest-ai/intent-plans.json.

intent.memoryConfidenceThreshold: Minimum reviewed-plan confidence needed to reuse a bounded plan before calling the model. Defaults to 0.9.

Successful model-generated intent plans emit intent-plans.proposed.json in the test artifact directory when memory is enabled and not off. Review and promote those entries into intent.memoryPath before they can bypass model planning.

Models Config

models.activeProfile: Name of the single active model profile for optional model workflows. null disables model calls.

models.profiles.<name>.provider: Adapter key. Supported local/open-compatible keys include ollama, openai-compatible, vllm, lmstudio, llama.cpp, and openrouter-compatible. Legacy flat provider values are accepted temporarily for compatibility, but new config should use profiles.

models.profiles.<name>.model: Model name sent to the provider.

models.profiles.<name>.endpoint: Provider endpoint. ollama uses <endpoint>/api/chat; OpenAI-compatible adapters use <endpoint>/v1/chat/completions unless the endpoint already ends with /chat/completions.
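The endpoint rule can be sketched as a small URL builder. The function name is hypothetical, and trimming trailing slashes before appending the path is an assumption.

```javascript
// Hypothetical sketch of the documented chat URL rule.
function chatUrlSketch(provider, endpoint) {
  // Assumption: trailing slashes are trimmed so paths join cleanly.
  const base = endpoint.replace(/\/+$/, "");
  if (provider === "ollama") return `${base}/api/chat`;
  // OpenAI-compatible adapters: keep a full chat-completions endpoint as-is.
  if (base.endsWith("/chat/completions")) return base;
  return `${base}/v1/chat/completions`;
}
```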

models.profiles.<name>.apiKeyEnv: Optional environment variable that contains a bearer token.

models.profiles.<name>.capabilities: Explicit booleans for router/session requirements, including structuredJson, reasoning, toolCalling, streaming, and vision.

models.profiles.<name>.limits: Request and session bounds such as timeoutMs, retries, maxInputBytes, maxOutputTokens, and maxSessionTurns.

models.writePolicy: Roots and extensions enforced before guarded autonomous maintenance applies generated writes. The policy resolves paths against the workspace and blocks traversal or writes outside approved test-related surfaces.
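A write-policy check along those lines could be sketched with Node's path module. The function name is hypothetical, and the real policy may apply additional rules; the sketch resolves the candidate against the workspace, rejects anything that escapes it, and then checks the extension and approved roots.

```javascript
const path = require("node:path");

// Hypothetical sketch of a roots-and-extensions write check.
function isWriteAllowedSketch(workspace, candidate, policy) {
  const root = path.resolve(workspace);
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  // Block the workspace root itself and any path that escapes it.
  if (rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    return false;
  }
  if (!policy.extensions.includes(path.extname(target))) return false;
  return policy.roots.some((approved) => {
    const normalized = path.normalize(approved);
    return rel === normalized || rel.startsWith(normalized + path.sep);
  });
}

const defaultWritePolicySample = {
  roots: ["specs", "artifacts", ".webtest-ai"],
  extensions: [".md", ".json", ".js"],
};
```

Because the path is resolved before the root check, a candidate such as specs/../src/a.js is evaluated as src/a.js and rejected.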

Model-enabled paths record call metadata in run or recording results: purpose, mode, provider, model, profile, timing, message count, prompt byte size, status, and error message when applicable. Prompts, responses, API keys, and auth values are not stored in this metadata.

The sidecar only ranks bounded candidates already produced by WebTest AI. Discovery and maintenance workflows return structured proposals and apply writes only after centralized write-policy checks.

For live-provider smoke testing with Qwen, DeepSeek, Kimi, GLM4, and Yi, see Live Model Testing.

Current Runtime Defaults Not Yet Configurable

These values are implemented as hardcoded runtime defaults today and are good candidates for future config or CLI flags:

artifacts root: artifacts/
reports root: artifacts/reports/
auth storage directory: playwright/.auth/
popup detection timeout: 1500 ms
browser launch executable override for Chromium
report environment: always local

CLI API

Entrypoint:

node ./src/cli/index.js <command> [flags]

Installed package entrypoint:

webtest-ai <command> [flags]

`run`

Runs one Markdown suite file or a directory of Markdown suite files.

node ./src/cli/index.js run --suite specs/webtest-ai-demo.md --config webtest-ai.config.json

Flags:

--suite <path>: Markdown suite file or directory. Defaults to specs/.
--config <path>: JSON config path. Defaults to webtest-ai.config.json.
--browser <name>: browser override. Supported aliases are chromium, chrome, google-chrome, edge, msedge, firefox, webkit, and safari.
--target <a,b> / --targets <a,b>: run selected configured targets from config.targets.
--tags <a,b>: include tests that contain all listed tags.
--exclude-tags <a,b>: exclude tests containing any listed tag.
--workers <n>: parallel worker capacity. Defaults to 1.
--headed: launch browser headed.
--debug: enables Playwright Inspector behavior and pauses before test execution.
--refresh-auth: ignore reusable storage state and create fresh UI login state where applicable.
--auth-profile <name>: override the auth profile name used for reusable sessions.
--profile fast|ci|debug: apply a runtime profile overlay.
--trace off|on|always|retain-on-failure: override trace policy for this run. on and always both retain traces for passing and failing tests.
--screenshot off|always|only-on-failure: override final screenshot policy for this run.
--quality off: disable a11y and vitals collection for this run.
--journey off|navigation|step: override journey snapshot capture for this run.

Run output prints suite name, counts, worker count, run id, config path, JSON report path, and HTML report path.
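The --tags and --exclude-tags semantics listed above (all listed tags must be present; any excluded tag removes the test) can be sketched as follows, together with the suite-plus-test tag combination described under Test Metadata. The function names are hypothetical.

```javascript
// Suite tags and test tags are combined and deduplicated.
function effectiveTagsSketch(suiteTags = [], testTags = []) {
  return [...new Set([...suiteTags, ...testTags])];
}

// --tags requires every listed tag; --exclude-tags rejects on any listed tag.
function matchesTagFiltersSketch(tags, { include = [], exclude = [] } = {}) {
  const present = new Set(tags);
  return include.every((tag) => present.has(tag)) && !exclude.some((tag) => present.has(tag));
}

const checkoutTagsSample = effectiveTagsSketch(["public"], ["smoke", "public"]);
```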

`debug`

Shorthand for run --debug.

node ./src/cli/index.js debug --suite specs/webtest-ai-demo.md --tags public

This sets PWDEBUG=1, forces headed behavior through debug mode, and pauses with Playwright Inspector.

`heal`

Reviews, queues, approves, rejects, or publishes generated UI inventory proposals.

node ./src/cli/index.js heal list --proposal artifacts/<runId>/<testId>/ui-inventory.proposed.json

Subcommands:

list / show: print proposed inventory updates.
queue: store proposal updates in the pending review file.
pending: list queued pending updates.
approve: merge proposal updates into configured inventory.
approve-pending: promote queued pending updates into inventory.
reject-pending: remove queued pending updates.
publish-pr: write a local PR metadata stub only.

Flags:

--proposal <path>: proposal JSON path. Required except for pending, approve-pending, and reject-pending.
--config <path>: config used to resolve memory.path.
--memory <path>: override memory.path for this command.
--app <name>: fallback app key for older proposal files.
--index <n>: select one pending update.
--all: select all pending updates.
--patch <path>: optional patch markdown path for publish-pr.
--title <title>: optional PR stub title.
--dry-run: preview approve or approve-pending without writing inventory or consuming pending review items.

`intent-memory`

Reviews and approves bounded intent-plan proposals generated by successful intent-mode runs.

node ./src/cli/index.js intent-memory list --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json
node ./src/cli/index.js intent-memory approve --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json --dry-run

Subcommands:

list / show: print proposed intent-plan updates.
approve: merge reviewed proposal updates into intent.memoryPath.

Flags:

--proposal <path>: proposal JSON path.
--config <path>: config used to resolve intent.memoryPath.
--memory <path>: override intent.memoryPath for this command.
--dry-run: preview approve without writing intent memory.

`discover`

Runs a bounded model discovery workflow and emits candidate test-flow proposals. It uses the active model profile and does not affect deterministic run pass/fail truth.

node ./src/cli/index.js discover --url https://app.example --dry-run
node ./src/cli/index.js discover --url https://app.example --output artifacts/discovery/app.json
node ./src/cli/index.js discover --target brand-a-en --target brand-b-fr --dry-run
node ./src/cli/index.js discover --targets brand-a-en,brand-b-fr --dry-run
node ./src/cli/index.js discover --target all --explore --max-pages 5 --inventory-output artifacts/discovery/ui-inventory.proposed.json --spec-output artifacts/discovery/discovered.md
node ./src/cli/index.js discover --target all --explore --auth-profile ci-user --refresh-auth --dry-run

Flags:

--url <seed-url> / --seed <seed-url>: seed URL. May be repeated.
--target <name|all> / --targets <name,name|all>: expand configured targets into seed URLs and pass target brand, locale, and intentAliases into discovery.
--explore: open seed URLs with the active driver, crawl same-origin links within budget, and include bounded page evidence in discovery.
--max-pages <n>: maximum pages to visit during browser-backed exploration. Defaults to 5.
--max-depth <n>: same-origin link depth for browser-backed exploration. Defaults to 1.
--browser <name>: browser engine or channel for browser-backed exploration.
--auth-profile <name>: prepare reusable/API auth state for browser-backed exploration and use the named profile.
--auth-mode <mode>: override configured auth mode for browser-backed exploration.
--refresh-auth: ignore reusable auth state and create fresh state where applicable.
--config <path>: JSON config path. Defaults to webtest-ai.config.json.
--max-turns <n>: override the active profile session turn limit.
--output <path>: write the normalized discovery proposal JSON.
--inventory-output <path>: write proposed UI inventory updates when discovery returns them.
--spec-output <path>: write proposed flows as reviewable Markdown.
--dry-run: print the proposal summary without writing an artifact.

When memory is enabled, discovery reads the configured UI inventory path and includes it as reviewed context for proposed flows. With --explore, discovery also includes live browser evidence: URL patterns, page fingerprints, accessibility/action candidates, and same-origin links. Auth flags prepare browser storage state before exploration so authenticated pages can be inventoried without exposing credentials to the model. Discovery still returns proposals only; it does not decide deterministic run pass/fail truth, and UI inventory changes are written only as reviewable proposal artifacts.

`record`

Records a manual browser flow and generates a Markdown spec plus recording artifacts. Markdown is the source-of-truth contract.

node ./src/cli/index.js record --url https://app.example --output specs/recorded-flow.md

Flags:

--url <url>: URL to open and record. Required.
--output <path>: Markdown spec output path. Defaults to specs/recording-<timestamp>.md.
--emit-starter: also write an optional Playwright starter test for migration/debugging.
--starter <path>: starter-test output path. Passing this flag implies --emit-starter.
--event-log <path>: redacted event-log output path.
--api-manifest <path>: redacted API/network manifest output path.
--inventory-proposal <path>: UI inventory proposal output path.
--name <name>: generated test name.
--tags <a,b>: generated suite/test tags.
--headless: run browser headless for scripted capture. Recorder defaults to headed.
--force: overwrite generated Markdown and optional starter files.
--save-auth: save storageState after recording.
--auth-profile <name>: profile name for saved auth state. Defaults to recorded-user.
--record-auth-flow: include login/MFA-like steps as env placeholders. Without this, recorder trims obvious username/password/MFA segments and prefers saved-session reuse when --save-auth is used.
--capture-network manifest|full: writes a redacted API manifest. full currently adds a privacy warning; request/response bodies are not stored by the MVP.
--smart: optionally clean up goal, tags, steps, and expected assertions with one model call when model config is enabled. Deterministic output remains the fallback.

Recorder outputs:

Markdown spec
optional Playwright starter test when --emit-starter or --starter is used
redacted event log
redacted API manifest with Wait for network suggestions
reviewable ui-inventory.proposed.json selector-memory proposal
optional saved auth state when --save-auth is used

Interactive auth/MFA policy:

The user enters live MFA codes in the headed browser during recording.
Passwords, OTPs, tokens, and similar values are redacted from event logs.
Generated login-flow specs use env placeholders such as WEBTEST_AI_PASSWORD and WEBTEST_AI_MFA_CODE.
Post-login business-flow specs should use saved storageState reuse instead of replaying MFA.

`report`

Exports WebTest AI report data for external integrations. The command is vendor-neutral and does not call Jira, Slack, MCP servers, or other remote services.

node ./src/cli/index.js report export --report artifacts/reports/<runId>.json
node ./src/cli/index.js report summary --report artifacts/reports/<runId>.json --format markdown

Flags:

--report <path>: source WebTest AI JSON report. Required.
--output <path>: export path. Defaults to a path beside the source report.
--format json|ndjson: output format for report export. Defaults to json.
--format markdown|text|json: output format for report summary. Defaults to markdown.
--stdout: write export content to stdout instead of a file.
--force: overwrite an existing export file.

Exported JSON shape:

version
kind: "webtest-ai.integration-report"
source: source report metadata
run: run metadata, filters, scheduling, browser, and driver
summary: status counts and run success boolean
tests: compact per-test records with status, tags, timing, failure reason, quality findings, model-call metadata, counts, and artifact paths

Use this output to build project-specific integrations, CI annotations, dashboards, MCP resources/tools, Jira tickets, Slack messages, or any other workflow outside WebTest AI core.

CI summary output:

kind: "webtest-ai.ci-summary" for JSON summaries
status counts and success boolean
failed, blocked, and healed tests that need attention
first failed step and artifact paths when available

`mcp`

Starts the WebTest AI MCP server over stdio.

node ./src/cli/index.js mcp

Flags:

--config <path>: config used by MCP tools that need WebTest AI runtime policy.

The MCP server exposes WebTest AI data and operations. It does not call vendor services or publish to external systems.

The browser-session MCP tools use Playwright Chromium directly because they are explicit debug and recording helpers. They are separate from deterministic webtest-ai.run.suite execution, which still goes through the configured driver boundary and owns pass/fail truth.

Resources:

webtest-ai://reports/latest
webtest-ai://reports/latest/integration
webtest-ai://reports/{name}
webtest-ai://reports/{name}/integration
webtest-ai://specs
webtest-ai://specs/{path}
webtest-ai://artifacts/{path}
webtest-ai://schemas/integration-report-v1

Tools:

webtest-ai.run.suite
webtest-ai.report.list
webtest-ai.report.summary
webtest-ai.report.export
webtest-ai.report.ci_summary
webtest-ai.spec.parse
webtest-ai.browser.start_session
webtest-ai.browser.click
webtest-ai.browser.fill
webtest-ai.browser.assert_text
webtest-ai.browser.snapshot
webtest-ai.browser.screenshot
webtest-ai.browser.to_spec
webtest-ai.browser.stop_session

Browser MCP sessions are for explicit debug, recording, and agent-assisted exploration. webtest-ai.browser.snapshot returns accessibility-tree candidates with session-local ax* refs when the browser exposes an a11y snapshot. webtest-ai.browser.click can target those refs with ref, or continue to use selector, role/name, or text inputs. Normal CI pass/fail truth still belongs to deterministic WebTest AI runs.

webtest-ai.run.suite runs a Markdown suite through the same deterministic runner used by the CLI and writes normal JSON/HTML reports. Runner progress logs are sent to stderr so MCP stdout remains valid protocol output.

Reserved Commands

No SaaS-specific publisher commands are built into core.

Programmatic recorder APIs exist for recordFlow(), generateSpec(), and normalized recorder output. Report export APIs exist for buildIntegrationExport(), writeIntegrationExport(), and publishResults(). publishResults() returns the same vendor-neutral export payload and does not publish to an external service.

Recommended Future CLI Overrides

These flags are not implemented yet, but they follow the natural split between spec content and runtime operation:

--base-url <url>
--env <name>
--timeout <ms>
--navigation-timeout <ms>
--report-dir <path>
--artifacts-dir <path>
--healing on|off
--memory-mode propose|auto|read|off
--run-id <id>

Markdown Suite API

One Markdown file is one suite. YAML front matter defines suite metadata. Each top-level # heading defines one test.

Suite Front Matter

Current supported fields:

suite: Suite name used for run ids and reports. Defaults to unnamed-suite.

app: App key used by generated UI inventory. If omitted, WebTest AI derives an app key from baseUrl or suite name.

baseUrl: Base URL used by relative Open steps and UI login. Currently required for relative paths. This is a strong candidate to move to config/CLI.

tags: Tags inherited by every test in the suite.

defaults: Parsed and copied into each normalized test, but not heavily used by runtime yet.

hooks: Suite-level hooks. hooks.beforeEach and hooks.afterEach are prepended/appended to each normalized test when present.

auth: Suite-level auth policy. Test metadata can override it. This is a strong candidate to move to config auth profiles.

mcpProfile: Reserved integration profile. Defaults to default.

browsers: Browser list. Current runtime uses only the first suite browser.

modelMode: When set to intent, ## Goal is treated as executable and planned by the bounded model sidecar before ## Expected runs.

targets: Optional list of configured target names to run for the suite.

Test Metadata

Metadata appears between a test heading and the first ## section.

Supported fields:

tags: Test tags. Suite tags and test tags are combined and deduplicated.

priority: Reporting metadata. Defaults to normal.

owner: Reporting metadata. Defaults to null.

auth: Per-test auth override merged over suite auth.

browsers: Parsed into normalized test objects, but current runtime does not launch per-test browsers.

modelMode: intent enables executable goal planning for this test. Test metadata overrides suite modelMode.

targets: Optional list of configured target names for this test. Test metadata overrides suite targets unless CLI --target is supplied.

retryPolicy: Parsed and enforced by runtime. A test-level value overrides defaults.retryPolicy, and both override execution.retries from config.
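
The override rules for suite front matter and test metadata can be summarized in one function. This is a sketch of the documented precedence only, not the runtime's code. The function name, the cli.targets property, the shallow auth merge, and reading defaults.retryPolicy from the suite front matter are assumptions.

```javascript
// Resolve effective metadata for one test from suite, test, CLI, and config inputs.
// All object shapes here are hypothetical; only the precedence order comes from the docs.
function resolveTestMetadata(suite, test, cli = {}, config = {}) {
  // Suite tags and test tags are combined and deduplicated.
  const tags = [...new Set([...(suite.tags || []), ...(test.tags || [])])];

  // Test modelMode overrides suite modelMode.
  const modelMode = test.modelMode ?? suite.modelMode ?? null;

  // CLI --target wins, then test targets, then suite targets.
  const targets = cli.targets ?? test.targets ?? suite.targets ?? [];

  // Per-test auth is merged over suite auth (shallow merge assumed).
  const auth = { ...(suite.auth || {}), ...(test.auth || {}) };

  // Test retryPolicy, then defaults.retryPolicy, then execution.retries from config.
  const retry =
    test.retryPolicy ??
    suite.defaults?.retryPolicy ??
    config.execution?.retries ??
    0;

  return { tags, modelMode, targets, auth, retry };
}
```

A test that declares only tags therefore inherits the suite's modelMode, targets, and auth unchanged.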

Sections

Supported section headings:

## Goal
## Before
## BeforeEach
## Steps
## Expected
## AfterEach
## After
## Data
## Notes

Steps, Expected, hooks, and Notes become lists. Goal becomes one text string. Data becomes key/value entries split on :.

Hook sections are supported for setup and cleanup:

## Before: runs before the test's main steps.
## BeforeEach: normally declared at suite level in front matter as hooks.beforeEach; can also be written as a test section when needed.
## AfterEach: normally declared at suite level in front matter as hooks.afterEach; can also be written as a test section when needed.
## After: runs after expected assertions.

Hooks use the same step language as ## Steps. Keep hooks short and visible; business behavior should stay in ## Steps and ## Expected.

## Steps and ## Expected may use numbered/bulleted lists or a one-column Markdown table with a Step header:

| Step |
| --- |
| Open "/" |
| Click role "button" named "Run" |
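
A one-column step table can be reduced to the same list of step strings that a numbered list would produce. The sketch below shows one way to do that. parseStepTable is a hypothetical helper, and the case-sensitive Step header check and the lack of support for escaped pipes are simplifying assumptions.

```javascript
// Extract step strings from a one-column Markdown table with a "Step" header.
function parseStepTable(markdown) {
  const rows = markdown
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("|") && line.endsWith("|"))
    .map((line) => line.slice(1, -1).trim()); // drop the outer pipes

  const [header, separator, ...cells] = rows;
  if (header !== "Step" || !/^:?-{3,}:?$/.test(separator || "")) {
    throw new Error('Expected a one-column table with a "Step" header');
  }
  return cells;
}

const steps = parseStepTable(
  ['| Step |', '| --- |', '| Open "/" |', '| Click role "button" named "Run" |'].join("\n")
);
console.log(steps);
```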

## Data may use key/value lines or a Markdown table. Use name: env ENV_VAR to approve an environment-backed value for intent-mode filling without putting the secret in Markdown. CSV files are not supported yet; if they are added later, they should be an explicit data-file expansion feature, not a replacement for the Markdown contract.
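
The key/value form of ## Data, including env references, can be illustrated with a small parser. This is a sketch under stated assumptions: lines are split on the first colon only (so values such as URLs keep their own colons), and the helper names and the { kind, ... } entry shape are hypothetical.

```javascript
// Parse "key: value" lines; "key: env NAME" becomes an environment reference.
function parseDataLines(lines) {
  const data = {};
  for (const line of lines) {
    const index = line.indexOf(":"); // assumption: split on the first colon
    if (index === -1) continue; // not a key/value line
    const key = line.slice(0, index).trim();
    const raw = line.slice(index + 1).trim();
    const envMatch = /^env\s+([A-Za-z_][A-Za-z0-9_]*)$/.exec(raw);
    data[key] = envMatch
      ? { kind: "env", name: envMatch[1] }
      : { kind: "literal", value: raw };
  }
  return data;
}

// Resolve an entry at fill time so the secret never appears in the Markdown file.
function resolveDataValue(entry, env = process.env) {
  if (entry.kind === "literal") return entry.value;
  if (!(entry.name in env)) {
    throw new Error(`Missing environment variable ${entry.name}`);
  }
  return env[entry.name];
}
```

Resolving late, at the point of use, keeps the parsed suite free of secret values, which matters if normalized tests are ever logged or written to reports.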

Step Language API

The interpreter is rule-based. Unsupported text fails with Unsupported step.

Supported step forms:

Open "<url-or-path>"
Click action "<intent>"
Click element "<name>"
Click text "<text>"
Click role "<role>" named "<name>"
Click frame role "<role>" named "<name>"
Click frame "<frame-name-or-url-part>" role "<role>" named "<name>"
Fill field "<label>" with "<value>"
Fill field "<label>" with env "<ENV_VAR>"
Fill intent "<field-intent>" with data "<data-key>"
Fill intent "<field-intent>" with env "<ENV_VAR>"
Fill frame field "<label>" with "<value>"
Fill frame field "<label>" with env "<ENV_VAR>"
Submit form
Submit form with button "<name>"
Assert text "<text>"
Assert outcome "<semantic-outcome>"
Assert frame text "<text>"
Assert frame "<frame-name-or-url-part>" text "<text>"
Assert url "<url>"
Wait for url "<url>"
Wait for network "<url-part>"
Wait for network "<url-part>" status <status>
Switch page "<alias-title-or-url-part>"
Capture screenshot
Capture screenshot as "<fileName>"
Capture visual checkpoint "<name>"
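
A rule-based interpreter of this kind can be pictured as an ordered list of patterns, each producing a structured action, with a hard failure when nothing matches. The sketch below covers a handful of the step forms above. It is not the actual interpreter: the rule table, the action object shapes, and the lack of support for escaped quotes are all assumptions.

```javascript
// Each rule pairs a pattern for one step form with a builder for a structured action.
const STEP_RULES = [
  { pattern: /^Open "([^"]*)"$/, build: ([url]) => ({ action: "open", url }) },
  {
    pattern: /^Click role "([^"]*)" named "([^"]*)"$/,
    build: ([role, name]) => ({ action: "click", by: "role", role, name })
  },
  { pattern: /^Click text "([^"]*)"$/, build: ([text]) => ({ action: "click", by: "text", text }) },
  {
    pattern: /^Fill field "([^"]*)" with env "([^"]*)"$/,
    build: ([label, envVar]) => ({ action: "fill", label, source: "env", envVar })
  },
  {
    pattern: /^Fill field "([^"]*)" with "([^"]*)"$/,
    build: ([label, value]) => ({ action: "fill", label, source: "literal", value })
  },
  { pattern: /^Assert text "([^"]*)"$/, build: ([text]) => ({ action: "assertText", text }) },
  {
    pattern: /^Wait for network "([^"]*)"(?: status (\d+))?$/,
    build: ([urlPart, status]) => ({
      action: "waitForNetwork",
      urlPart,
      status: status ? Number(status) : null
    })
  },
  { pattern: /^Submit form$/, build: () => ({ action: "submit" }) }
];

// Return the action for a step, or fail with "Unsupported step".
function interpretStep(text) {
  const step = text.trim();
  for (const rule of STEP_RULES) {
    const match = rule.pattern.exec(step);
    if (match) return rule.build(match.slice(1));
  }
  throw new Error(`Unsupported step: ${step}`);
}
```

Failing on unknown text, rather than guessing, is what keeps deterministic runs deterministic: a typo in a spec surfaces as an error instead of an unintended action.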

Current locator behavior:

Click role resolves through the driver accessibility-tree surface first when available, then falls back to Playwright getByRole(role, { name, exact: false }).
Click text resolves through the driver accessibility-tree surface first when available, then falls back to getByText(text, { exact: false }).first().
Fill field uses getByLabel(label, { exact: false }).
Fill intent fills only approved values from ## Data or env refs and resolves the field intent through the driver/accessibility surface when exact label matching is not enough.
Click element resolves through the driver accessibility-tree surface first when available, then tries button role, link role, label, placeholder, then text.
Assert outcome asks the configured model profile to evaluate a human-written semantic outcome against bounded page evidence. The step passes only when the model returns a matching result with confidence above intent.outcomeConfidenceThreshold. Assert text remains the exact text check.
Submit form calls requestSubmit() on the first form unless a button name is supplied.
Frame steps choose the first non-main frame unless a frame name or URL part is supplied.
Switch page changes the active page after a popup or additional page exists. Drivers match stable aliases such as main, page-1, and popup-1, then exact page titles, then a unique URL substring; ambiguous or missing matches fail the step.
When a click opens a popup and the popup is detected within 1500 ms, the driver sets it as the active page.
Visual checkpoints capture a full-page screenshot, create a missing baseline, and compare later runs with exact image hash matching.
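
The Switch page matching order lends itself to a tiered lookup. The following sketch assumes each open page is described by a hypothetical { alias, title, url } record and treats more than one match in any tier as ambiguous; the real drivers may differ in detail.

```javascript
// Resolve a Switch page query: alias first, then exact title, then URL substring.
function resolvePage(pages, query) {
  const tiers = [
    (page) => page.alias === query,
    (page) => page.title === query,
    (page) => page.url.includes(query)
  ];

  for (const matches of tiers) {
    const found = pages.filter(matches);
    if (found.length === 1) return found[0];
    if (found.length > 1) {
      throw new Error(`Ambiguous page match for "${query}"`);
    }
  }
  throw new Error(`No page matches "${query}"`);
}

const openPages = [
  { alias: "main", title: "Dashboard", url: "https://app.example.com/dashboard" },
  { alias: "popup-1", title: "Invoice", url: "https://app.example.com/invoices/42" }
];

console.log(resolvePage(openPages, "popup-1").title);
```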

Visual Checkpoint Config

visual.baselineDir: Directory for visual checkpoint baselines. Defaults to visual-baselines.

visual.updateBaselines: When true, visual checkpoint actions overwrite existing baselines with the current screenshot.

visual.mode: fail or warn. Defaults to fail. In warn mode, changed visual checkpoints write diff metadata without failing the step.

Visual checkpoint artifacts:

actual screenshot: artifacts/<runId>/<testId>/visual-<name>.png
diff metadata: artifacts/<runId>/<testId>/visual-<name>.diff.json
baseline: <visual.baselineDir>/<suite>/<testId>/<name>.png
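
The baseline lifecycle (create when missing, overwrite on update, otherwise compare by exact hash) can be sketched with Node's built-in modules. The helper names and the returned status strings are hypothetical, and SHA-256 is an assumption; the document only states that matching is by exact image hash.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

const sha256 = (buffer) => crypto.createHash("sha256").update(buffer).digest("hex");

// Baseline location following the documented layout: <baselineDir>/<suite>/<testId>/<name>.png
function baselinePathFor(visual, suite, testId, name) {
  return path.join(visual.baselineDir || "visual-baselines", suite, testId, `${name}.png`);
}

// Compare a screenshot buffer against its baseline, honoring updateBaselines and mode.
function compareCheckpoint(screenshot, baselinePath, visual = {}) {
  const { updateBaselines = false, mode = "fail" } = visual;

  if (updateBaselines || !fs.existsSync(baselinePath)) {
    fs.mkdirSync(path.dirname(baselinePath), { recursive: true });
    fs.writeFileSync(baselinePath, screenshot);
    return { status: "baseline-written", failed: false };
  }

  const expected = sha256(fs.readFileSync(baselinePath));
  const actual = sha256(screenshot);
  if (expected === actual) return { status: "match", failed: false };

  // In warn mode the change is reported but does not fail the step.
  return { status: "changed", failed: mode === "fail", expected, actual };
}
```

Exact hashing means any pixel difference counts as a change, so checkpoints work best on pages with stable rendering (no timestamps, animations, or random content).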

Auth API

Current auth metadata fields:

mode: commonly ui, reuse, or api.
profile: storage-state profile name.
saveStorageState: when true, passing UI auth tests persist storage state.
usernameEnv: username environment variable.
passwordEnv: password environment variable.
mfaCodeEnv: MFA code environment variable.
otp: optional OTP provider config. Supported providers are totp-secret and static-env.
validationPath / validationUrl: reusable-session validation target.
validate: set false to skip reusable-session validation.
apiLoginPath / apiLoginUrl: API endpoint used by mode: api.
method: API login method, defaulting to POST.
bodyFormat / contentType: json or form, defaulting to json.
usernameField: API login username field, defaulting to username.
passwordField: API login password field, defaulting to password.
mfaCodeField: API login MFA field, defaulting to mfaCode.
extraFields / body: extra API login request fields.
headers: extra API login request headers.
successStatus / successStatuses: accepted API login response status codes.
requireCookie: set false to allow API login responses without Set-Cookie.
loginMode: same-host or idp.
provider: idp enables IDP login behavior.
loginPath: login page path.
successPath: URL waited for after UI login.
successText: text waited for after UI login.
usernameLabel: username field label.
passwordLabel: password field label.
submitName: login submit button name.
mfaLabel: MFA input label.
mfaSubmitName: MFA submit button name.
mfaTimeout: how long to look for the MFA field.
idpStartName: IDP start link name, or false to skip that click.

OTP examples:

auth:
  mode: reuse
  profile: ci-user
  usernameEnv: APP_USERNAME
  passwordEnv: APP_PASSWORD
  otp:
    provider: totp-secret
    secretEnv: APP_TOTP_SECRET
    digits: 6
    period: 30

auth:
  otp:
    provider: static-env
    codeEnv: APP_MFA_CODE

mfaCodeEnv remains supported for compatibility and behaves like static-env.
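
For readers unfamiliar with what digits and period control, the sketch below generates a time-based one-time password following RFC 6238 with HMAC-SHA1. It is an illustration, not the totp-secret provider's code: the totp function is hypothetical, and it takes the raw secret bytes directly, whereas shared secrets are commonly distributed base32-encoded. How WebTest AI decodes the value of secretEnv is not stated here.

```javascript
const crypto = require("node:crypto");

// RFC 6238 TOTP over HMAC-SHA1. `secret` is a Buffer of raw key bytes.
function totp(secret, { digits = 6, period = 30, now = Date.now() } = {}) {
  // Number of whole periods since the Unix epoch.
  const counter = Math.floor(now / 1000 / period);
  const counterBytes = Buffer.alloc(8);
  counterBytes.writeBigUInt64BE(BigInt(counter));

  const hmac = crypto.createHmac("sha1", secret).update(counterBytes).digest();

  // Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
  const offset = hmac[hmac.length - 1] & 0x0f;
  const binary = hmac.readUInt32BE(offset) & 0x7fffffff;

  return String(binary % 10 ** digits).padStart(digits, "0");
}

// RFC test secret at T = 59 seconds.
console.log(totp(Buffer.from("12345678901234567890"), { now: 59000 })); // prints "287082"
```

A code is only valid for the current period, so clock skew between the CI machine and the identity provider is the usual cause of intermittent MFA failures.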

Current storage state path:

playwright/.auth/<profile>.json

Recommended direction:

Keep auth flow definitions in config.
Let specs reference a logical profile or auth requirement.
Use CLI for one-off overrides such as --auth-profile and --refresh-auth.

API login creates the same browser-ready storage state shape as UI login. The default endpoint is /api/login, and cookies returned by Set-Cookie are written to playwright/.auth/<profile>.json. A driver must advertise the authState capability to consume that state during execution. The CDP driver can also save storage state from the active page, including matching-origin cookies, localStorage, and sessionStorage.

Reusable session validation opens the saved storage state in a browser context when the active driver exposes a native browser for validation, and treats an OK response from validationPath or validationUrl as a valid session. CDP execution can apply existing reusable state directly; validation is skipped when no native validation browser is available. The endpoint must return a non-OK response for anonymous users; otherwise WebTest AI cannot distinguish a real session from public access.
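
The validation rule (an OK response means the session is valid) can be illustrated without a browser. The sketch below swaps the browser context for a plain HTTP request carrying the saved cookies, so it is a simplification of the behavior described above rather than the runtime's approach. The helper names are hypothetical, the storage state is assumed to follow Playwright's cookies array shape, and cookie path, expiry, and secure attributes are ignored.

```javascript
// Build a Cookie header from the cookies whose domain matches the target host.
function cookieHeaderFor(storageState, url) {
  const { hostname } = new URL(url);
  return (storageState.cookies || [])
    .filter((cookie) => {
      const domain = cookie.domain.replace(/^\./, "");
      return hostname === domain || hostname.endsWith(`.${domain}`);
    })
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}

// A session counts as valid only when the validation endpoint answers with an OK status.
// fetchImpl is injectable so the check can be exercised without a network.
async function isSessionValid(storageState, validationUrl, fetchImpl = fetch) {
  const response = await fetchImpl(validationUrl, {
    headers: { cookie: cookieHeaderFor(storageState, validationUrl) },
    redirect: "manual" // a redirect to a login page must not count as OK
  });
  return response.ok;
}
```

This also shows why the endpoint must reject anonymous users: if it answers OK without cookies, the check returns true for a stale or empty state.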

Report And Artifact API

Per run:

artifacts/reports/<runId>.json
artifacts/reports/<suite>-latest.json
artifacts/reports/<runId>.html
artifacts/reports/<suite>-latest.html

Per test:

artifacts/<runId>/<testId>/final.png
artifacts/<runId>/<testId>/trace.zip
artifacts/<runId>/<testId>/trace/
optional checkpoint screenshots
optional journey screenshots
optional healing snapshots
optional UI inventory proposal and patch files
optional reviewed spec patch file

Result status values:

passed
failed
healed-passed
skipped
blocked

Important report fields:

runId
suite
suitePath
generatedAt
browser
driver
driverCapabilities
workerCount
scheduling
configPath
filters
results

Important per-test result fields:

testId
name
status
browser
browserEngine
browserChannel
browserVersion
driverName
driverCapabilities
environment
tags
priority
owner
workerId
serial
authMode
authProfile
authState
modelMode
modelCalls
target
intentPlan
semanticAssertions
healing
journey
durationMs
assertions
stepResults
firstFailedStep
networkSummary
consoleErrors
a11y
vitals
qualityFailures
qualityWarnings
artifacts
summary

The HTML report renders suite-level driver metadata, capability tags, per-test owner/priority, target metadata, generated intent plans, semantic assertion evidence, journey snapshots, artifact availability, healing traces, and skip/block reasons. This makes adapter support and degraded execution visible without opening raw JSON.
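
The JSON report is convenient for custom gates and dashboards. As a minimal sketch, the function below counts results per status and lists the tests that need attention. summarizeReport is a hypothetical helper, treating blocked as failing is an assumption, and firstFailedStep is passed through untouched because its shape is not described here.

```javascript
const KNOWN_STATUSES = ["passed", "failed", "healed-passed", "skipped", "blocked"];

// Count results per documented status and collect failing tests.
function summarizeReport(report) {
  const results = report.results || [];
  const counts = Object.fromEntries(KNOWN_STATUSES.map((status) => [status, 0]));

  for (const result of results) {
    if (!(result.status in counts)) {
      throw new Error(`Unknown status: ${result.status}`);
    }
    counts[result.status] += 1;
  }

  const failing = results
    .filter((result) => result.status === "failed" || result.status === "blocked")
    .map((result) => ({
      testId: result.testId,
      status: result.status,
      firstFailedStep: result.firstFailedStep ?? null
    }));

  return { runId: report.runId, counts, failing };
}
```

In practice the input would come from JSON.parse on artifacts/reports/<suite>-latest.json.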

Manual Feature Validation Matrix

Use these checks when validating a local platform pass:

Recorder capture: run node ./src/cli/index.js record --url http://127.0.0.1:4010 --name "Manual recorded smoke" --tags recorded,smoke --output specs/manual-recorded-smoke.md --force, interact in the opened headed browser, then confirm the Markdown includes clicks/inputs/page opens and no .playwright.js is created.
Optional starter: rerun recorder with --emit-starter or --starter specs/manual-recorded-smoke.playwright.js and confirm Starter Test: prints only then.
Dashboard journey: run a suite with --journey navigation or --profile debug, open artifacts/reports/<suite>-latest.html, and confirm User Journey, owner, priority, and artifact policy are visible.
Fast profile: run node ./src/cli/index.js run --suite <suite.md> --profile fast; confirm traces are off, quality is skipped, and screenshots are retained only on failure.
Visual checkpoint: run a suite containing Capture visual checkpoint "Hero" twice; confirm the first run creates a baseline under visual-baselines/ and the second run matches or reports a diff according to visual.mode.
Auth reuse validation: use auth.mode: reuse with validationPath: /api/session; confirm a saved valid state is reused and a stale state falls back to login or fails clearly.
Hooks and tables: parse/run a spec with ## Before, ## Expected, and table-form ## Steps; confirm hook phases appear in stepResults.
Healing review: run node ./src/cli/index.js heal approve --proposal examples/healing-review-proposal.json --dry-run --memory /tmp/webtest-ai-ui-inventory.json; confirm it prints Would Apply: 1 and does not write inventory.
Intent memory review: run node ./src/cli/index.js intent-memory approve --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json --dry-run --memory /tmp/webtest-ai-intent-plans.json; confirm it prints Would Apply and does not write memory.
Command surface: run node ./src/cli/index.js --help and node ./src/cli/index.js <command> --help for run, debug, record, heal, intent-memory, discover, report, and mcp.
Published package: run npm view @asserthive/webtest-ai@beta version dist-tags bin --json and confirm the package exposes the webtest-ai binary.
Existing CDP endpoint: set WEBTEST_AI_CDP_WS_ENDPOINT=ws://... and use examples/drivers/existing-cdp-driver.config.json; if no endpoint exists, this feature cannot drive that browser.

Programmatic API

Package entrypoint:

const {
  parseSuite,
  runSuite,
  runTest,
  createBrowserDriver,
  createFakeDriver,
  getDriverConfig,
  recordFlow,
  generateSpec,
  normalizeRecordedEvents,
  renderMarkdownSpec,
  renderStarterTest,
  buildIntegrationExport,
  buildCiSummary,
  exportResults,
  renderCiSummary,
  writeCiSummary,
  writeIntegrationExport,
  runIntentPlanner,
  assertSemanticOutcome,
  findIntentPlanMemory,
  loadIntentMemory,
  writeIntentMemoryProposal,
  createMcpServer,
  publishResults
} = require("@asserthive/webtest-ai");

`parseSuite(filePath)`

Reads a Markdown suite and returns a normalized suite object.

const suite = await parseSuite("specs/webtest-ai-demo.md");

Returned shape:

{
  filePath,
  suite,
  app,
  baseUrl,
  tags,
  defaults,
  auth,
  mcpProfile,
  browsers,
  tests
}

`runSuite(parsedSuite, options)`

Runs a parsed suite through the configured browser driver and writes JSON/HTML reports. Playwright is the default driver.

Common options:

{
  headless: true,
  debug: false,
  workers: 1,
  includeTags: [],
  excludeTags: [],
  refreshAuth: false,
  authProfile: null,
  suitePath: "/absolute/path/to/spec.md",
  config: {},
  configPath: "/absolute/path/to/webtest-ai.config.json"
}

Returns:

{
  runId,
  suite,
  workerCount,
  configPath,
  reportPath,
  htmlReportPath,
  results
}
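
Putting parseSuite() and runSuite() together, a small wrapper can run one spec and report whether anything failed. This is a sketch, not a packaged helper: runSpec is hypothetical, the api object is passed in so the wrapper can be exercised with a stub, and counting blocked as failing is an assumption.

```javascript
const path = require("node:path");

// Parse and run one Markdown suite, then summarize the outcome.
// `api` is expected to expose parseSuite and runSuite as documented above.
async function runSpec(api, specPath, options = {}) {
  const suitePath = path.resolve(specPath);
  const parsed = await api.parseSuite(suitePath);

  const run = await api.runSuite(parsed, {
    headless: true,
    workers: 1,
    suitePath,
    ...options // caller-supplied options override the defaults above
  });

  const failing = run.results.filter(
    (result) => result.status === "failed" || result.status === "blocked"
  );

  return {
    runId: run.runId,
    reportPath: run.reportPath,
    ok: failing.length === 0,
    failing
  };
}
```

With the package installed, the call would look like runSpec(require("@asserthive/webtest-ai"), "specs/webtest-ai-demo.md"), and a CI script could set process.exitCode = 1 when ok is false.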

`runTest(test, options)`

Runs one normalized test. This is lower-level than runSuite() and expects either a driver object or a launched Playwright browser plus runtime options.

Important options:

driver
browser
browserName
browserEngine
browserChannel
browserVersion
runId
workerId
sessionCache
suite
includeTags
excludeTags
refreshAuth
authProfile
debug
workerCount
suitePath
config

Driver helpers:

createBrowserDriver(options): resolves and launches the configured driver.
createFakeDriver(options): creates an in-memory driver for tests and adapter-contract checks.
getDriverConfig(config): returns the normalized driver config used by runtime driver resolution.
tests/support/driverContract.js: reusable local contract helper for checking that a driver can execute the core open/fill/click/assert/screenshot flow.

Recorder APIs

recordFlow(options): Captures a browser flow and writes the same Markdown/event/API/inventory artifacts as webtest-ai record.

generateSpec(recording, options): Converts normalized recording events into a Markdown spec model.

normalizeRecordedEvents(events, options): Converts raw recorder events into deterministic steps and suggestions.

renderMarkdownSpec(spec, options): Renders a generated spec model as Markdown.

renderStarterTest(spec, options): Renders the optional Playwright starter test used during migration/debugging.

Integration Export APIs

buildIntegrationExport(report, options): Converts a WebTest AI JSON report object into a compact vendor-neutral integration payload.

buildCiSummary(report, options): Converts a WebTest AI JSON report object into a compact CI summary payload.

renderCiSummary(summary, options): Renders a CI summary as Markdown, text, or JSON.

writeCiSummary(summary, outputPath, options): Writes a rendered CI summary.

writeIntegrationExport(exportPayload, outputPath, options): Writes the integration payload as JSON or NDJSON.

exportResults(reportOrPath, options): Returns the vendor-neutral integration payload and optionally writes it.

publishResults(reportOrPath, options): Compatibility alias for exportResults(). It does not call external services.

Intent And MCP APIs

runIntentPlanner(options): Builds and validates a bounded plan for executable ## Goal flows.

assertSemanticOutcome(options): Evaluates Assert outcome against bounded page evidence using the configured model profile.

findIntentPlanMemory(options): Looks up a reviewed high-confidence bounded plan before model planning.

loadIntentMemory(config): Reads the configured intent-plan memory file.

writeIntentMemoryProposal(options): Writes reviewable intent-plan memory proposal artifacts.

createMcpServer(options): Creates the WebTest AI MCP server object for stdio hosts or direct tests.

Boundary Guidance

Keep in Markdown specs:

user-visible behavior
business flows
assertions
scenario-specific data
owner/priority/tags used for reporting and selection

Move or keep in config:

base URLs and environment names
auth profile details and secret env var names
default browser/project
execution driver and required capabilities
timeouts
artifact/report directories
trace and screenshot retention
reporting privacy
healing, memory, and model policy
scheduling policy defaults

Use CLI for:

choosing a suite
selecting tags
changing config path
changing worker count
headed/debug runs
refreshing auth
one-off runtime profile and browser overrides
future environment/base-url overrides when those flags are added