WebTest AI Config and API Reference
This document is the single reference for WebTest AI configuration, CLI commands, Markdown spec fields, step language, reports, artifacts, and programmatic APIs.
The short rule:
- Specs describe user intent and expected behavior.
- Config describes environment, runtime policy, driver selection, reporting, healing, memory, models, and auth defaults.
- CLI flags select a run and override operational choices for one invocation.
Configuration Loading
WebTest AI loads JSON config from webtest-ai.config.json by default. Pass --config <path> to use another file. Missing config files are allowed; built-in defaults are used.
Config is deep-merged over defaults. Arrays replace default arrays instead of merging element-by-element.
Implemented loader:
- src/config/loadConfig.js
- public helpers: loadConfig(configPath), getDefaultConfig()
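The merge rule described above (objects merge recursively, arrays replace wholesale) can be sketched as follows. This is a minimal illustration, not the code in src/config/loadConfig.js; the helper names isPlainObject and deepMerge are hypothetical.

```javascript
// Hypothetical helpers illustrating the documented merge rule.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    if (isPlainObject(value) && isPlainObject(defaults[key])) {
      // Nested objects are merged key by key.
      result[key] = deepMerge(defaults[key], value);
    } else {
      // Arrays and scalars (including null) replace the default outright.
      result[key] = value;
    }
  }
  return result;
}

const defaults = {
  execution: { retries: 0, timeouts: { actionTimeout: 10000, expectTimeout: 10000 } },
  models: { writePolicy: { extensions: [".md", ".json", ".js"] } },
};
const userConfig = {
  execution: { timeouts: { actionTimeout: 5000 } },
  models: { writePolicy: { extensions: [".md"] } },
};
const merged = deepMerge(defaults, userConfig);
```

In this sketch, merged.execution.timeouts keeps the default expectTimeout while actionTimeout is overridden, and the extensions array contains only ".md" because arrays are not merged element by element.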
Default shape:
{
  "browser": null,
  "execution": {
    "timeouts": {
      "testTimeout": 30000,
      "navigationTimeout": 30000,
      "actionTimeout": 10000,
      "expectTimeout": 10000,
      "requestTimeout": 15000
    },
    "retries": 0,
    "skip": {
      "tags": []
    },
    "only": {
      "enabled": false,
      "tags": []
    },
    "artifacts": {
      "trace": "retain-on-failure",
      "screenshot": "only-on-failure"
    },
    "journey": {
      "enabled": false,
      "capture": "navigation",
      "maxSnapshots": 20
    },
    "quality": {
      "a11y": {
        "mode": "warn",
        "failOnSerious": true,
        "maxSerious": 0,
        "captureArtifactsOnWarning": true
      },
      "vitals": {
        "mode": "warn",
        "lcpMaxMs": 2500,
        "clsMax": 0.1,
        "percentiles": ["p75"],
        "captureArtifactsOnWarning": true
      }
    }
  },
  "reporting": {
    "redact": {
      "enabled": true,
      "headers": ["authorization", "cookie", "set-cookie", "x-api-key"],
      "queryParams": ["token", "session", "email", "password"],
      "patterns": [
        {
          "name": "email",
          "regex": "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}",
          "flags": "gi",
          "replacement": "[REDACTED_EMAIL]"
        },
        {
          "name": "bearer",
          "regex": "Bearer\\s+[A-Za-z0-9._-]+",
          "flags": "g",
          "replacement": "Bearer [REDACTED_TOKEN]"
        }
      ]
    },
    "network": {
      "excludeUrls": []
    }
  },
  "driver": {
    "name": "playwright",
    "package": null,
    "path": null,
    "require": [],
    "options": {}
  },
  "targets": {},
  "intent": {
    "outcomeConfidenceThreshold": 0.8,
    "memoryPath": ".webtest-ai/intent-plans.json",
    "memoryConfidenceThreshold": 0.9,
    "memoryStaleAfterDays": null,
    "visionEvidence": false
  },
  "healing": {
    "enabled": false,
    "mode": "bounded",
    "confidenceThreshold": 0.8,
    "maxAttemptsPerStep": 1,
    "snapshot": {
      "maxCandidates": 20,
      "maxTextLength": 120,
      "maxNearbyTextLength": 280
    }
  },
  "memory": {
    "enabled": true,
    "path": ".webtest-ai/ui-inventory.json",
    "mode": "propose",
    "maxCandidatesPerIntent": 5,
    "staleAfterDays": 60
  },
  "visual": {
    "baselineDir": "visual-baselines",
    "updateBaselines": false,
    "mode": "fail"
  },
  "models": {
    "activeProfile": null,
    "profiles": {},
    "writePolicy": {
      "roots": ["specs", "artifacts", ".webtest-ai"],
      "extensions": [".md", ".json", ".js"]
    }
  }
}
Browser Config
- browser: Optional default browser override. Supported aliases are chromium, chrome, google-chrome, edge, msedge, firefox, webkit, and safari. Resolution order is CLI --browser, config browser, suite/test browsers, then chromium.
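The resolution order above can be read as a first-defined-wins chain. The sketch below is illustrative only: resolveBrowser and its parameter names are hypothetical, and it takes only the first suite browser, matching the note elsewhere in this reference that the current runtime uses only the first suite browser.

```javascript
// Hypothetical helper; parameter names are assumptions for illustration.
function resolveBrowser({ cliBrowser, configBrowser, suiteBrowsers = [] } = {}) {
  // First defined value wins: CLI flag, then config, then suite list, then default.
  return cliBrowser || configBrowser || suiteBrowsers[0] || "chromium";
}

const fromDefault = resolveBrowser({});
const fromConfig = resolveBrowser({ configBrowser: "firefox", suiteBrowsers: ["webkit"] });
const fromCli = resolveBrowser({ cliBrowser: "edge", configBrowser: "firefox" });
```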
Target Matrix Config
- targets: Optional named target matrix for whitelabel, environment, or locale variants. A target can define baseUrl, locale, brand, optional app, and optional intentAliases.
- targets.<name>.baseUrl: Base URL used for relative Open steps when that target is selected.
- targets.<name>.intentAliases: Optional resolver hints. Keys are canonical intents from specs or generated plans, and values are target-local labels or translated phrases. These hints help bounded intent resolution; they are not exact assertion requirements.
Select targets with suite front matter targets: [brand-a], per-test metadata targets: [brand-a], or CLI --target brand-a,brand-b. CLI target selection overrides suite/test target lists for that run.
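As a rough illustration of target selection and relative Open resolution, the sketch below uses a hypothetical targets object and two hypothetical helpers (selectTargets, resolveOpenUrl). The brand names, URLs, and alias values are made up; only the field names and the CLI-over-test-over-suite precedence come from this reference.

```javascript
// Hypothetical config fragment; values are placeholders.
const config = {
  targets: {
    "brand-a": { baseUrl: "https://brand-a.example", locale: "en-US", brand: "Brand A" },
    "brand-b": {
      baseUrl: "https://brand-b.example",
      locale: "en-GB",
      brand: "Brand B",
      intentAliases: { "open modal": "Launch preview" },
    },
  },
};

// CLI selection overrides test metadata, which overrides suite front matter.
function selectTargets({ cli, test, suite }) {
  if (cli) return cli.split(",").map((name) => name.trim()).filter(Boolean);
  if (test && test.length > 0) return test;
  return suite || [];
}

// Resolve a relative Open path against the selected target's baseUrl.
function resolveOpenUrl(targetName, relativePath) {
  return new URL(relativePath, config.targets[targetName].baseUrl).href;
}

const selected = selectTargets({ cli: "brand-a, brand-b", suite: ["brand-a"] });
const loginUrl = resolveOpenUrl("brand-a", "/login");
```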
Intent Config
- intent.outcomeConfidenceThreshold: Minimum model confidence for Assert outcome "<intent>". Defaults to 0.8.
- intent.memoryPath: Reviewed bounded-plan memory path. Defaults to .webtest-ai/intent-plans.json.
- intent.memoryConfidenceThreshold: Minimum confidence for reusing a reviewed plan before calling the model. Defaults to 0.9.
- intent.memoryStaleAfterDays: Optional review window for approved intent-plan memory. When set, webtest-ai intent-memory stale and prune-stale use lastSeenAt to list or remove reviewed plans older than this many days. Defaults to null, which disables stale-plan pruning.
- intent.visionEvidence: When true, Assert outcome captures a bounded screenshot artifact and includes it in semantic outcome evidence only if the active model profile declares capabilities.vision: true. This requires the active driver to advertise screenshots. Defaults to false; text, URL, network, alias, and a11y evidence remain the default path.
Intent mode is enabled per suite or test with modelMode: intent. In that mode, ## Goal is executable: WebTest AI collects bounded page evidence such as URL, page fingerprint, visible text, accessibility candidates, and DOM action candidates; asks the configured model profile for a bounded plan; validates that plan; converts it to supported WebTest AI actions; executes those actions through the normal driver; and then runs ## Expected assertions. The model can only choose from allowed action types and approved ## Data/environment-backed values.
Execution Selection And Timeout Config
- execution.timeouts.testTimeout: Reserved per-test timeout policy. Defaults to 30000.
- execution.timeouts.navigationTimeout: Timeout for Open, Assert url, and Wait for url actions. Defaults to 30000.
- execution.timeouts.actionTimeout: Timeout for click, fill, submit, screenshot, frame, and text assertion actions. Defaults to 10000.
- execution.timeouts.expectTimeout: Reserved assertion timeout policy. Defaults to 10000.
- execution.timeouts.requestTimeout: Timeout for Wait for network actions. Defaults to 15000.
- execution.retries: Default retry count for failed tests. Defaults to 0. Test retryPolicy overrides suite defaults.retryPolicy, and both override this config value.
- execution.skip.tags: Tags skipped in addition to the built-in skip tag.
- execution.only.enabled: Forces only-mode for the selected run. Only tests tagged only or matching execution.only.tags run.
- execution.only.tags: Additional tags treated as only-mode selectors.
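The skip and only settings can be sketched as a single predicate over a test's tags. This is an illustration under assumptions: shouldRun is a hypothetical name, and giving skip precedence over only is an assumption that this reference does not state.

```javascript
// Hypothetical predicate; skip-before-only precedence is an assumption.
function shouldRun(testTags, execution) {
  const skipTags = ["skip", ...execution.skip.tags];
  if (testTags.some((tag) => skipTags.includes(tag))) return false;

  // Outside only-mode, every non-skipped test runs.
  if (!execution.only.enabled) return true;

  const onlyTags = ["only", ...execution.only.tags];
  return testTags.some((tag) => onlyTags.includes(tag));
}

const execution = {
  skip: { tags: ["flaky"] },
  only: { enabled: true, tags: ["smoke"] },
};
const runsSmoke = shouldRun(["smoke", "public"], execution);
const runsFlaky = shouldRun(["smoke", "flaky"], execution);
const runsOther = shouldRun(["public"], execution);
```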
Reporting Config
- reporting.redact.enabled: Enables redaction in WebTest AI-owned JSON reports, HTML reports, network summaries, console errors, step output, healing traces, healing snapshots, and generated memory proposals. Defaults to true.
- reporting.redact.headers: Header names whose values are replaced with [REDACTED] when header-like blocks are rendered.
- reporting.redact.queryParams: Query parameter names whose values are replaced with [REDACTED] in URLs.
- reporting.redact.patterns: Regex replacements applied to text fields. Each entry supports name, regex, flags, and replacement.
- reporting.network.excludeUrls: URL substrings omitted from WebTest AI-owned network summaries and HTML trace preview tables.
Privacy boundary:
- Redaction applies to WebTest AI-owned outputs.
- Raw Playwright trace archives may still contain original captured data.
- Use trace retention policy later if raw artifact privacy is required.
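To make the redaction settings concrete, here is a minimal sketch that applies the default patterns and queryParams from the default config to a text field and a URL. The function names redactText and redactUrl are hypothetical, and the actual redactor may differ in details such as header handling or URL serialization.

```javascript
// Policy values copied from the default config; helper names are hypothetical.
const redactPolicy = {
  enabled: true,
  queryParams: ["token", "session", "email", "password"],
  patterns: [
    { name: "email", regex: "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}", flags: "gi", replacement: "[REDACTED_EMAIL]" },
    { name: "bearer", regex: "Bearer\\s+[A-Za-z0-9._-]+", flags: "g", replacement: "Bearer [REDACTED_TOKEN]" },
  ],
};

// Apply each configured pattern in order to a text field.
function redactText(text, policy) {
  if (!policy.enabled) return text;
  return policy.patterns.reduce(
    (out, p) => out.replace(new RegExp(p.regex, p.flags), p.replacement),
    text
  );
}

// Replace the values of configured query parameters in a URL.
function redactUrl(rawUrl, policy) {
  if (!policy.enabled) return rawUrl;
  const url = new URL(rawUrl);
  for (const name of policy.queryParams) {
    if (url.searchParams.has(name)) url.searchParams.set(name, "[REDACTED]");
  }
  return url.href;
}

const safeText = redactText("mail bob@example.com with Bearer abc.DEF-123", redactPolicy);
const safeUrl = redactUrl("https://app.example/cb?token=abc&page=2", redactPolicy);
```

Note that the URL API percent-encodes the brackets when it serializes the query string, so the placeholder is best checked through searchParams rather than by string comparison.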
Execution Quality Config
- execution.quality.a11y.mode: off, warn, or fail. Defaults to warn.
- execution.quality.a11y.failOnSerious: When true, serious or critical axe violations create a quality finding. Defaults to true.
- execution.quality.a11y.maxSerious: Allowed serious/critical axe violations before a finding is emitted. Defaults to 0.
- execution.quality.a11y.captureArtifactsOnWarning: Reserved policy field for warning-time evidence capture. Defaults to true.
- execution.quality.vitals.mode: off, warn, or fail. Defaults to warn.
- execution.quality.vitals.lcpMaxMs: Largest Contentful Paint threshold in milliseconds. Defaults to 2500.
- execution.quality.vitals.clsMax: Cumulative Layout Shift threshold. Defaults to 0.1.
- execution.quality.vitals.percentiles: Reserved reporting field. Defaults to ["p75"].
- execution.quality.vitals.captureArtifactsOnWarning: Reserved policy field for warning-time evidence capture. Defaults to true.
Quality findings are collected through driver session capabilities. When a quality check is enabled, the active driver must advertise the matching a11y or vitals capability or the test is skipped with a clear missing-capability reason. warn records findings without changing functional status; fail turns an otherwise-passing run into a failed result with qualityFailures.
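The warn/fail distinction can be sketched as follows. This is a hedged illustration: evaluateVitals and applyQualityMode are hypothetical names, the finding and result shapes are assumptions, and treating the thresholds as strict upper bounds (value greater than the maximum) is also an assumption.

```javascript
// Hypothetical helpers; result and finding shapes are assumptions.
function evaluateVitals(measured, policy) {
  if (policy.mode === "off") return [];
  const findings = [];
  if (measured.lcpMs > policy.lcpMaxMs) {
    findings.push({ metric: "lcp", value: measured.lcpMs, max: policy.lcpMaxMs });
  }
  if (measured.cls > policy.clsMax) {
    findings.push({ metric: "cls", value: measured.cls, max: policy.clsMax });
  }
  return findings;
}

function applyQualityMode(functionalStatus, findings, mode) {
  // fail only changes the outcome of an otherwise-passing test.
  if (mode === "fail" && functionalStatus === "passed" && findings.length > 0) {
    return { status: "failed", qualityFailures: findings };
  }
  // warn records findings without changing functional status.
  return { status: functionalStatus, qualityFindings: findings };
}

const vitalsPolicy = { mode: "warn", lcpMaxMs: 2500, clsMax: 0.1 };
const findings = evaluateVitals({ lcpMs: 3100, cls: 0.05 }, vitalsPolicy);
const warned = applyQualityMode("passed", findings, "warn");
const failed = applyQualityMode("passed", findings, "fail");
```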
Execution Artifact And Profile Config
- execution.artifacts.trace: off, always, or retain-on-failure. Defaults to retain-on-failure.
- execution.artifacts.screenshot: off, always, or only-on-failure. Defaults to only-on-failure.
- execution.journey.enabled: Enables journey snapshots. Defaults to false.
- execution.journey.capture: navigation captures after URL-changing passed actions. step captures after every passed action. off disables capture.
- execution.journey.maxSnapshots: Maximum journey screenshots per test. Defaults to 20.
Runtime profiles are shortcut overlays:
- fast: disables traces, journey, a11y, and vitals; keeps screenshots only on failure.
- ci: keeps traces and screenshots only on failure; leaves quality policy from config.
- debug: keeps trace and screenshot always, enables step-level journey snapshots, and keeps quality in warn mode.
Driver Config
- driver.name: Execution driver name. Built-in values are cdp, chrome-cdp, playwright, and fake for tests. The current compatibility default is playwright; intent-oriented v2 configs should select cdp explicitly when Chromium/CDP is the desired primary path.
- driver.path: Optional local module path for a custom driver.
- driver.package: Optional package name for a custom driver.
- driver.require: Capability names that must be supported by the active driver. A run fails early if the driver does not advertise a required capability.
- driver.options: Driver-specific options passed through to built-in or custom driver factories.
The built-in CDP driver talks directly to the Chrome DevTools Protocol. Selecting driver.name: "cdp" or "chrome-cdp" means WebTest AI execution calls the CDP driver directly; it does not route actions through Playwright. The CDP driver supports the same Chromium WebTest AI workflow surface as the Playwright driver: open, click, fill, submit, text/URL assertions, semantic outcome assertions, network waits, screenshots, visual checkpoints, CDP JSON traces, axe scans, Web Vitals, popup target switching, cross-origin frame execution contexts, bounded healing snapshots, and reusable browser storage state. It also exposes a text-first accessibility/action surface for intent resolution and semantic outcome evidence. CDP auth state consumes the same saved-state file used by Playwright-compatible auth flows: cookies plus per-origin localStorage, with sessionStorage preserved when present.
The built-in Playwright driver remains available for multi-engine coverage, Playwright trace artifacts, richer debugger workflows, and helper UI-login bootstrapping. Non-Chromium browsers remain Playwright-only because CDP is a Chromium protocol. Custom drivers should export a function, createDriver, or launchDriver, and return a driver object with capability flags and a createSession() method.
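A custom driver module might look roughly like the sketch below. Only the export name createDriver, the presence of capability flags, and the createSession() method come from this reference. The capabilities object shape, the session methods (open, close), and the missingCapabilities helper are assumptions made for illustration.

```javascript
// my-driver.js: hypothetical custom driver module.
function createDriver(options = {}) {
  return {
    name: "my-driver",
    // Capability flags; the exact flag shape is an assumption.
    capabilities: { actions: true, assertions: true, screenshots: false },
    async createSession() {
      const calls = [];
      // Session method names are placeholders, not the real driver contract.
      return {
        async open(url) { calls.push(["open", url]); },
        async close() { calls.push(["close"]); },
        calls,
      };
    },
  };
}

// Illustrates a driver.require style check: list capabilities the driver lacks.
function missingCapabilities(driver, required) {
  return required.filter((name) => !driver.capabilities[name]);
}

const driver = createDriver();
const missing = missingCapabilities(driver, ["actions", "assertions", "screenshots"]);

// Export for driver.path loading when run as a CommonJS module.
if (typeof module !== "undefined") module.exports = { createDriver };
```

With a shape like this, a config that sets driver.require to include screenshots would be expected to fail early, since the sketch driver does not advertise that capability.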
Example built-in CDP config:
{
  "driver": {
    "name": "cdp",
    "require": ["actions", "assertions", "screenshots"],
    "options": {
      "executablePath": "/path/to/chrome"
    }
  }
}
The CDP driver may use the Playwright package only to locate the bundled Chromium executable when no system Chrome path is configured; execution still goes through CDP, not Playwright APIs.
Existing browser endpoint route:
If an IDE, agent environment, or manually launched Chrome exposes a Chrome DevTools Protocol WebSocket endpoint, WebTest AI can connect through the CDP example driver instead of launching a new browser:
{
  "driver": {
    "name": "cdp",
    "options": {
      "wsEndpoint": "ws://127.0.0.1:9222/devtools/browser/<browser-id>"
    }
  }
}
Endpoint resolution order is driver.options.wsEndpoint, driver.options.endpoint, WEBTEST_AI_CDP_WS_ENDPOINT, then CURSOR_BROWSER_WS_ENDPOINT. Cursor's public browser-tool docs do not currently document a CDP WebSocket endpoint; WebTest AI can drive the Cursor browser only if such an endpoint is exposed.
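The endpoint resolution order can be sketched as a simple fallback chain. resolveCdpEndpoint is a hypothetical name, and returning null when nothing is configured (so the caller would launch a browser instead) is an assumption.

```javascript
// Hypothetical helper mirroring the documented resolution order.
function resolveCdpEndpoint(driverOptions = {}, env = process.env) {
  return (
    driverOptions.wsEndpoint ||
    driverOptions.endpoint ||
    env.WEBTEST_AI_CDP_WS_ENDPOINT ||
    env.CURSOR_BROWSER_WS_ENDPOINT ||
    null // assumption: null means no existing browser to attach to
  );
}

const fromOptions = resolveCdpEndpoint(
  { wsEndpoint: "ws://127.0.0.1:9222/devtools/browser/abc" },
  { WEBTEST_AI_CDP_WS_ENDPOINT: "ws://127.0.0.1:9333/devtools/browser/def" }
);
const fromEnv = resolveCdpEndpoint({}, { CURSOR_BROWSER_WS_ENDPOINT: "ws://127.0.0.1:9444/devtools/browser/ghi" });
```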
Capability-aware execution:
- actions: open, click, fill, and submit steps.
- assertions: text and URL assertions.
- screenshots: explicit Capture screenshot steps.
- network: Wait for network steps.
- frames: frame-targeted steps. CDP advertises cross-origin when it can execute through frame-specific runtime contexts.
- popups: page switching.
- authState: tests with auth metadata and reusable browser storage state.
- healingSnapshots: opt-in bounded healing for Click action, Click element, Click text, and Click role.
- a11ySurface: accessibility-tree action surface used before DOM locator fallback when the driver provides it.
- a11y: axe-based accessibility scanning.
- vitals: LCP/CLS collection.
When a test needs a capability that the active driver does not advertise, WebTest AI records the test as skipped with a clear reason.
Healing Config
- healing.enabled: Enables bounded runtime healing for supported actions. Defaults to false.
- healing.mode: Current implemented mode is bounded. The value is reserved for future expansion.
- healing.confidenceThreshold: Minimum candidate confidence required for a healed candidate. Defaults to 0.8.
- healing.maxAttemptsPerStep: Reserved policy field. Defaults to 1.
- healing.snapshot.maxCandidates: Maximum visible interactive candidates captured in one healing snapshot.
- healing.snapshot.maxTextLength: Maximum length for candidate text-like fields.
- healing.snapshot.maxNearbyTextLength: Maximum nearby context text length per candidate.
Implemented healing applies to Click action "<intent>" and bounded fallback recovery for Click element, Click text, and Click role after deterministic click failure. When a11ySurface is available, click resolution and healing prefer bounded accessibility-tree candidates before DOM fallback; if deterministic ranking is tied or too low-confidence and a model profile is enabled, WebTest AI asks the sidecar to rank only the collected candidates. This lets canonical steps such as Click action "open modal" resolve to branded controls such as Launch preview without writing the exact label into the spec. Click action requires healingSnapshots when healing is enabled; explicit click steps run deterministically on smaller drivers and use healing fallback only when snapshots are available. Healing ranks only candidates collected by WebTest AI or the active driver, refuses unsafe or ambiguous candidates, ignores non-actionable context nodes, and emits reviewed memory/spec patch artifacts after verified recovery.
Memory Config
- memory.enabled: Enables generated UI inventory lookup. Defaults to true.
- memory.path: Filesystem JSON inventory path. Defaults to .webtest-ai/ui-inventory.json.
- memory.mode: Inventory write policy.
Supported policy values:
- propose: write proposal artifacts after verified healing.
- auto: write verified healing directly to memory.path.
- read: intended read-only mode; current behavior reads memory and still writes proposal artifacts for verified healing.
- review-required: intended review workflow using webtest-ai heal queue, pending, approve-pending, and reject-pending.
- off: intended off mode; today use memory.enabled: false for equivalent behavior.
- memory.maxCandidatesPerIntent: Maximum locator candidates retained per app/page/intent.
- memory.staleAfterDays: Ignores memory candidates whose lastSeenAt is older than this many days.
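The staleness rule for memory.staleAfterDays can be sketched as a date filter. freshCandidates is a hypothetical name, and the candidate shape (a locator plus an ISO 8601 lastSeenAt string) is an assumption about the inventory format.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical filter: keep candidates seen within the staleness window.
function freshCandidates(candidates, staleAfterDays, now = Date.now()) {
  if (staleAfterDays == null) return candidates; // no window configured
  const cutoff = now - staleAfterDays * DAY_MS;
  return candidates.filter((c) => Date.parse(c.lastSeenAt) >= cutoff);
}

// Assumed candidate shape, for illustration only.
const inventory = [
  { locator: "role=button[name='Run']", lastSeenAt: "2025-02-01T00:00:00Z" },
  { locator: "text=Launch preview", lastSeenAt: "2024-12-01T00:00:00Z" },
];
const fixedNow = Date.parse("2025-03-01T00:00:00Z");
const fresh = freshCandidates(inventory, 60, fixedNow);
```

With the default of 60 days and the fixed clock above, only the first candidate survives; the second was last seen 90 days earlier.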
Intent memory uses the global memory.enabled and memory.mode gates, but stores reviewed bounded plans separately from selector inventory.
- intent.memoryPath: Filesystem JSON path for reviewed intent-plan memory. Defaults to .webtest-ai/intent-plans.json.
- intent.memoryConfidenceThreshold: Minimum reviewed-plan confidence needed to reuse a bounded plan before calling the model. Defaults to 0.9.
Successful model-generated intent plans emit intent-plans.proposed.json in the test artifact directory when memory is enabled and not off. Review and promote those entries into intent.memoryPath before they can bypass model planning.
Models Config
- models.activeProfile: Name of the single active model profile for optional model workflows. null disables model calls.
- models.profiles.<name>.provider: Adapter key. Supported local/open-compatible keys include ollama, openai-compatible, vllm, lmstudio, llama.cpp, and openrouter-compatible. Legacy flat provider values are accepted temporarily for compatibility, but new config should use profiles.
- models.profiles.<name>.model: Model name sent to the provider.
- models.profiles.<name>.endpoint: Provider endpoint. ollama uses <endpoint>/api/chat; OpenAI-compatible adapters use <endpoint>/v1/chat/completions unless the endpoint already ends with /chat/completions.
- models.profiles.<name>.apiKeyEnv: Optional environment variable that contains a bearer token.
- models.profiles.<name>.capabilities: Explicit booleans for router/session requirements, including structuredJson, reasoning, toolCalling, streaming, and vision.
- models.profiles.<name>.limits: Request and session bounds such as timeoutMs, retries, maxInputBytes, maxOutputTokens, and maxSessionTurns.
- models.writePolicy: Roots and extensions enforced before guarded autonomous maintenance applies generated writes. The policy resolves paths against the workspace and blocks traversal or writes outside approved test-related surfaces.
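The endpoint rule for model profiles can be sketched as below. chatUrl is a hypothetical name, the profile values are placeholders, and trimming trailing slashes before appending the path is an assumption.

```javascript
// Hypothetical helper mirroring the documented endpoint rule.
function chatUrl(profile) {
  // Trailing-slash trimming is an assumption, not documented behavior.
  const base = profile.endpoint.replace(/\/+$/, "");
  if (profile.provider === "ollama") return `${base}/api/chat`;
  // OpenAI-compatible adapters: keep an endpoint that is already complete.
  if (base.endsWith("/chat/completions")) return base;
  return `${base}/v1/chat/completions`;
}

const ollamaUrl = chatUrl({ provider: "ollama", endpoint: "http://localhost:11434" });
const vllmUrl = chatUrl({ provider: "vllm", endpoint: "http://localhost:8000/" });
const fullUrl = chatUrl({
  provider: "openai-compatible",
  endpoint: "https://llm.example/api/v1/chat/completions",
});
```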
Model-enabled paths record call metadata in run or recording results: purpose, mode, provider, model, profile, timing, message count, prompt byte size, status, and error message when applicable. Prompts, responses, API keys, and auth values are not stored in this metadata.
The sidecar only ranks bounded candidates already produced by WebTest AI. Discovery and maintenance workflows return structured proposals and apply writes only after centralized write-policy checks.
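A write-policy check along the lines of models.writePolicy could look like the sketch below. isWriteAllowed is a hypothetical name and the workspace path is a placeholder; the real centralized check may apply additional rules.

```javascript
const path = require("node:path");

// Hypothetical check: the path must resolve inside an approved root
// and carry an approved extension.
function isWriteAllowed(targetPath, policy, workspaceRoot) {
  const resolved = path.resolve(workspaceRoot, targetPath);
  const insideRoot = policy.roots.some((root) => {
    const rootDir = path.resolve(workspaceRoot, root);
    const rel = path.relative(rootDir, resolved);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return rel !== "" && !escapes;
  });
  return insideRoot && policy.extensions.includes(path.extname(resolved));
}

const writePolicy = {
  roots: ["specs", "artifacts", ".webtest-ai"],
  extensions: [".md", ".json", ".js"],
};
const workspace = "/repo"; // placeholder workspace root
const allowedSpec = isWriteAllowed("specs/login.md", writePolicy, workspace);
const blockedTraversal = isWriteAllowed("specs/../package.json", writePolicy, workspace);
const blockedExtension = isWriteAllowed("specs/run.sh", writePolicy, workspace);
```

Resolving the path before comparing it against the roots is what blocks traversal: specs/../package.json resolves to the workspace root, which is outside every approved root.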
For live-provider smoke testing with Qwen, DeepSeek, Kimi, GLM4, and Yi, see Live Model Testing.
Current Runtime Defaults Not Yet Configurable
These values are implemented as hardcoded runtime defaults today and are good candidates for future config or CLI flags:
- artifacts root: artifacts/
- reports root: artifacts/reports/
- auth storage directory: playwright/.auth/
- popup detection timeout: 1500ms
- browser launch executable override for Chromium
- report environment: always local
CLI API
Entrypoint:
node ./src/cli/index.js <command> [flags]
Installed package entrypoint:
webtest-ai <command> [flags]
run
Runs one Markdown suite file or a directory of Markdown suite files.
node ./src/cli/index.js run --suite specs/webtest-ai-demo.md --config webtest-ai.config.json
Flags:
- --suite <path>: Markdown suite file or directory. Defaults to specs/.
- --config <path>: JSON config path. Defaults to webtest-ai.config.json.
- --browser <name>: browser override. Supported aliases are chromium, chrome, google-chrome, edge, msedge, firefox, webkit, and safari.
- --target <a,b> / --targets <a,b>: run selected configured targets from config.targets.
- --tags <a,b>: include tests that contain all listed tags.
- --exclude-tags <a,b>: exclude tests containing any listed tag.
- --workers <n>: parallel worker capacity. Defaults to 1.
- --headed: launch browser headed.
- --debug: enables Playwright Inspector behavior and pauses before test execution.
- --refresh-auth: ignore reusable storage state and create fresh UI login state where applicable.
- --auth-profile <name>: override the auth profile name used for reusable sessions.
- --profile fast|ci|debug: apply a runtime profile overlay.
- --trace off|on|always|retain-on-failure: override trace policy for this run. on and always both retain traces for passing and failing tests.
- --screenshot off|always|only-on-failure: override final screenshot policy for this run.
- --quality off: disable a11y and vitals collection for this run.
- --journey off|navigation|step: override journey snapshot capture for this run.
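The --tags and --exclude-tags semantics (all listed tags must be present; any excluded tag removes the test) can be sketched as follows. The helper names parseTagList and matchesTagFilters are hypothetical.

```javascript
// Hypothetical helpers illustrating the documented tag filter semantics.
function parseTagList(value) {
  return value ? value.split(",").map((tag) => tag.trim()).filter(Boolean) : [];
}

function matchesTagFilters(testTags, { tags, excludeTags } = {}) {
  const include = parseTagList(tags);
  const exclude = parseTagList(excludeTags);
  const hasAllIncluded = include.every((tag) => testTags.includes(tag));
  const hasAnyExcluded = exclude.some((tag) => testTags.includes(tag));
  return hasAllIncluded && !hasAnyExcluded;
}

const keepsBoth = matchesTagFilters(["public", "smoke"], { tags: "public,smoke" });
const dropsPartial = matchesTagFilters(["public"], { tags: "public,smoke" });
const dropsExcluded = matchesTagFilters(["public", "slow"], { tags: "public", excludeTags: "slow,wip" });
```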
Run output prints suite name, counts, worker count, run id, config path, JSON report path, and HTML report path.
debug
Shorthand for run --debug.
node ./src/cli/index.js debug --suite specs/webtest-ai-demo.md --tags public
This sets PWDEBUG=1, forces headed behavior through debug mode, and pauses with Playwright Inspector.
heal
Reviews, queues, approves, rejects, or publishes generated UI inventory proposals.
node ./src/cli/index.js heal list --proposal artifacts/<runId>/<testId>/ui-inventory.proposed.json
Subcommands:
- list / show: print proposed inventory updates.
- queue: store proposal updates in the pending review file.
- pending: list queued pending updates.
- approve: merge proposal updates into configured inventory.
- approve-pending: promote queued pending updates into inventory.
- reject-pending: remove queued pending updates.
- publish-pr: write local PR metadata stub only.
Flags:
- --proposal <path>: proposal JSON path. Required except for pending, approve-pending, and reject-pending.
- --config <path>: config used to resolve memory.path.
- --memory <path>: override memory.path for this command.
- --app <name>: fallback app key for older proposal files.
- --index <n>: select one pending update.
- --all: select all pending updates.
- --patch <path>: optional patch markdown path for publish-pr.
- --title <title>: optional PR stub title.
- --dry-run: preview approve or approve-pending without writing inventory or consuming pending review items.
intent-memory
Reviews and approves bounded intent-plan proposals generated by successful intent-mode runs.
node ./src/cli/index.js intent-memory list --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json
node ./src/cli/index.js intent-memory approve --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json --dry-run
Subcommands:
- list / show: print proposed intent-plan updates.
- approve: merge reviewed proposal updates into intent.memoryPath.
Flags:
- --proposal <path>: proposal JSON path.
- --config <path>: config used to resolve intent.memoryPath.
- --memory <path>: override intent.memoryPath for this command.
- --dry-run: preview approve without writing intent memory.
discover
Runs a bounded model discovery workflow and emits candidate test-flow proposals. It uses the active model profile and does not affect deterministic run pass/fail truth.
node ./src/cli/index.js discover --url https://app.example --dry-run
node ./src/cli/index.js discover --url https://app.example --output artifacts/discovery/app.json
node ./src/cli/index.js discover --target brand-a-en --target brand-b-fr --dry-run
node ./src/cli/index.js discover --targets brand-a-en,brand-b-fr --dry-run
node ./src/cli/index.js discover --target all --explore --max-pages 5 --inventory-output artifacts/discovery/ui-inventory.proposed.json --spec-output artifacts/discovery/discovered.md
node ./src/cli/index.js discover --target all --explore --auth-profile ci-user --refresh-auth --dry-run
Flags:
- --url <seed-url> / --seed <seed-url>: seed URL. May be repeated.
- --target <name|all> / --targets <name,name|all>: expand configured targets into seed URLs and pass target brand, locale, and intentAliases into discovery.
- --explore: open seed URLs with the active driver, crawl same-origin links within budget, and include bounded page evidence in discovery.
- --max-pages <n>: maximum pages to visit during browser-backed exploration. Defaults to 5.
- --max-depth <n>: same-origin link depth for browser-backed exploration. Defaults to 1.
- --browser <name>: browser engine or channel for browser-backed exploration.
- --auth-profile <name>: prepare reusable/API auth state for browser-backed exploration and use the named profile.
- --auth-mode <mode>: override configured auth mode for browser-backed exploration.
- --refresh-auth: ignore reusable auth state and create fresh state where applicable.
- --config <path>: JSON config path. Defaults to webtest-ai.config.json.
- --max-turns <n>: override the active profile session turn limit.
- --output <path>: write the normalized discovery proposal JSON.
- --inventory-output <path>: write proposed UI inventory updates when discovery returns them.
- --spec-output <path>: write proposed flows as reviewable Markdown.
- --dry-run: print the proposal summary without writing an artifact.
When memory is enabled, discovery reads the configured UI inventory path and includes it as reviewed context for proposed flows. With --explore, discovery also includes live browser evidence: URL patterns, page fingerprints, accessibility/action candidates, and same-origin links. Auth flags prepare browser storage state before exploration so authenticated pages can be inventoried without exposing credentials to the model. Discovery still returns proposals only; it does not decide deterministic run pass/fail truth, and UI inventory changes are written only as reviewable proposal artifacts.
record
Records a manual browser flow and generates a Markdown spec plus recording artifacts. Markdown is the source-of-truth contract.
node ./src/cli/index.js record --url https://app.example --output specs/recorded-flow.md
Flags:
- --url <url>: URL to open and record. Required.
- --output <path>: Markdown spec output path. Defaults to specs/recording-<timestamp>.md.
- --emit-starter: also write an optional Playwright starter test for migration/debugging.
- --starter <path>: starter-test output path. Passing this flag implies --emit-starter.
- --event-log <path>: redacted event-log output path.
- --api-manifest <path>: redacted API/network manifest output path.
- --inventory-proposal <path>: UI inventory proposal output path.
- --name <name>: generated test name.
- --tags <a,b>: generated suite/test tags.
- --headless: run browser headless for scripted capture. Recorder defaults to headed.
- --force: overwrite generated Markdown and optional starter files.
- --save-auth: save storageState after recording.
- --auth-profile <name>: profile name for saved auth state. Defaults to recorded-user.
- --record-auth-flow: include login/MFA-like steps as env placeholders. Without this, recorder trims obvious username/password/MFA segments and prefers saved-session reuse when --save-auth is used.
- --capture-network manifest|full: writes a redacted API manifest. full currently adds a privacy warning; request/response bodies are not stored by the MVP.
- --smart: optionally clean up goal, tags, steps, and expected assertions with one model call when model config is enabled. Deterministic output remains the fallback.
Recorder outputs:
- Markdown spec
- optional Playwright starter test when --emit-starter or --starter is used
- redacted event log
- redacted API manifest with Wait for network suggestions
- reviewable ui-inventory.proposed.json selector-memory proposal
- optional saved auth state when --save-auth is used
Interactive auth/MFA policy:
- The user enters live MFA codes in the headed browser during recording.
- Passwords, OTPs, tokens, and similar values are redacted from event logs.
- Generated login-flow specs use env placeholders such as WEBTEST_AI_PASSWORD and WEBTEST_AI_MFA_CODE.
- Post-login business-flow specs should use saved storageState reuse instead of replaying MFA.
report
Exports WebTest AI report data for external integrations. The command is vendor-neutral and does not call Jira, Slack, MCP servers, or other remote services.
node ./src/cli/index.js report export --report artifacts/reports/<runId>.json
node ./src/cli/index.js report summary --report artifacts/reports/<runId>.json --format markdown
Flags:
- --report <path>: source WebTest AI JSON report. Required.
- --output <path>: export path. Defaults beside the source report.
- --format json|ndjson: export output format. Defaults to json.
- --format markdown|text|json: summary output format. Defaults to markdown.
- --stdout: write export content to stdout instead of a file.
- --force: overwrite an existing export file.
Exported JSON shape:
- version
- kind: "webtest-ai.integration-report"
- source: source report metadata
- run: run metadata, filters, scheduling, browser, and driver
- summary: status counts and run success boolean
- tests: compact per-test records with status, tags, timing, failure reason, quality findings, model-call metadata, counts, and artifact paths
Use this output to build project-specific integrations, CI annotations, dashboards, MCP resources/tools, Jira tickets, Slack messages, or any other workflow outside WebTest AI core.
CI summary output:
- kind: "webtest-ai.ci-summary" for JSON summaries
- status counts and success boolean
- failed, blocked, and healed tests that need attention
- first failed step and artifact paths when available
mcp
Starts the WebTest AI MCP server over stdio.
node ./src/cli/index.js mcp
Flags:
--config <path>: config used by MCP tools that need WebTest AI runtime policy.
The MCP server exposes WebTest AI data and operations. It does not call vendor services or publish to external systems.
The browser-session MCP tools use Playwright Chromium directly because they are explicit debug and recording helpers. They are separate from deterministic webtest-ai.run.suite execution, which still goes through the configured driver boundary and owns pass/fail truth.
Resources:
- webtest-ai://reports/latest
- webtest-ai://reports/latest/integration
- webtest-ai://reports/{name}
- webtest-ai://reports/{name}/integration
- webtest-ai://specs
- webtest-ai://specs/{path}
- webtest-ai://artifacts/{path}
- webtest-ai://schemas/integration-report-v1
Tools:
- webtest-ai.run.suite
- webtest-ai.report.list
- webtest-ai.report.summary
- webtest-ai.report.export
- webtest-ai.report.ci_summary
- webtest-ai.spec.parse
- webtest-ai.browser.start_session
- webtest-ai.browser.click
- webtest-ai.browser.fill
- webtest-ai.browser.assert_text
- webtest-ai.browser.snapshot
- webtest-ai.browser.screenshot
- webtest-ai.browser.to_spec
- webtest-ai.browser.stop_session
Browser MCP sessions are for explicit debug, recording, and agent-assisted exploration. webtest-ai.browser.snapshot returns accessibility-tree candidates with session-local ax* refs when the browser exposes an a11y snapshot. webtest-ai.browser.click can target those refs with ref, or continue to use selector, role/name, or text inputs. Normal CI pass/fail truth still belongs to deterministic WebTest AI runs.
webtest-ai.run.suite runs a Markdown suite through the same deterministic runner used by the CLI and writes normal JSON/HTML reports. Runner progress logs are sent to stderr so MCP stdout remains valid protocol output.
Reserved Commands
No SaaS-specific publisher commands are built into core.
Programmatic recorder APIs exist for recordFlow(), generateSpec(), and normalized recorder output. Report export APIs exist for buildIntegrationExport(), writeIntegrationExport(), and publishResults(). publishResults() returns the same vendor-neutral export payload and does not publish to an external service.
Recommended Future CLI Overrides
These flags are not implemented yet, but they are the natural split between specs and runtime operation:
- --base-url <url>
- --env <name>
- --timeout <ms>
- --navigation-timeout <ms>
- --report-dir <path>
- --artifacts-dir <path>
- --healing on|off
- --memory-mode propose|auto|read|off
- --run-id <id>
Markdown Suite API
One Markdown file is one suite. YAML front matter defines suite metadata. Each top-level # heading defines one test.
Suite Front Matter
Current supported fields:
- suite: Suite name used for run ids and reports. Defaults to unnamed-suite.
- app: App key used by generated UI inventory. If omitted, WebTest AI derives an app key from baseUrl or suite name.
- baseUrl: Base URL used by relative Open steps and UI login. Currently required for relative paths. This is a strong candidate to move to config/CLI.
- tags: Tags inherited by every test in the suite.
- defaults: Parsed and copied into each normalized test, but not heavily used by runtime yet.
- hooks: Suite-level hooks. hooks.beforeEach and hooks.afterEach are prepended/appended to each normalized test when present.
- auth: Suite-level auth policy. Test metadata can override it. This is a strong candidate to move to config auth profiles.
- mcpProfile: Reserved integration profile. Defaults to default.
- browsers: Browser list. Current runtime uses only the first suite browser.
- modelMode: When set to intent, ## Goal is treated as executable and planned by the bounded model sidecar before ## Expected runs.
- targets: Optional list of configured target names to run for the suite.
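The defaults and hook behavior listed above can be sketched roughly as follows. applySuiteDefaults, mergeSuiteHooks, and the activeBrowser field are hypothetical names used only for illustration; they are not package exports, and the real normalizer may differ.

```javascript
// Hypothetical helper: fills in the documented front matter defaults.
function applySuiteDefaults(frontMatter) {
  const browsers = Array.isArray(frontMatter.browsers) ? frontMatter.browsers : [];
  return {
    ...frontMatter,
    suite: frontMatter.suite ?? "unnamed-suite",
    mcpProfile: frontMatter.mcpProfile ?? "default",
    tags: frontMatter.tags ?? [],
    browsers,
    // Illustrative field: the current runtime uses only the first suite browser.
    activeBrowser: browsers[0] ?? null
  };
}

// Hypothetical helper: prepends hooks.beforeEach and appends hooks.afterEach
// to one normalized test.
function mergeSuiteHooks(suite, test) {
  const hooks = suite.hooks ?? {};
  return {
    ...test,
    beforeEach: [...(hooks.beforeEach ?? []), ...(test.beforeEach ?? [])],
    afterEach: [...(test.afterEach ?? []), ...(hooks.afterEach ?? [])]
  };
}

const demoSuite = applySuiteDefaults({
  baseUrl: "http://127.0.0.1:4010",
  browsers: ["chromium", "firefox"],
  hooks: { beforeEach: ['Open "/"'] }
});
const demoTest = mergeSuiteHooks(demoSuite, { name: "Smoke", beforeEach: [], afterEach: [] });
console.log(demoSuite.suite, demoSuite.activeBrowser, demoTest.beforeEach);
```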
Test Metadata
Metadata appears between a test heading and the first ## section.
Supported fields:
- tags: Test tags. Suite tags and test tags are combined and deduplicated.
- priority: Reporting metadata. Defaults to normal.
- owner: Reporting metadata. Defaults to null.
- auth: Per-test auth override merged over suite auth.
- browsers: Parsed into normalized test objects, but current runtime does not launch per-test browsers.
- modelMode: intent enables executable goal planning for this test. Test metadata overrides suite modelMode.
- targets: Optional list of configured target names for this test. Test metadata overrides suite targets unless CLI --target is supplied.
- retryPolicy: Parsed and enforced by runtime. A test-level value overrides defaults.retryPolicy, and both override execution.retries from config.
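The tag and retry rules above amount to a small precedence chain. The sketch below is one way to express it. combineTags and resolveRetryPolicy are hypothetical helpers, and the shape of retryPolicy is not documented here, so the value is passed through untouched.

```javascript
// Suite tags and test tags are combined and deduplicated, suite tags first.
function combineTags(suiteTags = [], testTags = []) {
  return [...new Set([...suiteTags, ...testTags])];
}

// Precedence: test retryPolicy, then defaults.retryPolicy, then execution.retries.
// The retryPolicy value is treated as opaque because its shape is an unknown here.
function resolveRetryPolicy(testMeta = {}, suiteDefaults = {}, config = {}) {
  if (testMeta.retryPolicy !== undefined) {
    return { source: "test", value: testMeta.retryPolicy };
  }
  if (suiteDefaults.retryPolicy !== undefined) {
    return { source: "defaults", value: suiteDefaults.retryPolicy };
  }
  return { source: "config", value: config.execution?.retries ?? 0 };
}

console.log(combineTags(["smoke", "auth"], ["auth", "checkout"]));
console.log(resolveRetryPolicy({}, {}, { execution: { retries: 1 } }));
```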
Sections
Supported section headings:
- ## Goal
- ## Before
- ## BeforeEach
- ## Steps
- ## Expected
- ## AfterEach
- ## After
- ## Data
- ## Notes
Steps, Expected, hooks, and Notes become lists. Goal becomes one text string. Data becomes key/value entries split on :.
Hook sections are supported for setup and cleanup:
- ## Before: runs before the test's main steps.
- ## BeforeEach: suite-level hook in frontmatter as hooks.beforeEach, or a test section when needed.
- ## AfterEach: suite-level hook in frontmatter as hooks.afterEach, or a test section when needed.
- ## After: runs after expected assertions.
Hooks use the same step language as ## Steps. Keep hooks short and visible; business behavior should stay in ## Steps and ## Expected.
## Steps and ## Expected may use numbered/bulleted lists or a one-column Markdown table with a Step header:
| Step |
| --- |
| Open "/" |
| Click role "button" named "Run" |
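A table like the one above could be reduced to a plain step list along these lines. parseStepTable is a hypothetical helper, not the parser WebTest AI ships, and it ignores edge cases such as escaped pipes inside a cell.

```javascript
// Hypothetical parser for a one-column Markdown table with a "Step" header.
function parseStepTable(lines) {
  const cells = lines
    .map((line) => line.trim())
    .filter((line) => line.startsWith("|") && line.endsWith("|"))
    .map((line) => line.slice(1, -1).trim());
  if (cells[0] !== "Step") {
    throw new Error('Expected a one-column table with a "Step" header');
  }
  // Drop the header row and any "---" delimiter rows.
  return cells.slice(1).filter((cell) => !/^:?-+:?$/.test(cell));
}

const tableSteps = parseStepTable([
  "| Step |",
  "| --- |",
  '| Open "/" |',
  '| Click role "button" named "Run" |'
]);
console.log(tableSteps);
```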
## Data may use key/value lines or a Markdown table. Use name: env ENV_VAR to approve an environment-backed value for intent-mode filling without putting the secret in Markdown. CSV files are not part of this pass; add them later only as an explicit data-file expansion feature, not as a replacement for the Markdown contract.
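A minimal sketch of the key/value form, including the env-backed variant, might look like this. parseDataLines and resolveDataValue are hypothetical helpers. Splitting on the first colon (so that values such as URLs survive) is an assumption, since the text above only says entries are split on a colon.

```javascript
// Hypothetical parser for "key: value" lines in a ## Data section.
// Assumption: the split happens on the FIRST ":" so values may contain colons.
function parseDataLines(lines) {
  const entries = {};
  for (const raw of lines) {
    const line = raw.trim().replace(/^[-*]\s+/, "");
    const idx = line.indexOf(":");
    if (idx === -1) continue; // not a key/value line
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    const envMatch = /^env\s+([A-Za-z_][A-Za-z0-9_]*)$/.exec(value);
    entries[key] = envMatch
      ? { kind: "env", name: envMatch[1] }
      : { kind: "literal", value };
  }
  return entries;
}

// Resolves an entry at fill time so the secret itself never appears in Markdown.
function resolveDataValue(entry, env = process.env) {
  if (entry.kind === "literal") return entry.value;
  if (!(entry.name in env)) {
    throw new Error(`Missing environment variable ${entry.name}`);
  }
  return env[entry.name];
}

const dataEntries = parseDataLines([
  "plan: Pro",
  "password: env APP_PASSWORD",
  "callback: http://127.0.0.1:4010/done"
]);
console.log(dataEntries);
```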
Step Language API
The interpreter is rule-based. Unsupported text fails with Unsupported step.
Supported step forms:
- Open "<url-or-path>"
- Click action "<intent>"
- Click element "<name>"
- Click text "<text>"
- Click role "<role>" named "<name>"
- Click frame role "<role>" named "<name>"
- Click frame "<frame-name-or-url-part>" role "<role>" named "<name>"
- Fill field "<label>" with "<value>"
- Fill field "<label>" with env "<ENV_VAR>"
- Fill intent "<field-intent>" with data "<data-key>"
- Fill intent "<field-intent>" with env "<ENV_VAR>"
- Fill frame field "<label>" with "<value>"
- Fill frame field "<label>" with env "<ENV_VAR>"
- Submit form
- Submit form with button "<name>"
- Assert text "<text>"
- Assert outcome "<semantic-outcome>"
- Assert frame text "<text>"
- Assert frame "<frame-name-or-url-part>" text "<text>"
- Assert url "<url>"
- Wait for url "<url>"
- Wait for network "<url-part>"
- Wait for network "<url-part>" status <status>
- Switch page "<alias-title-or-url-part>"
- Capture screenshot
- Capture screenshot as "<fileName>"
- Capture visual checkpoint "<name>"
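To make "rule-based" concrete, here is a minimal sketch of an interpreter covering a handful of the forms above. The rule table, the action names, and the exact error message format are assumptions for illustration; only the step grammar and the Unsupported step failure come from this reference.

```javascript
// Hypothetical rule table covering a few of the documented step forms.
const STEP_RULES = [
  { re: /^Open "([^"]+)"$/, build: ([target]) => ({ action: "open", target }) },
  {
    re: /^Click role "([^"]+)" named "([^"]+)"$/,
    build: ([role, name]) => ({ action: "clickRole", role, name })
  },
  {
    re: /^Fill field "([^"]+)" with env "([^"]+)"$/,
    build: ([label, envVar]) => ({ action: "fill", label, envVar })
  },
  {
    re: /^Fill field "([^"]+)" with "([^"]*)"$/,
    build: ([label, value]) => ({ action: "fill", label, value })
  },
  { re: /^Assert text "([^"]+)"$/, build: ([text]) => ({ action: "assertText", text }) },
  {
    re: /^Wait for network "([^"]+)" status (\d+)$/,
    build: ([urlPart, status]) => ({ action: "waitForNetwork", urlPart, status: Number(status) })
  }
];

// Returns a structured action, or throws when no rule matches the text.
function interpretStep(text) {
  const step = text.trim();
  for (const rule of STEP_RULES) {
    const match = rule.re.exec(step);
    if (match) return rule.build(match.slice(1));
  }
  throw new Error(`Unsupported step: ${step}`);
}

console.log(interpretStep('Fill field "Email" with env "APP_USERNAME"'));
```

Because every rule is anchored with ^ and $, free-form text never partially matches a rule; it falls through to the error instead.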
Current locator behavior:
- Click role resolves through the driver accessibility-tree surface first when available, then falls back to Playwright getByRole(role, { name, exact: false }).
- Click text resolves through the driver accessibility-tree surface first when available, then falls back to getByText(text, { exact: false }).first().
- Fill field uses getByLabel(label, { exact: false }).
- Fill intent fills only approved values from ## Data or env refs and resolves the field intent through the driver/accessibility surface when exact label matching is not enough.
- Click element resolves through the driver accessibility-tree surface first when available, then tries button role, link role, label, placeholder, then text.
- Assert outcome asks the configured model profile to evaluate a human-written semantic outcome against bounded page evidence. It must return a matching result above intent.outcomeConfidenceThreshold; Assert text remains the exact text check.
- Submit form calls requestSubmit() on the first form unless a button name is supplied.
- Frame steps choose the first non-main frame unless a frame name or URL part is supplied.
- Switch page changes the active page after a popup or additional page exists. Drivers match stable aliases such as main, page-1, and popup-1, then exact page titles, then a unique URL substring; ambiguous or missing matches fail the step.
- Popup-producing clicks are followed by setting the active page when a popup is detected within 1500ms.
- Visual checkpoints capture a full-page screenshot, create a missing baseline, and compare later runs with exact image hash matching.
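The Switch page matching order described above (alias, then exact title, then unique URL substring) can be sketched as a pure function. resolvePage and the { alias, title, url } record shape are hypothetical. Treating a duplicated title as an immediate failure rather than falling through to URL matching is also an assumption.

```javascript
// Hypothetical matcher over page records shaped like { alias, title, url }.
function resolvePage(pages, query) {
  // 1. Stable aliases such as main, page-1, popup-1.
  const byAlias = pages.filter((page) => page.alias === query);
  if (byAlias.length === 1) return byAlias[0];

  // 2. Exact page titles.
  const byTitle = pages.filter((page) => page.title === query);
  if (byTitle.length === 1) return byTitle[0];
  if (byTitle.length > 1) throw new Error(`Ambiguous page title: ${query}`);

  // 3. A unique URL substring.
  const byUrl = pages.filter((page) => page.url.includes(query));
  if (byUrl.length === 1) return byUrl[0];
  if (byUrl.length > 1) throw new Error(`Ambiguous page URL part: ${query}`);

  throw new Error(`No page matches: ${query}`);
}

const openPages = [
  { alias: "main", title: "Home", url: "http://127.0.0.1:4010/" },
  { alias: "popup-1", title: "Docs", url: "http://127.0.0.1:4010/docs" }
];
console.log(resolvePage(openPages, "Docs").alias);
```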
Visual Checkpoint Config
- visual.baselineDir: Directory for visual checkpoint baselines. Defaults to visual-baselines.
- visual.updateBaselines: When true, visual checkpoint actions overwrite existing baselines with the current screenshot.
- visual.mode: fail or warn. Defaults to fail. In warn mode, changed visual checkpoints write diff metadata without failing the step.
Visual checkpoint artifacts:
- actual screenshot: artifacts/<runId>/<testId>/visual-<name>.png
- diff metadata: artifacts/<runId>/<testId>/visual-<name>.diff.json
- baseline: <visual.baselineDir>/<suite>/<testId>/<name>.png
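The baseline lifecycle (create when missing, overwrite on update, compare by exact hash, fail or warn on change) could be sketched as below. This is an illustration only: compareCheckpoint, baselinePath, and the status strings are hypothetical, and SHA-256 is an assumed hash since the algorithm is not named in this reference.

```javascript
const { createHash } = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

// Assumption: SHA-256 over the raw PNG bytes stands in for "exact image hash".
function hashImage(buffer) {
  return createHash("sha256").update(buffer).digest("hex");
}

// Builds <visual.baselineDir>/<suite>/<testId>/<name>.png.
function baselinePath(visual, suite, testId, name) {
  return path.join(visual.baselineDir ?? "visual-baselines", suite, testId, `${name}.png`);
}

// Hypothetical comparison step for one checkpoint screenshot.
function compareCheckpoint(actual, file, visual = {}) {
  if (!fs.existsSync(file) || visual.updateBaselines === true) {
    fs.mkdirSync(path.dirname(file), { recursive: true });
    fs.writeFileSync(file, actual);
    return { status: "baseline-written", failed: false };
  }
  const changed = hashImage(fs.readFileSync(file)) !== hashImage(actual);
  if (!changed) return { status: "matched", failed: false };
  // In warn mode a change is reported but does not fail the step.
  return { status: "changed", failed: (visual.mode ?? "fail") === "fail" };
}
```

Exact hashing means any single-pixel difference counts as a change, which is why warn mode and updateBaselines are useful while a UI is still moving.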
Auth API
Current auth metadata fields:
- mode: commonly ui, reuse, or api.
- profile: storage-state profile name.
- saveStorageState: when true, passing UI auth tests persist storage state.
- usernameEnv: username environment variable.
- passwordEnv: password environment variable.
- mfaCodeEnv: MFA code environment variable.
- otp: optional OTP provider config. Supported providers are totp-secret and static-env.
- validationPath/validationUrl: reusable-session validation target.
- validate: set false to skip reusable-session validation.
- apiLoginPath/apiLoginUrl: API endpoint used by mode: api.
- method: API login method, defaulting to POST.
- bodyFormat/contentType: json or form, defaulting to json.
- usernameField: API login username field, defaulting to username.
- passwordField: API login password field, defaulting to password.
- mfaCodeField: API login MFA field, defaulting to mfaCode.
- extraFields/body: extra API login request fields.
- headers: extra API login request headers.
- successStatus/successStatuses: accepted API login response status codes.
- requireCookie: set false to allow API login responses without Set-Cookie.
- loginMode: same-host or idp.
- provider: idp enables IDP login behavior.
- loginPath: login page path.
- successPath: URL waited for after UI login.
- successText: text waited for after UI login.
- usernameLabel: username field label.
- passwordLabel: password field label.
- submitName: login submit button name.
- mfaLabel: MFA input label.
- mfaSubmitName: MFA submit button name.
- mfaTimeout: how long to look for the MFA field.
- idpStartName: IDP start link name, or false to skip that click.
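As an illustration of how the API login fields fit together, the sketch below assembles a request description from auth metadata. buildApiLoginRequest is a hypothetical helper. Treating bodyFormat/contentType and extraFields/body as aliases, the content-type header values, and the /api/login fallback path are assumptions about how the runtime applies these fields.

```javascript
// Hypothetical helper: turns auth metadata into a plain request description.
function buildApiLoginRequest(auth, baseUrl, env = process.env) {
  const format = auth.bodyFormat ?? auth.contentType ?? "json";
  const fields = {
    ...(auth.extraFields ?? auth.body ?? {}),
    [auth.usernameField ?? "username"]: env[auth.usernameEnv],
    [auth.passwordField ?? "password"]: env[auth.passwordEnv]
  };
  // Only send an MFA code when one is configured and present.
  if (auth.mfaCodeEnv && env[auth.mfaCodeEnv]) {
    fields[auth.mfaCodeField ?? "mfaCode"] = env[auth.mfaCodeEnv];
  }
  const url = auth.apiLoginUrl ?? new URL(auth.apiLoginPath ?? "/api/login", baseUrl).toString();
  const isForm = format === "form";
  return {
    url,
    method: auth.method ?? "POST",
    headers: {
      // Assumed header values for the two documented body formats.
      "content-type": isForm ? "application/x-www-form-urlencoded" : "application/json",
      ...(auth.headers ?? {})
    },
    body: isForm ? new URLSearchParams(fields).toString() : JSON.stringify(fields)
  };
}

const loginRequest = buildApiLoginRequest(
  { mode: "api", usernameEnv: "APP_USERNAME", passwordEnv: "APP_PASSWORD" },
  "http://127.0.0.1:4010",
  { APP_USERNAME: "demo", APP_PASSWORD: "secret" }
);
console.log(loginRequest.method, loginRequest.url);
```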
OTP examples:
auth:
  mode: reuse
  profile: ci-user
  usernameEnv: APP_USERNAME
  passwordEnv: APP_PASSWORD
  otp:
    provider: totp-secret
    secretEnv: APP_TOTP_SECRET
    digits: 6
    period: 30

auth:
  otp:
    provider: static-env
    codeEnv: APP_MFA_CODE
mfaCodeEnv remains supported for compatibility and behaves like static-env.
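For readers unfamiliar with what digits and period control in the totp-secret provider, here is a minimal RFC 6238 sketch using Node's built-in crypto module. It is not WebTest AI's implementation: the base32 secret encoding and the HMAC-SHA-1 algorithm are assumptions (they are the common authenticator defaults), and totp and base32Decode are hypothetical names.

```javascript
const { createHmac } = require("node:crypto");

// Decodes an RFC 4648 base32 string (padding optional) into bytes.
function base32Decode(input) {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  let bits = 0;
  let value = 0;
  const bytes = [];
  for (const char of input.replace(/=+$/, "").toUpperCase()) {
    const index = alphabet.indexOf(char);
    if (index === -1) throw new Error(`Invalid base32 character: ${char}`);
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// RFC 6238 TOTP with HMAC-SHA-1 (assumed algorithm).
function totp(secretBase32, { digits = 6, period = 30, now = Date.now() } = {}) {
  const counter = Math.floor(now / 1000 / period);
  const message = Buffer.alloc(8);
  message.writeBigUInt64BE(BigInt(counter));
  const digest = createHmac("sha1", base32Decode(secretBase32)).update(message).digest();
  const offset = digest[digest.length - 1] & 0x0f; // dynamic truncation
  const binary = digest.readUInt32BE(offset) & 0x7fffffff;
  return String(binary % 10 ** digits).padStart(digits, "0");
}

// Reads the secret from the environment variable named by secretEnv, if set.
if (process.env.APP_TOTP_SECRET) {
  console.log(totp(process.env.APP_TOTP_SECRET, { digits: 6, period: 30 }));
}
```

A code is only valid for one period window, so clock drift between the CI runner and the identity provider is a likely cause of intermittent MFA failures.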
Current storage state path:
playwright/.auth/<profile>.json
Recommended direction:
- Keep auth flow definitions in config.
- Let specs reference a logical profile or auth requirement.
- Use CLI for one-off overrides such as --auth-profile and --refresh-auth.
API login creates the same browser-ready storage state shape as UI login. The default endpoint is /api/login, and cookies returned by Set-Cookie are written to playwright/.auth/<profile>.json. Drivers must advertise authState to consume that state during execution. The CDP driver can also save storage state from the active page, including matching-origin cookies, localStorage, and sessionStorage.
Reusable session validation opens the saved storage state in a browser context when the active driver exposes a native browser for validation, and treats an OK response from validationPath or validationUrl as a valid session. CDP execution can apply existing reusable state directly; validation is skipped when no native validation browser is available. The endpoint must return a non-OK response for anonymous users; otherwise WebTest AI cannot distinguish a real session from public access.
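The validation rules above reduce to a short decision function. In this sketch, validateReusableSession is a hypothetical helper and probe is a caller-supplied stand-in for opening the saved state in a native browser context and requesting the validation target. The reason strings, and treating a missing validation target as "nothing to check", are assumptions.

```javascript
// Hypothetical decision helper for reusable-session validation.
// `probe(target)` must resolve to an object with a boolean `ok` property.
async function validateReusableSession({ auth, hasNativeBrowser, probe }) {
  if (auth.validate === false) {
    return { valid: true, reason: "validation-disabled" };
  }
  if (!hasNativeBrowser) {
    // For example, CDP execution without a native validation browser.
    return { valid: true, reason: "validation-skipped" };
  }
  const target = auth.validationUrl ?? auth.validationPath;
  if (!target) {
    return { valid: true, reason: "no-validation-target" };
  }
  const response = await probe(target);
  // The endpoint must return non-OK for anonymous users for this to be meaningful.
  return response.ok
    ? { valid: true, reason: "validated" }
    : { valid: false, reason: "stale-session" };
}
```

When the result is not valid, the caller would fall back to a fresh login or fail the test clearly, as the manual validation checks in this reference expect.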
Report And Artifact API
Per run:
- artifacts/reports/<runId>.json
- artifacts/reports/<suite>-latest.json
- artifacts/reports/<runId>.html
- artifacts/reports/<suite>-latest.html
Per test:
- artifacts/<runId>/<testId>/final.png
- artifacts/<runId>/<testId>/trace.zip
- artifacts/<runId>/<testId>/trace/
- optional checkpoint screenshots
- optional journey screenshots
- optional healing snapshots
- optional UI inventory proposal and patch files
- optional reviewed spec patch file
Result status values:
- passed
- failed
- healed-passed
- skipped
- blocked
Important report fields:
- runId
- suite
- suitePath
- generatedAt
- browser
- driver
- driverCapabilities
- workerCount
- scheduling
- configPath
- filters
- results
Important per-test result fields:
- testId
- name
- status
- browser
- browserEngine
- browserChannel
- browserVersion
- driverName
- driverCapabilities
- environment
- tags
- priority
- owner
- workerId
- serial
- authMode
- authProfile
- authState
- modelMode
- modelCalls
- target
- intentPlan
- semanticAssertions
- healing
- journey
- durationMs
- assertions
- stepResults
- firstFailedStep
- networkSummary
- consoleErrors
- a11y
- vitals
- qualityFailures
- qualityWarnings
- artifacts
- summary
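A consumer of the JSON report might tally statuses and collect failures along these lines. summarizeReport is a hypothetical helper. It relies only on the runId, suite, results, status, testId, name, and firstFailedStep fields, and passes firstFailedStep through untouched because its shape is not specified here.

```javascript
// Hypothetical summarizer over a parsed report object.
function summarizeReport(report) {
  const counts = { passed: 0, failed: 0, "healed-passed": 0, skipped: 0, blocked: 0 };
  const failures = [];
  for (const result of report.results ?? []) {
    counts[result.status] = (counts[result.status] ?? 0) + 1;
    if (result.status === "failed") {
      failures.push({
        testId: result.testId,
        name: result.name,
        firstFailedStep: result.firstFailedStep ?? null
      });
    }
  }
  return { runId: report.runId, suite: report.suite, counts, failures };
}

// Inline sample report; real reports live under artifacts/reports/.
const sampleSummary = summarizeReport({
  runId: "demo-run",
  suite: "demo",
  results: [
    { testId: "t1", name: "Login", status: "passed" },
    { testId: "t2", name: "Checkout", status: "failed", firstFailedStep: 'Assert text "Paid"' },
    { testId: "t3", name: "Search", status: "healed-passed" }
  ]
});
console.log(sampleSummary.counts, sampleSummary.failures);
```

To run it against a real run, parse the report file first, for example with JSON.parse(fs.readFileSync(reportPath, "utf8")).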
The HTML report renders suite-level driver metadata, capability tags, per-test owner/priority, target metadata, generated intent plans, semantic assertion evidence, journey snapshots, artifact availability, healing traces, and skip/block reasons. This makes adapter support and degraded execution visible without opening raw JSON.
Manual Feature Validation Matrix
Use these checks when validating a local platform pass:
- Recorder capture: run node ./src/cli/index.js record --url http://127.0.0.1:4010 --name "Manual recorded smoke" --tags recorded,smoke --output specs/manual-recorded-smoke.md --force, interact in the opened headed browser, then confirm the Markdown includes clicks/inputs/page opens and no .playwright.js is created.
- Optional starter: rerun recorder with --emit-starter or --starter specs/manual-recorded-smoke.playwright.js and confirm Starter Test: prints only then.
- Dashboard journey: run a suite with --journey navigation or --profile debug, open artifacts/reports/<suite>-latest.html, and confirm User Journey, owner, priority, and artifact policy are visible.
- Fast profile: run node ./src/cli/index.js run --suite <suite.md> --profile fast; confirm traces are off, quality is skipped, and screenshots are retained only on failure.
- Visual checkpoint: run a suite containing Capture visual checkpoint "Hero" twice; confirm the first run creates a baseline under visual-baselines/ and the second run matches or reports a diff according to visual.mode.
- Auth reuse validation: use auth.mode: reuse with validationPath: /api/session; confirm a saved valid state is reused and a stale state falls back to login or fails clearly.
- Hooks and tables: parse/run a spec with ## Before, ## Expected, and table-form ## Steps; confirm hook phases appear in stepResults.
- Healing review: run node ./src/cli/index.js heal approve --proposal examples/healing-review-proposal.json --dry-run --memory /tmp/webtest-ai-ui-inventory.json; confirm it prints Would Apply: 1 and does not write inventory.
- Intent memory review: run node ./src/cli/index.js intent-memory approve --proposal artifacts/<runId>/<testId>/intent-plans.proposed.json --dry-run --memory /tmp/webtest-ai-intent-plans.json; confirm it prints Would Apply and does not write memory.
- Command surface: run node ./src/cli/index.js --help and node ./src/cli/index.js <command> --help for run, debug, record, heal, intent-memory, discover, report, and mcp.
- Published package: run npm view @asserthive/webtest-ai@beta version dist-tags bin --json and confirm the package exposes the webtest-ai binary.
- Existing CDP endpoint: set WEBTEST_AI_CDP_WS_ENDPOINT=ws://... and use examples/drivers/existing-cdp-driver.config.json; if no endpoint exists, this feature cannot drive that browser.
Programmatic API
Package entrypoint:
const {
parseSuite,
runSuite,
runTest,
createBrowserDriver,
createFakeDriver,
getDriverConfig,
recordFlow,
generateSpec,
normalizeRecordedEvents,
renderMarkdownSpec,
renderStarterTest,
buildIntegrationExport,
buildCiSummary,
exportResults,
renderCiSummary,
writeCiSummary,
writeIntegrationExport,
runIntentPlanner,
assertSemanticOutcome,
findIntentPlanMemory,
loadIntentMemory,
writeIntentMemoryProposal,
createMcpServer,
publishResults
} = require("@asserthive/webtest-ai");
parseSuite(filePath)
Reads a Markdown suite and returns a normalized suite object.
const suite = await parseSuite("specs/webtest-ai-demo.md");
Returned shape:
{
filePath,
suite,
app,
baseUrl,
tags,
defaults,
auth,
mcpProfile,
browsers,
tests
}
runSuite(parsedSuite, options)
Runs a parsed suite through the configured browser driver and writes JSON/HTML reports. Playwright is the default driver.
Common options:
{
headless: true,
debug: false,
workers: 1,
includeTags: [],
excludeTags: [],
refreshAuth: false,
authProfile: null,
suitePath: "/absolute/path/to/spec.md",
config: {},
configPath: "/absolute/path/to/webtest-ai.config.json"
}
Returns:
{
runId,
suite,
workerCount,
configPath,
reportPath,
htmlReportPath,
results
}
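Putting parseSuite() and runSuite() together, a caller might wrap them as shown below. This is a sketch under assumptions: runSmoke is a hypothetical wrapper, the smoke tag is a placeholder, and the API object is passed in as a parameter so the function can be exercised with a stub instead of the installed package.

```javascript
const path = require("node:path");

// Hypothetical wrapper. `api` is expected to expose parseSuite and runSuite,
// for example the object returned by require("@asserthive/webtest-ai").
async function runSmoke(api, specPath) {
  const absoluteSpecPath = path.resolve(specPath);
  const suite = await api.parseSuite(absoluteSpecPath);
  const run = await api.runSuite(suite, {
    headless: true,
    workers: 1,
    includeTags: ["smoke"], // placeholder tag
    suitePath: absoluteSpecPath
  });
  const failed = run.results.filter((result) => result.status === "failed");
  return {
    runId: run.runId,
    reportPath: run.reportPath,
    htmlReportPath: run.htmlReportPath,
    failedCount: failed.length
  };
}
```

With the package installed, a call would look like runSmoke(require("@asserthive/webtest-ai"), "specs/webtest-ai-demo.md"), with process.exitCode set from failedCount in CI.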
runTest(test, options)
Runs one normalized test. This is lower-level than runSuite() and expects either a driver object or a launched Playwright browser plus runtime options.
Important options:
- driver
- browser
- browserName
- browserEngine
- browserChannel
- browserVersion
- runId
- workerId
- sessionCache
- suite
- includeTags
- excludeTags
- refreshAuth
- authProfile
- debug
- workerCount
- suitePath
- config
Driver helpers:
- createBrowserDriver(options): resolves and launches the configured driver.
- createFakeDriver(options): creates an in-memory driver for tests and adapter-contract checks.
- getDriverConfig(config): returns the normalized driver config used by runtime driver resolution.
- tests/support/driverContract.js: reusable local contract helper for checking that a driver can execute the core open/fill/click/assert/screenshot flow.
Recorder APIs
- recordFlow(options): Captures a browser flow and writes the same Markdown/event/API/inventory artifacts as webtest-ai record.
- generateSpec(recording, options): Converts normalized recording events into a Markdown spec model.
- normalizeRecordedEvents(events, options): Converts raw recorder events into deterministic steps and suggestions.
- renderMarkdownSpec(spec, options): Renders a generated spec model as Markdown.
- renderStarterTest(spec, options): Renders the optional Playwright starter test used during migration/debugging.
Integration Export APIs
- buildIntegrationExport(report, options): Converts a WebTest AI JSON report object into a compact vendor-neutral integration payload.
- buildCiSummary(report, options): Converts a WebTest AI JSON report object into a compact CI summary payload.
- renderCiSummary(summary, options): Renders a CI summary as Markdown, text, or JSON.
- writeCiSummary(summary, outputPath, options): Writes a rendered CI summary.
- writeIntegrationExport(exportPayload, outputPath, options): Writes the integration payload as JSON or NDJSON.
- exportResults(reportOrPath, options): Returns the vendor-neutral integration payload and optionally writes it.
- publishResults(reportOrPath, options): Compatibility alias for exportResults(). It does not call external services.
- runIntentPlanner(options): Builds and validates a bounded plan for executable ## Goal flows.
- assertSemanticOutcome(options): Evaluates Assert outcome against bounded page evidence using the configured model profile.
- findIntentPlanMemory(options): Looks up a reviewed high-confidence bounded plan before model planning.
- loadIntentMemory(config): Reads the configured intent-plan memory file.
- writeIntentMemoryProposal(options): Writes reviewable intent-plan memory proposal artifacts.
- createMcpServer(options): Creates the WebTest AI MCP server object for stdio hosts or direct tests.
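The report export helpers listed above compose in a fixed order: build, render, then write. The sketch below shows that order with the API object injected, so no option names need to be guessed. exportForCi and the output file names are hypothetical, and awaiting every call is a precaution because this reference does not say whether the helpers are synchronous.

```javascript
// Hypothetical post-run step. `api` is expected to expose the export helpers
// named in this section; options are omitted because their names are not
// documented here.
async function exportForCi(api, report, outDir) {
  const summary = await api.buildCiSummary(report);
  const rendered = await api.renderCiSummary(summary);
  await api.writeCiSummary(summary, `${outDir}/ci-summary.md`); // assumed file name

  const payload = await api.buildIntegrationExport(report);
  await api.writeIntegrationExport(payload, `${outDir}/integration.json`); // assumed file name

  return { rendered, payload };
}
```

Nothing in this flow contacts an external service, which matches the note that publishResults() is only a compatibility alias for exportResults().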
Boundary Guidance
Keep in Markdown specs:
- user-visible behavior
- business flows
- assertions
- scenario-specific data
- owner/priority/tags used for reporting and selection
Move or keep in config:
- base URLs and environment names
- auth profile details and secret env var names
- default browser/project
- execution driver and required capabilities
- timeouts
- artifact/report directories
- trace and screenshot retention
- reporting privacy
- healing, memory, and model policy
- scheduling policy defaults
Use CLI for:
- choosing a suite
- selecting tags
- changing config path
- changing worker count
- headed/debug runs
- refreshing auth
- one-off runtime profile and browser overrides
- future environment/base-url overrides when those flags are added