Score user stories with INVEST, get AI-powered analysis, improve acceptance criteria, and generate tests in 8 frameworks, all inside Azure DevOps (ADO). Your data never leaves your tenant.
Open any work item. INVEST score in under a second — offline, no API call needed.
Run LLM analysis. Get criterion-level feedback, an improved description and acceptance criteria (AC), and a side-by-side diff before writeback.
Manual tests with Gherkin per typology. Push to ADO via Tested By — no Test Plans licence required.
Generate test code in your framework. 5-dimension quality score shown before push to Repos.
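As an illustration of the "push to ADO via Tested By" step above, here is a minimal sketch of the JSON Patch body that adds Tested By links to a work item through the Azure DevOps work item REST API. The helper name `buildTestedByPatch`, the organisation `contoso`, the work item IDs, and the link comment are all hypothetical. The link type reference name `Microsoft.VSTS.Common.TestedBy-Forward` is the built-in Azure DevOps one. How TestForge builds this request internally is not stated here, so treat this as one plausible shape rather than its actual implementation.

```typescript
// One JSON Patch operation that appends a relation to a work item.
interface JsonPatchOperation {
  op: "add";
  path: string;
  value: { rel: string; url: string; attributes: { comment: string } };
}

// Reference name of the built-in "Tested By" link type (story -> test case).
const TESTED_BY_FORWARD = "Microsoft.VSTS.Common.TestedBy-Forward";

// Hypothetical helper: builds one "add relation" operation per test case ID.
function buildTestedByPatch(
  organization: string,
  testCaseIds: number[],
  comment: string = "Linked from generated manual test"
): JsonPatchOperation[] {
  return testCaseIds.map((id) => ({
    op: "add" as const,
    path: "/relations/-", // "-" appends to the end of the relations array
    value: {
      rel: TESTED_BY_FORWARD,
      url: `https://dev.azure.com/${encodeURIComponent(organization)}/_apis/wit/workItems/${id}`,
      attributes: { comment },
    },
  }));
}

// Example: link hypothetical test cases 101 and 102 to a user story.
const patch = buildTestedByPatch("contoso", [101, 102]);
console.log(JSON.stringify(patch, null, 2));
```

The resulting array would be sent as the body of a `PATCH` to the story's work item update endpoint with the `Content-Type: application/json-patch+json` header.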
Heuristic score on every work item panel load. Offline-first — no API key required for the score itself.
PO / BA
Criterion-level INVEST analysis with suggestions linked directly to I, N, V, E, S, or T.
PO / BA
Improved description and AC (bullets or Gherkin per AC), written back to the work item via ADO API.
PO / BA
Given/When/Then per typology. Pushed via Tested By, no Test Plans licence required.
QA
Playwright TS/Python/.NET, Selenium Java, Cypress JS, Cucumber. Code quality score before push.
QA
BYOK: all LLM calls go browser-direct to your endpoint. Zero data to TestForge servers.
Platform
Designed for regulated industries. Every architecture decision starts with data sovereignty.
TestForge servers never receive your user stories, AC, or generated tests.
All AI requests go directly from your browser tab to your endpoint, under your credentials.
Your LLM config is stored in ADO Extension Data Service, scoped to your user account.
Azure Function or Cloudflare Worker — deployed in your own tenant. Fully auditable.
vso.work_full, vso.code_write, vso.extension.data_write — nothing more.
Air-gapped deployment with Ollama coming 2027 for organisations with no cloud connectivity.
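The browser-direct flow described above can be sketched as follows: the request is assembled in the browser tab from the user's own configuration and sent straight to the user's endpoint, with no intermediate vendor server. This is a sketch under stated assumptions, not TestForge's actual code. The `LlmConfig` shape, the function `prepareAnalysisRequest`, the example endpoint URL, the bearer-token header, and the OpenAI-style chat payload are all hypothetical; your provider may expect a different header or body.

```typescript
// Hypothetical per-user LLM configuration (field names are assumptions).
interface LlmConfig {
  endpoint: string; // full URL of your own LLM endpoint or proxy
  apiKey: string;   // your own credential; it is only ever sent to `endpoint`
  model: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the request locally. Nothing is sent until the caller invokes fetch().
function prepareAnalysisRequest(config: LlmConfig, storyText: string): PreparedRequest {
  if (!config.endpoint.startsWith("https://")) {
    throw new Error("LLM endpoint must use HTTPS");
  }
  return {
    url: config.endpoint,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`, // assumed auth scheme
      },
      // Assumed OpenAI-style chat payload; adapt to your provider.
      body: JSON.stringify({
        model: config.model,
        messages: [
          { role: "system", content: "Assess this user story against the INVEST criteria." },
          { role: "user", content: storyText },
        ],
      }),
    },
  };
}

// Example with placeholder values.
const request = prepareAnalysisRequest(
  {
    endpoint: "https://llm.example.internal/v1/chat/completions",
    apiKey: "sk-demo",
    model: "my-model",
  },
  "As a customer, I want to reset my password so that I can regain access."
);
// In the browser tab: const response = await fetch(request.url, request.init);
console.log(request.url);
```

Keeping request construction separate from the `fetch` call makes it easy to audit exactly which host receives the story text and the credential.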
LLM token cost ≈ €2/sprint already included.
Xray claims 60 tests. We generate the right number per typology, controlled by you. A QA expert knows 8 well-targeted tests beat 30 generic ones.
From user story quality score to Gherkin to Playwright code, all inside ADO. No context switching, no duplicate tooling.
BakeQA and CasePilot process your stories on their servers. TestForge never does. For banking, public sector, healthcare — this is the only option.
Coming Q4 2026 — track INVEST score evolution sprint-by-sprint. The only metric you can present at steering committee level.
You pay for TestForge. LLM token costs go directly to your provider — typically €2/sprint. No hidden fees.
Unlimited heuristic scoring. 10 LLM analyses/month.
Up to 15 named users. All projects.
Unlimited users. All 8 frameworks.
On-prem, Azure AD SSO, regulatory templates.
Install TestForge free on the Azure DevOps Marketplace. No credit card. Your data, your tenant, your rules.
Install free on Marketplace →
Also available: Documentation · Contact sales