The dashboard is for humans, the CLI is for agents. We built TesterArmy's interface so a coding agent can hand off QA work the same way it would to a strong engineer - turning testing from a one-off prompt into something repeatable.
Oskar Kwasniewski, CTO
May 17, 2026 · 5 min read
Over the past few months, we've been building TesterArmy around a simple idea: agents should be able to hand off QA work the same way a human team would hand it off to a strong QA engineer.
A lot of people ask: why not just use Cursor, Claude Code, Codex, or your own agent to do QA?
You can try. For simple checks, it might even work sometimes. But real QA is not just opening a browser and clicking around. It requires setup, clear test definitions, run history, screenshots, CI integration, mobile builds, cleanup, and, most importantly, reproducible results.
That's the part people tend to underestimate.
If you send an agent in YOLO mode, the same test can take a different path every time. It might click a different button, interpret the goal differently, or report a different result. That's not good enough for QA. A test should be something you can run again and compare.
So the goal is not to replace TesterArmy with a general-purpose agent. The goal is to let a general-purpose agent hand off testing to a specialized testing platform.
It's the same idea as with people. When someone has a good interface for using a product, like a well-designed mobile app, they're more likely to use it often. If the product is clunky or full of friction, they'll avoid it.
For agents, the CLI is that interface.
The dashboard still matters for humans. But an agent should not have to click through the dashboard just to list tests, start a run, wait for results, or inspect failures. Those are product operations, not QA work.
Agents need:
Stable commands.
Predictable JSON output.
Clear error messages.
Non-interactive execution.
IDs and pagination that they can pass into the next command.
For a human, this is fine:
ta projects
ta groups list --project <projectId>
For an agent, this is better:
ta projects list --json
ta groups list --project <projectId> --json
The difference looks small, but it matters. The agent does not need to parse a table, guess column names, or hope the spacing does not change. It reads JSON and moves on.
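As a rough illustration of "reads JSON and moves on", here is a minimal Python sketch of an agent-side helper. The JSON shape (a list of objects with `id` and `name` keys) and the sample values are assumptions made for this example, not TesterArmy's documented output. Only the `ta groups list --project <projectId> --json` command comes from the text above.

```python
import json

# Hypothetical `ta projects list --json` output. The real field names may differ.
SAMPLE_OUTPUT = '[{"id": "proj_1", "name": "Web app"}, {"id": "proj_2", "name": "Mobile app"}]'


def project_ids(raw_json: str) -> list[str]:
    """Return project IDs from JSON text (assumes a list of objects with an `id` key)."""
    return [project["id"] for project in json.loads(raw_json)]


def groups_command(project_id: str) -> list[str]:
    """Build the follow-up command shown above for one project ID."""
    return ["ta", "groups", "list", "--project", project_id, "--json"]


if __name__ == "__main__":
    # Each ID from the first command feeds straight into the next one.
    for pid in project_ids(SAMPLE_OUTPUT):
        print(" ".join(groups_command(pid)))
```

There is no column parsing and no whitespace handling here: the ID is a plain value that can be passed to the next command.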
Give agents workflow primitives
A good testing interface should let an agent operate the testing system without pretending to be a human in the dashboard.
For example, after a coding agent changes a feature, it should be able to run the relevant dashboard test group directly:
ta tests run \
  --group <groupId> \
  --project <projectId> \
  --remote \
  --wait \
  --json
That command gives the agent a clear handoff point. TesterArmy owns specialized QA work, including browser or simulator execution, stored test steps, artifacts, run state, and result classification.
If it needs more context, it can inspect the run history:
ta runs list --project <projectId> --test <testId> --status completed --json
This is much easier for an agent to reason about than clicking through a dashboard, finding the right group, pressing run, waiting, refreshing, and scraping the result from the UI.
The UI is still what TesterArmy tests. The CLI is how other agents control TesterArmy.
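The run and history commands in this section could be wrapped by a coding agent roughly like this. It is a sketch under assumptions: the helper names (`run_group_command`, `run_history_command`, `call_json`) are made up for illustration, and nothing is assumed about the JSON payload beyond it being valid JSON on stdout. The flags are the ones shown in the commands above.

```python
import json
import subprocess
from typing import Any, Callable, Sequence

# A runner takes an argv list and returns a finished process. Injectable for tests.
Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def run_cli(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command non-interactively and capture its output as text."""
    return subprocess.run(list(argv), capture_output=True, text=True, check=False)


def run_group_command(project_id: str, group_id: str) -> list[str]:
    """argv for running a test group remotely and waiting for the result."""
    return ["ta", "tests", "run", "--group", group_id, "--project", project_id,
            "--remote", "--wait", "--json"]


def run_history_command(project_id: str, test_id: str) -> list[str]:
    """argv for listing completed runs of one test."""
    return ["ta", "runs", "list", "--project", project_id, "--test", test_id,
            "--status", "completed", "--json"]


def call_json(argv: Sequence[str], runner: Runner = run_cli) -> Any:
    """Run the command and parse stdout as JSON, failing loudly if it is not JSON."""
    proc = runner(argv)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"expected JSON on stdout (exit code {proc.returncode}): {proc.stderr.strip()}"
        ) from exc
```

An agent would call `call_json(run_group_command(project_id, group_id))` and branch on the parsed result. Passing the runner as a parameter keeps the wrapper testable without a real `ta` binary.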
Delegation is the real feature
The CLI matters because it makes delegation cheap.
If a coding agent can find the right project, trigger a test group, wait for results, inspect failures, and summarize what changed, you are more likely to keep tests up to date.
That is where many testing setups fail. Creating tests once is easy. Maintaining them as the product changes is the hard part.
With an agent-friendly CLI, the workflow can look like this:
A coding agent changes the app.
It asks TesterArmy to update or run the relevant tests.
TesterArmy executes those tests with specialized QA infrastructure.
The coding agent reads structured results and acts on them.
That is much closer to handing work to a professional than sending a random agent into a browser and hoping it comes back with something useful.
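The four steps above can be sketched as a small delegation loop. Everything here is hypothetical scaffolding: `RunResult`, `delegate_qa`, and the two callbacks stand in for whatever the coding agent and the testing platform actually expose. They are not part of TesterArmy's CLI.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class RunResult:
    """Hypothetical structured result for one test."""
    name: str
    passed: bool


def delegate_qa(
    run_tests: Callable[[], Iterable[RunResult]],
    fix: Callable[[RunResult], None],
    max_rounds: int = 3,
) -> bool:
    """Run tests, hand failures back to the coding agent, and repeat.

    Returns True as soon as a test run is fully green, and False if
    `max_rounds` runs all contained at least one failure.
    """
    for _ in range(max_rounds):
        failures = [result for result in run_tests() if not result.passed]
        if not failures:
            return True
        for failure in failures:
            fix(failure)  # the coding agent acts on the structured result
    return False
```

The loop never looks at the browser. It only sees structured results, which is what lets the coding agent treat testing as a handoff instead of a side quest.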
What we learned
Building for agents forced us to clean up our product surface.
When you ask, "Could an agent use this?", you quickly find the vague parts:
Dashboard-only actions.
Interactive-only commands.
Pretty output that cannot be parsed.
Internal APIs that should be stable workflows.
Errors that humans can interpret but agents cannot.
Fixing those things helps agents, but it also helps humans. A predictable CLI is better for CI, support, internal scripts, docs, and power users.
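To make the last item in that list concrete, here is one way an agent might consume a machine-readable error. The error shape (`code`, `message`, `retryable`) and the `project_not_found` code are assumptions for illustration only, not TesterArmy's actual error format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CliError:
    code: str        # stable identifier an agent can branch on
    message: str     # human-readable explanation
    retryable: bool  # whether running the same command again might help


def parse_error(stderr_text: str) -> CliError:
    """Parse a JSON error if present, otherwise fall back to an opaque 'unknown' error."""
    try:
        data = json.loads(stderr_text)
        return CliError(
            code=str(data["code"]),
            message=str(data["message"]),
            retryable=bool(data.get("retryable", False)),
        )
    except (json.JSONDecodeError, KeyError, TypeError):
        # Free-form text: a human can read it, but an agent can only give up or guess.
        return CliError(code="unknown", message=stderr_text.strip(), retryable=False)
```

With a stable `code`, the agent can decide between retrying, fixing its input, or escalating to a human. With free-form text it falls into the `unknown` branch every time.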
That's a wrap
If you are building software for agents, do not focus only on prompts and tools. Think about the interfaces agents need to use your product.
For us, that interface is the CLI.
And if you are wondering why you should not just use your own agent for QA, you can try. But the hard part is not making an agent click once. The hard part is making testing repeatable, maintainable, debuggable, and integrated into your team's shipping process.
If this resonates, sign in to TesterArmy and see what QA feels like when the agent has a real testing system behind it, rather than a one-off prompt.