
Agent workflows

Anvil Desktop is built around repo-aware conversations. A useful conversation starts from code and ends with evidence.

Session types

Session	Use it for	Evidence to keep
Plan	Understand current behavior, identify files, map acceptance criteria, and choose a small implementation path.	Files inspected, assumptions, risks, proposed checks.
Implement	Make scoped code changes in a target repo.	Diff summary, files changed, tests run, unresolved risk.
Review	Inspect current changes or a branch for correctness, regressions, security, accessibility, and missing tests.	Findings with file references, severity, and follow-up actions.
Security	Check auth, permissions, data handling, dependency risk, secrets, and unsafe execution paths.	Findings, affected surfaces, exploit notes where useful, verification limits.
Docs	Generate or update README, architecture notes, ADRs, setup docs, and handover material.	Source files used, claims made, commands verified.
BA	Compare work item intent with implementation, acceptance criteria, risk, dependencies, and rollout questions.	Questions, gaps, feasibility notes, follow-up work.
Handover	Summarize what changed and what a teammate needs to know.	Change summary, checks, manual QA, skipped checks, residual risk.

A good implementation loop

Open the workspace.
Confirm the target repository, branch, and work item.
Ask for the current code path before asking for edits.
Keep the change scoped to the acceptance criteria.
Review the diff.
Run the narrowest meaningful checks.
Escalate to review, security, dependency, CI, or docs checks when the risk warrants it.
Record what passed and what was not verified.

The read-only first step is not ceremony. It prevents the model from confidently editing the wrong abstraction, which remains legal in TypeScript but rude in production.

Mix providers inside a workflow

Settings separates the primary provider from the providers that are active:

The primary provider starts new chats and handles app-level AI tasks.
Active providers appear in each workflow step's provider selector.
Each step stores its own provider and model, so planning, implementation, and independent reviews do not have to use the same runtime.
Cursor model choices come from the locally installed cursor-agent catalog. Custom Codex, OpenAI, and Azure model or deployment identifiers can be entered directly.
Templates created before provider-aware workflows continue to use Codex unless you change the step.

Codex, OpenAI, and Azure workflow steps run through Codex app-server with the selected model-provider route. Cursor workflow steps run through cursor-agent with the selected Cursor model. A workflow fails plainly if a saved step names a provider that is no longer active.
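
The routing rules above can be sketched as a small resolver. This is an illustration under stated assumptions, not Anvil's actual schema: the type names, the field names, and the `resolveStep` function are all hypothetical. It also assumes that a step with no saved provider (a template from before provider-aware workflows) is checked against the active list like any other Codex step.

```typescript
// Hypothetical types: Anvil's real template format may differ.
type Provider = "codex" | "openai" | "azure" | "cursor";
type Runtime = "codex-app-server" | "cursor-agent";

interface WorkflowStep {
  name: string;
  provider?: Provider; // absent on templates saved before provider-aware workflows
  model?: string;      // model or deployment identifier chosen for this step
}

interface ResolvedStep {
  name: string;
  provider: Provider;
  runtime: Runtime;
  model: string | undefined;
}

function resolveStep(step: WorkflowStep, active: ReadonlySet<Provider>): ResolvedStep {
  // Older templates carry no provider and keep using Codex.
  const provider: Provider = step.provider ?? "codex";

  // Fail plainly rather than silently substituting another provider.
  if (!active.has(provider)) {
    throw new Error(
      `Step "${step.name}" uses provider "${provider}", which is no longer active.`
    );
  }

  // Cursor steps go to cursor-agent; Codex, OpenAI, and Azure go to Codex app-server.
  const runtime: Runtime = provider === "cursor" ? "cursor-agent" : "codex-app-server";
  return { name: step.name, provider, runtime, model: step.model };
}
```

Throwing on an inactive provider, instead of falling back to the primary provider, keeps the runtime choice visible: a template never runs on a provider its author did not pick.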

Agents can also call another installed provider's CLI from a prompt when that is the simplest handoff. Provider-aware workflow steps are preferable when the runtime choice should be visible, repeatable, and stored with the template.

Follow work across workspaces

Starting work in one workspace does not make the rest of the app a waiting room:

Active conversations continue when you switch workspaces.
The activity centre shows running work and conversations waiting for approval or input.
Desktop notifications open the originating workspace and exact thread.
Completion notifications stay quiet while Anvil is focused; approval and input requests can still surface because they block progress.
Workspace terminal processes and buffered output remain available while the desktop process is running.

Notifications are navigation, not approval shortcuts. Open the thread and review the request in context before allowing work to continue.
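
One way to picture the notification rules is a function that decides whether an event produces a desktop notification at all. The event names, the `DesktopNotification` shape, and `notificationFor` are hypothetical. The sketch also assumes approval and input requests always surface, which is stronger than the "can still surface" wording above.

```typescript
// Hypothetical event and notification shapes, for illustration only.
type ThreadEvent = "completed" | "approval-needed" | "input-needed";

interface DesktopNotification {
  workspaceId: string; // clicking opens the originating workspace...
  threadId: string;    // ...and the exact thread
  event: ThreadEvent;
}

function notificationFor(
  event: ThreadEvent,
  workspaceId: string,
  threadId: string,
  appFocused: boolean
): DesktopNotification | null {
  // Completion is informational, so it stays quiet while the app is focused.
  if (event === "completed" && appFocused) {
    return null;
  }
  // Approval and input requests block progress, so they surface either way.
  return { workspaceId, threadId, event };
}
```

The notification carries only navigation data. Nothing in it can approve the request, which matches the rule that the decision happens in the thread.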

What to include in prompts

Good session prompts include:

target repository
work item or acceptance criteria
files or modules already suspected
constraints such as no broad refactor, docs-only, tests required, or read-only
validation expectations
external systems involved, such as GitHub, Linear, Jira, Azure DevOps, Notion, Confluence, or Figma

Avoid prompts that ask the agent to "make it better" with no boundary. That is not a requirement; it is a haunted treasure map.
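
If prompts are assembled by a script or a template, the checklist above maps onto a small structure. Anvil does not require this format. The `SessionPrompt` fields and `renderPrompt` are hypothetical names chosen for the sketch.

```typescript
// Hypothetical prompt structure mirroring the checklist above.
interface SessionPrompt {
  repository: string;
  workItem: string;          // work item or acceptance criteria
  suspectedFiles: string[];  // files or modules already suspected
  constraints: string[];     // e.g. "read-only", "no broad refactor"
  validation: string[];      // checks the agent is expected to run
  externalSystems: string[]; // e.g. "Linear", "GitHub"
  request: string;           // the actual ask
}

function renderPrompt(p: SessionPrompt): string {
  const lines: string[] = [
    `Repository: ${p.repository}`,
    `Work item: ${p.workItem}`,
  ];
  // Optional sections are emitted only when they have content.
  const optional: Array<[string, string[]]> = [
    ["Suspected files", p.suspectedFiles],
    ["Constraints", p.constraints],
    ["Validation", p.validation],
    ["External systems", p.externalSystems],
  ];
  for (const [label, items] of optional) {
    if (items.length > 0) {
      lines.push(`${label}: ${items.join(", ")}`);
    }
  }
  lines.push("", p.request);
  return lines.join("\n");
}
```

Making `repository` and `workItem` required fields is the point of the sketch: a prompt without a boundary fails to type-check before it reaches the agent.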

Review stance

Review sessions should lead with:

correctness
security and auth
data integrity
accessibility
production risk
missing tests

Maintainability and style matter, but they should not bury a real blocker under opinions about naming.
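
The ordering above can be applied mechanically when findings are collected as data. The category identifiers, the `Finding` shape, and `orderFindings` are hypothetical. Placing maintainability and style last is this sketch's reading of the paragraph above, not a documented rule.

```typescript
// Review priorities from the list above; maintainability and style are placed last.
const REVIEW_ORDER = [
  "correctness",
  "security",
  "data-integrity",
  "accessibility",
  "production-risk",
  "missing-tests",
  "maintainability",
  "style",
] as const;

type Category = (typeof REVIEW_ORDER)[number];

interface Finding {
  category: Category;
  file: string; // file reference for the finding
  note: string;
}

function orderFindings(findings: readonly Finding[]): Finding[] {
  // Copy first so the caller's array is untouched. Array.prototype.sort is
  // stable, so findings in the same category keep their original order.
  return [...findings].sort(
    (a, b) => REVIEW_ORDER.indexOf(a.category) - REVIEW_ORDER.indexOf(b.category)
  );
}
```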

Handover quality

A useful handover says:

what changed
why it changed
where it changed
what passed
what failed or could not be run
what a reviewer should inspect first
what risk remains

The goal is not to produce a prettier transcript. The goal is to leave enough context that the next person can continue without spelunking.
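
A handover can be kept honest by giving each check an explicit status, so "could not be run" is recorded rather than omitted. The sketch below assumes a hypothetical `Handover` shape and `renderHandover` helper. Neither is part of Anvil.

```typescript
// Hypothetical handover record covering the checklist above.
type CheckStatus = "passed" | "failed" | "not-run";

interface Check {
  command: string;
  status: CheckStatus;
  note?: string; // e.g. why a check could not be run
}

interface Handover {
  whatChanged: string;
  why: string;
  where: string[]; // files or modules touched
  checks: Check[];
  inspectFirst: string;
  residualRisk: string;
}

function renderHandover(h: Handover): string {
  // Collect checks with a given status, appending the note when present.
  const byStatus = (status: CheckStatus): string[] =>
    h.checks
      .filter((c) => c.status === status)
      .map((c) => (c.note ? `${c.command} (${c.note})` : c.command));
  // An empty group is printed as "none" so its absence is explicit.
  const list = (items: string[]): string =>
    items.length > 0 ? items.join("; ") : "none";

  return [
    `What changed: ${h.whatChanged}`,
    `Why: ${h.why}`,
    `Where: ${list(h.where)}`,
    `Passed: ${list(byStatus("passed"))}`,
    `Failed: ${list(byStatus("failed"))}`,
    `Not run: ${list(byStatus("not-run"))}`,
    `Inspect first: ${h.inspectFirst}`,
    `Residual risk: ${h.residualRisk}`,
  ].join("\n");
}
```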
