Skip to content

Workflow generation (Feature 015 US10)

Agentweaver can generate a complete workflow definition from a plain-language description. A user clicks Generate workflow on the project Workflows page, describes the pipeline they need, and the server returns a validated WorkflowDefinition YAML draft that opens in the workflow editor for review and an explicit save. Nothing is written to .agentweaver/workflows/ until the user saves.

This document covers the server-side generation capability behind POST /api/projects/{id}/workflows/generate (FR-056–FR-061).

Components

PieceResponsibility
IWorkflowGeneratorThe seam: GenerateAsync(WorkflowGenerationRequest) → WorkflowGenerationResult. Returns a draft; never persists.
CopilotWorkflowGeneratorProduction implementation: builds the prompt, calls GitHub Copilot via IAgentRunner, validates, and runs one correction pass.
WorkflowDefinitionLoaderValidates the model output with the same schema/structural rules the runtime loader enforces.
WorkflowDefinitionEndpointsHosts the POST .../workflows/generate endpoint; resolves the project's cast roles and maps results/errors to HTTP.

All prompt construction, schema context, and LLM invocation live server-side (FR-057). The client only sends a description and renders the returned YAML.

Endpoint

POST /api/projects/{id}/workflows/generate
Body: { "description": "string" }
→ 200 { "yaml": string, "workflowId": string, "wasCorrected": bool }
→ 400 { "error": string }   // description missing, or generation failed after the correction pass
→ 404                       // project not found
→ 403                       // caller is not the project owner

The response YAML is a draft — identical content is returned to the MCP server and the Web UI (FR-059). The model provider is fixed to GitHub Copilot (Principle II).

Prompt design (FR-057)

The generation prompt is assembled in CopilotWorkflowGenerator.BuildPrompt and contains:

  1. Schema description — the top-level keys (id, name, description, version, trigger, start, nodes, edges) and their required-ness.
  2. Node-type vocabulary with runtime semanticsagent/prompt, peer_review/review, check, merge, scribe, serial, fan_out, fan_in, rai, terminal, each with a one-line description of what it does at runtime.
  3. Validation rules — required fields, edge/start node-reference integrity, check nodes needing branches: with a matching outgoing edge per verdict, and serial steps: referencing real nodes. These mirror WorkflowDefinitionLoader so the model is guided toward output that will validate.
  4. Available roles — the project's actual cast roles when a team exists, otherwise the full catalog (FR-061). Constraining the agent/role fields to castable roles keeps the generated workflow immediately runnable without role-not-found errors at build time.
  5. Few-shot examples — the library workflows, preferring the canonical software-delivery, bug-fix, and agent-evaluation patterns (read from packages/Agentweaver.Squad/Catalog/Resources/workflows/). These demonstrate correct structure, gate routing, and complete verdict branching.
  6. The user's description — fenced as untrusted data (<<<DESCRIPTION>>><<<END_DESCRIPTION>>>) with an instruction to treat it as data, never as instructions to follow (prompt-injection hardening).
  7. Output instruction — "Return ONLY valid YAML for a WorkflowDefinition. No markdown fences. No commentary."

Correction pass (FR-060)

The generator validates the model output against WorkflowDefinitionLoader (the same rules as the runtime). On the first failure it makes exactly one more model call:

<original prompt>

Your previous attempt produced YAML that FAILED validation. Fix it.

PREVIOUS YAML:
<failed yaml>

VALIDATION ERROR:
<the loader's file-scoped error message>

Fix the YAML and return only the corrected YAML.
  • If the corrected output validates → it is returned with wasCorrected = true.
  • If it is still invalid → the generator throws WorkflowGenerationException, which the endpoint maps to 400 { error } naming the unresolved problem rather than surfacing a broken draft. The mechanism never loops or retries indefinitely.

Output cleanup and id generation

  • Markdown fences — despite the "no fences" instruction, models sometimes wrap output in ```yaml. StripFences extracts the fenced content (or strips stray markers) before validation.
  • Missing id — if the model omits a top-level id: (or leaves it blank), the generator derives a kebab-case slug from the description (lowercased, non-alphanumerics collapsed to hyphens, max 40 chars) and injects it, so the draft always carries a stable id.

Blueprint-driven generation: process fit, FR-063 fallback, and bespoke roles

Workflow generation is also reachable indirectly through blueprint generation (BlueprintService.GenerateAsync, behind POST /api/blueprints/generate). The blueprint generator picks library workflows on process fit and falls back to generating a custom workflow only when nothing fits.

  • Library-first, process-fit matching (FR-062). CopilotBlueprintGenerator instructs the model to select library workflows only when the PROCESS they define matches what the team will actually do — never on name similarity or domain-word overlap. For operational/domain-specific work that matches no library workflow's process, the model returns an empty workflows array. An empty array is the correct answer when nothing fits — it is the sentinel for FR-063.
  • Generate-when-none-fits (FR-063). When the model returns no library match, BlueprintService invokes IWorkflowGenerator (the same generator documented above) to produce a custom workflow draft, returned in BlueprintGenerationResult. On ApplyAsync, that YAML is parsed, written to the project's .agentweaver/workflows/ directory, and the registry is synced so the new workflow is immediately coordinator-selectable.
  • Bespoke roles. A generated blueprint may roster roles that have no catalog match. Each such id must also appear in a bespoke_roles array, where every entry carries id, title, and an inline charter (2–4 sentences). BlueprintService.ValidateAsync enforces that every non-catalog roster id has a matching bespoke definition, that bespoke ids don't collide with catalog roles, and that each bespoke role is actually rostered. On apply, bespoke roles are materialized into the casting pipeline so the team is cast with the inline charters. Bespoke roles are a last resort — the generator prefers catalog roles, which ship with pre-built charters and are immediately runnable.

The matching process-fit selection at run time (which library workflow a coordinator picks per task) is documented separately in workflow-selection.md.

Testing

tests/Agentweaver.Tests/Workflows/WorkflowGeneratorTests.cs covers:

  • A valid model response → parsed workflow, wasCorrected = false.
  • Markdown-fenced valid output → cleaned and parsed.
  • An invalid response → correction pass triggered, corrected draft returned with wasCorrected = true.
  • Both passes invalid → WorkflowGenerationException.
  • Missing id → derived from the description slug.
  • The endpoint returns 200 with yaml + workflowId (driven through a stub IWorkflowGenerator), and 400 for a missing description.

Unit tests drive CopilotWorkflowGenerator with a scripted IAgentRunner so the prompt → validate → correction pipeline runs without the live model.