orchestrator

Purpose

orchestrator drives the platform's configurable case-processing workflow. When a case.created event arrives it looks up the active WorkflowDefinition for the org and product, creates a WorkflowInstance, and steps through it by dispatching module-specific events (consent.check.requested, ai_review.requested, human_review.requested, etc.) and advancing on the corresponding completion events. It owns workflow definition authoring, instance lifecycle management, and deadline scheduling via BullMQ.

Case-creation gate. clinical-api refuses POST /v1/clinical-api/cases with 422 missing_workflow_definition when no active WorkflowDefinition exists for the caller's (org, product). See runbooks/workflow-assignment.md for the operator playbook.
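The gate can be pictured as a small decision on top of the service-to-service check. This is a minimal sketch, not clinical-api's actual code: `activeCheckPath`, `decideCaseCreation`, and the `GateResult` shape are hypothetical names; only the endpoint path, the `{ hasActive, definitionId? }` response, and the 422 `missing_workflow_definition` code come from this page.

```typescript
// Response shape of GET /workflow-definitions/active, as described on this page.
interface ActiveCheck {
  hasActive: boolean;
  definitionId?: string;
}

// Hypothetical result type for the gate decision.
type GateResult =
  | { ok: true; definitionId: string }
  | { ok: false; status: 422; code: "missing_workflow_definition" };

// Builds the path and query string for the service-to-service check.
function activeCheckPath(orgId: string, productId: string): string {
  const query = `org_id=${encodeURIComponent(orgId)}&product_id=${encodeURIComponent(productId)}`;
  return `/workflow-definitions/active?${query}`;
}

// Turns the check response into an allow/refuse decision for case creation.
function decideCaseCreation(check: ActiveCheck): GateResult {
  if (!check.hasActive || check.definitionId === undefined) {
    return { ok: false, status: 422, code: "missing_workflow_definition" };
  }
  return { ok: true, definitionId: check.definitionId };
}
```

Keeping the decision separate from the HTTP call makes the refusal path easy to unit test without a running orchestrator.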

Key endpoints

  • POST /workflow-definitions - create a workflow definition (draft)
  • GET /workflow-definitions - list definitions
  • GET /workflow-definitions/:id - fetch a definition
  • PATCH /workflow-definitions/:id - update a draft definition
  • POST /workflow-definitions/:id/publish - publish a draft (makes it active for new instances)
  • POST /workflow-definitions/:id/archive - archive an active definition
  • POST /workflow-definitions/:id/clone - clone a definition to a new draft
  • POST /workflow-definitions/:id/validate - validate a definition without publishing
  • GET /workflow-definitions/active?org_id=…&product_id=… - service-to-service check used by clinical-api before creating a case (returns { hasActive, definitionId? }); guarded by a SERVICE_AUTH_TOKEN bearer token
  • GET /workflow-definitions/resolve?product_id=…[&org_id=…] - consumer-facing variant. Returns the full active WorkflowDefinition row for the (org, product) pair using the org-specific → product-wide-default fallback, or 404. Tenant clients have org_id pinned from their JWT; cross-tenant admins must pass it. Scope: orchestrator:read.

POST /workflow-definitions and PATCH /workflow-definitions/:id both accept an isDefault: boolean flag. When true, the workflow's orgId is forced to null, which makes it the product-wide default: it applies to every org with that product until the org authors its own override. Only cross-tenant admins may set the flag, and PATCH is refused with 409 if another active default already exists for the product. The fallback resolver findActiveFor(orgId, productId) prefers an org-specific active row, then falls back to the row with orgId IS NULL for the same product.
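The fallback order can be illustrated with an in-memory stand-in for the resolver. This is a sketch under assumptions: the real findActiveFor queries the database, while `findActiveForInMemory`, the `DefinitionRow` shape, and the status values shown here are hypothetical.

```typescript
// Hypothetical, trimmed-down view of a WorkflowDefinition row.
interface DefinitionRow {
  id: string;
  orgId: string | null; // null marks a product-wide default
  productId: string;
  status: "draft" | "active" | "archived"; // assumed status values
}

// Prefers the org-specific active row, then the product-wide default.
function findActiveForInMemory(
  rows: DefinitionRow[],
  orgId: string,
  productId: string,
): DefinitionRow | undefined {
  const active = rows.filter((r) => r.status === "active" && r.productId === productId);
  const orgSpecific = active.find((r) => r.orgId === orgId);
  if (orgSpecific !== undefined) return orgSpecific;
  return active.find((r) => r.orgId === null);
}
```

An org that has only a draft override still resolves to the product-wide default, because drafts are filtered out before the org-specific lookup.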

PATCH /workflow-definitions/:id also accepts optional orgId + productId for reassigning a draft workflow to a different slot. Cross-tenant admins can target any org; tenant clients are pinned to their own. Reassignment renumbers version to the next free value under the new (orgId, productId) so the unique constraint doesn't bite. The :id segment is constrained to UUID-shape ([0-9a-fA-F-]{36}) so static siblings like /active and /resolve aren't swallowed.

  • GET /workflow-instances - list instances (filterable by case / org / status)
  • GET /workflow-instances/:id - fetch an instance
  • GET /workflow-instances/:id/steps - list steps for an instance
  • GET /workflow-instances/:id/events - list events recorded for an instance
  • GET /workflow-instances/:id/next-step - inspect the pending next step
  • POST /workflow-instances/:id/retry-step - retry the current failed step
  • POST /workflow-instances/:id/cancel - cancel a running instance
  • POST /workflow-instances/:id/supersede - replace an instance with a new one
  • POST /workflow-instances/:id/halt - pause an instance (manual intervention)
  • POST /workflow-instances/:id/resume - resume a halted instance
  • GET /reason-codes - list operator-defined cancellation / halt reason codes
  • POST /reason-codes - create a reason code
  • GET /workflow-templates - list workflow definition templates
  • GET /workflow-templates/:id - fetch a template
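The UUID-shape constraint on the :id segment of /workflow-definitions/:id can be checked in isolation. The sketch below anchors the documented character class with ^ and $; the constant and function names are hypothetical, and how the pattern is wired into the router is not shown here.

```typescript
// The character class and length come from the constraint described above.
const UUID_SHAPE = /^[0-9a-fA-F-]{36}$/;

// True when a path segment may be treated as a definition id.
function looksLikeDefinitionId(segment: string): boolean {
  return UUID_SHAPE.test(segment);
}
```

The pattern is a shape check rather than strict UUID validation (it does not verify hyphen positions), which is enough to keep static segments such as active and resolve from matching.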

Question sets

Question sets are versioned, publishable, per-(org, product) reusable question banks referenced by collect_questions steps. Like workflows, they support a default mode (orgId IS NULL) so a single set can apply to every org assigned to a product.

Localized content (question sets + workflow definitions)

The platform UI and APIs themselves stay English; configurable content can be authored in multiple languages. Translatable surfaces:

Question sets

  • QuestionSet.name
  • QuestionSet.description
  • per-question prompt
  • single_choice / multi_choice option label

Workflow definitions

  • WorkflowDefinition.name
  • WorkflowDefinition.description

Wire shape — LocalizedString. Each translatable field accepts either a plain string (treated as English) or:

{
  "default": "Is the lesion visible?",
  "translations": { "fr": "La lésion est-elle visible ?", "de": "Ist die Läsion sichtbar?" }
}

The top-level name / description columns stay English-only on the row; their translations live on sidecar JSON columns nameTranslations / descriptionTranslations (same key shape — { "fr": "...", ... }). Per-question prompt and option label carry the LocalizedString inline inside the questions JSON column.

Supported locales (hardcoded platform-wide). en, fr, de, es, pt-BR, it, nl, et. Any locale outside this list is rejected at the API boundary with 400 — both on write (translation keys) and on read (?lang= flag). The list is defined in packages/common/src/i18n.ts and consumed by every service that needs i18n; add new locales by editing that file (plus the mirror in apps/admin-ui/src/api/localized-string.ts so the editor's per-locale rows render correctly).
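A boundary check against the hardcoded list might look like the sketch below. The locale values are the ones listed above; the identifiers (`SUPPORTED_LOCALES`, `isSupportedLocale`, `unsupportedTranslationKeys`) and the exact-match, case-sensitive comparison are assumptions, not the contents of packages/common/src/i18n.ts.

```typescript
// Locale list as documented above; names below are illustrative only.
const SUPPORTED_LOCALES = ["en", "fr", "de", "es", "pt-BR", "it", "nl", "et"] as const;
type SupportedLocale = (typeof SUPPORTED_LOCALES)[number];

// Read side: validates a ?lang= value (assumed to be case-sensitive).
function isSupportedLocale(value: string): value is SupportedLocale {
  return (SUPPORTED_LOCALES as readonly string[]).indexOf(value) !== -1;
}

// Write side: returns the translation keys that would trigger a 400.
function unsupportedTranslationKeys(translations: Record<string, string>): string[] {
  return Object.keys(translations).filter((key) => !isSupportedLocale(key));
}
```

Returning the offending keys rather than a boolean lets the caller include them in the 400 error message.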

Read API — ?lang=<locale> flag. Every read endpoint that surfaces translatable content accepts an optional lang query param:

  • Question sets: GET /question-sets, GET /question-sets/:id
  • Workflow definitions: GET /workflow-definitions, GET /workflow-definitions/:id, GET /workflow-definitions/resolve
  • Parked steps: GET /workflow-instances/:id/current-step (resolves question content when parked on collect_questions)
  • Products (clinical-api): GET /v1/clinical-api/admin/products, GET /v1/clinical-api/admin/products/:id (see clinical-api.md)

The flag behaves as follows:

  • omitted: returns the full LocalizedString objects so the admin UI editor can render every variant.
  • ?lang=fr: every LocalizedString-shaped node in the response is flattened to its locale variant. Missing translations fall back to default. name and description resolve via their sidecar maps the same way.
  • unsupported locale: 400 with the supported list in the error message; consumers never silently receive English when they asked for an unsupported locale.
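Flattening every LocalizedString-shaped node can be sketched as a recursive walk over the response JSON. This is an illustration only: `flattenLocalized` and the detection heuristic (any object with a string `default` key) are assumptions, and the sidecar nameTranslations / descriptionTranslations handling is not covered.

```typescript
interface LocalizedObject {
  default: string;
  translations?: Record<string, string>;
}

// Heuristic (an assumption): an object with a string `default` is a LocalizedString.
function isLocalizedObject(node: unknown): node is LocalizedObject {
  if (typeof node !== "object" || node === null || Array.isArray(node)) return false;
  const candidate = node as Record<string, unknown>;
  const translations = candidate["translations"];
  return (
    typeof candidate["default"] === "string" &&
    (translations === undefined || (typeof translations === "object" && translations !== null))
  );
}

// Replaces each LocalizedString-shaped node with its variant for `lang`,
// falling back to `default` when the translation is missing.
function flattenLocalized(node: unknown, lang: string): unknown {
  if (isLocalizedObject(node)) {
    return node.translations?.[lang] ?? node.default;
  }
  if (Array.isArray(node)) {
    return node.map((child) => flattenLocalized(child, lang));
  }
  if (typeof node === "object" && node !== null) {
    const source = node as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = flattenLocalized(source[key], lang);
    }
    return out;
  }
  return node; // plain strings, numbers, booleans, null pass through
}
```

Plain strings pass through unchanged, so responses that mix legacy and localized fields flatten to a uniform shape.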

Workflow snapshots preserve all locales. When the orchestrator parks an instance on a collect_questions step it snapshots the question set's questions JSON verbatim — LocalizedString objects intact. That means a parked instance can be rendered in any supported locale later; the choice is made at read time via ?lang=, not at park time.

Back-compat. Existing question sets that store plain strings continue to work — the schema accepts both shapes, and resolve(value, lang) treats a plain string as the default.
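The dual-shape acceptance can be expressed as a union type. The sketch below is one plausible reading of resolve(value, lang), named `resolveSketch` to make clear it is not the platform's implementation; the type alias is likewise an assumption based on the wire shape above.

```typescript
// Either a legacy plain string or the structured wire shape.
type LocalizedString =
  | string
  | { default: string; translations?: Record<string, string> };

// A plain string is treated as the default; otherwise prefer the
// requested locale and fall back to `default`.
function resolveSketch(value: LocalizedString, lang: string): string {
  if (typeof value === "string") return value;
  return value.translations?.[lang] ?? value.default;
}
```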

Image collection step shape

collect_images accepts configurations: ('derm_only' | 'derm_plus_macro' | 'macro_only')[] plus two optional count blocks:

  • dermoscopic_count?: { min: ≥1, max: ≥min, optional?: boolean }
  • macroscopic_count?: { min: ≥2, max: ≥min, optional?: boolean }

optional: true means the client may submit zero of that type; if any are provided the count must fall in [min, max]. The orchestrator validator enforces the cross-field bounds.
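The two rules above (definition-time bounds and submission-time counts) can be sketched as follows. The function names are hypothetical, and the integer requirement is an assumption; the minimum floors of 1 (dermoscopic) and 2 (macroscopic) are taken from the shapes listed above.

```typescript
interface CountBlock {
  min: number;
  max: number;
  optional?: boolean;
}

// Definition-time check: min must reach the floor for its image type,
// and max must not be below min. Returns a list of error messages.
function countBlockErrors(name: string, block: CountBlock, minFloor: number): string[] {
  const errors: string[] = [];
  if (!Number.isInteger(block.min) || block.min < minFloor) {
    errors.push(`${name}.min must be an integer >= ${minFloor}`);
  }
  if (!Number.isInteger(block.max) || block.max < block.min) {
    errors.push(`${name}.max must be an integer >= min`);
  }
  return errors;
}

// Submission-time check: zero is allowed only when optional is true;
// any other count must fall within [min, max].
function countAccepted(block: CountBlock, submitted: number): boolean {
  if (submitted === 0 && block.optional === true) return true;
  return submitted >= block.min && submitted <= block.max;
}
```

For example, with macroscopic_count set to { min: 2, max: 4, optional: true }, a submission of zero or three macroscopic images is accepted, while a submission of one is rejected.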

Database tables

  • WorkflowDefinition - versioned workflow definition (step graph, mode, timeout)
  • WorkflowDefinitionRevision - revision history for definition edits
  • WorkflowInstance - running instance for a specific case, snapshotting the definition at creation
  • WorkflowStep - individual step records within an instance (status, correlation ID, context snapshot)
  • WorkflowEvent - event inbox for all consumed events (used for idempotency and audit)
  • WorkflowIntervention - manual halt / resume records
  • ReasonCode - operator-defined codes attached to cancellations and interventions
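The idempotency role of the WorkflowEvent inbox can be illustrated with an in-memory stand-in. This is a sketch under assumptions: the real table presumably relies on a database uniqueness guarantee, and the `EventInboxSketch` class and its `eventId` key are hypothetical.

```typescript
// In-memory stand-in for an event inbox keyed by a unique event id.
class EventInboxSketch {
  private readonly seen = new Set<string>();

  // Returns true the first time an event id is recorded, false on redelivery.
  record(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }
}

// A consumer would only act on events the inbox has not seen before.
function handleOnce(inbox: EventInboxSketch, eventId: string, handler: () => void): void {
  if (inbox.record(eventId)) handler();
}
```

Redelivered events are dropped before the handler runs, so a workflow instance is not advanced twice by the same completion event.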

Events

  • case.created (consume) - triggers workflow instantiation for the case
  • consent.check.requested (emit) - dispatched for gate_on_consent steps
  • consent.check.completed (consume) - advances the instance on consent check resolution
  • ai_review.requested (emit) - dispatched for run_ai_review steps
  • ai_review.completed (consume) - advances the instance when AI review completes
  • ai_review.failed (consume) - marks step failed; triggers on_failure transition
  • human_review.requested (emit) - dispatched for request_human_review steps
  • human_review.completed (consume) - advances the instance when a human review is submitted
  • human_review.failed (consume) - marks step failed; triggers on_failure transition
  • case.workflow.completed (emit) - published when a terminal emit_final step is reached
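The request/completion pairing in the event list can be written as a lookup table. The step types and event names are the ones listed above; the table layout, the `outcomeFor` function, and the "ignore" outcome for unrelated events are assumptions made for illustration.

```typescript
type DispatchingStep = "gate_on_consent" | "run_ai_review" | "request_human_review";

interface StepEvents {
  emit: string;      // event dispatched when the step starts
  completed: string; // event that advances the instance
  failed?: string;   // event that triggers the on_failure transition, if any
}

// Pairings taken from the event list; no consent failure event is documented.
const STEP_EVENTS: Record<DispatchingStep, StepEvents> = {
  gate_on_consent: { emit: "consent.check.requested", completed: "consent.check.completed" },
  run_ai_review: { emit: "ai_review.requested", completed: "ai_review.completed", failed: "ai_review.failed" },
  request_human_review: {
    emit: "human_review.requested",
    completed: "human_review.completed",
    failed: "human_review.failed",
  },
};

type Outcome = "advance" | "on_failure" | "ignore";

// Decides what an incoming event means for an instance parked on `step`.
function outcomeFor(step: DispatchingStep, incomingEvent: string): Outcome {
  const events = STEP_EVENTS[step];
  if (incomingEvent === events.completed) return "advance";
  if (events.failed !== undefined && incomingEvent === events.failed) return "on_failure";
  return "ignore";
}
```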

Dependencies

  • Redis — event transport via @sa-platform/events Redis Streams (all consumed and emitted events)
  • BullMQ — deadline / timeout scheduling for workflow instances
  • MySQL — primary store, accessed via Prisma 7 driver-adapter pattern
  • auth — JWT verification via @sa-platform/auth-client
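For the BullMQ dependency, deadline scheduling typically comes down to a delayed job. The sketch below only computes the arguments one might pass to BullMQ's Queue.add(name, data, opts), using its delay (milliseconds) and jobId options; the job name, payload shape, and jobId format are hypothetical and not taken from the orchestrator source.

```typescript
// Arguments for a hypothetical deadline job, shaped for Queue.add(name, data, opts).
interface DeadlineJobArgs {
  name: string;
  data: { instanceId: string };
  opts: { delay: number; jobId: string };
}

// Computes the delay until the deadline, clamped at zero for past deadlines.
function deadlineJobArgs(instanceId: string, deadline: Date, now: Date): DeadlineJobArgs {
  const delay = Math.max(0, deadline.getTime() - now.getTime());
  return {
    name: "workflow-deadline", // hypothetical job name
    data: { instanceId },
    // A stable id per instance; hyphen-separated because BullMQ reserves ":" in custom ids.
    opts: { delay, jobId: `deadline-${instanceId}` },
  };
}
```

Passing `now` explicitly keeps the calculation deterministic in tests; a caller would hand the result to a Queue connected to Redis.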

Where to learn more

  • Design spec
  • Source: services/orchestrator/ (in this repo)