AI Review Service: Design Spec

Date: 2026-04-23
Status: Draft, pending review
Build order position: Follows the core platform services (auth → user-management → consent → notifications); first of the "everything else" tier.
Related:
docs/superpowers/specs/2026-04-19-sa-platform-design.md
docs/superpowers/specs/2026-04-15-clinical-data-model-design.md
docs/superpowers/specs/2026-04-22-notifications-phi-at-rest-design.md
docs/superpowers/specs/2026-04-23-kms-key-provider-design.md

1. Scope, Goals, Non-Goals

What this service is

services/ai-review/ — a standalone NestJS service that orchestrates AI inference against Skin Analytics' internal DERM 5.0.0 API on behalf of clinical-api. Owns the orchestration job lifecycle, the raw DERM response (PHI-encrypted, retained for audit), and per-lesion projections for querying and event fan-out. Writes nothing directly to clinical-api; results reach the clinical data model via domain events.

v1 shape: thin, product-agnostic orchestrator

Thin: one inference backend (DERM), one orchestration flow, minimum viable feature surface. Multi-model routing, confidence-based escalation to human-review, per-product credentials, and rate limiting are all explicit non-goals for v1.
Product-agnostic: no product-specific code paths. productId is a request parameter; products are distinguished only by the fields they carry (consent type codes, image processing policy) already owned by clinical-api.

What it does

Subscribes to ai_review.requested events emitted by clinical-api.
Bridges S3-stored images to DERM's multipart upload via short-lived presigned URLs.
Orchestrates DERM's multi-step stateful API (login → case → image uploads → process → polled result fetch) as a small BullMQ-driven state machine.
Persists the raw DERM response (encrypted at rest with the service-wide DEK) as the compliance source-of-truth and denormalises per-lesion data for queryable projections.
Emits ai_review.started, ai_review.completed, ai_review.failed, ai_review.superseded domain events.
Exposes a small REST surface for admin retry, cancel, supersede, and audit-read of raw DERM responses.
Re-verifies consent with the consent service before any DERM call (defense in depth).

What it deliberately does not do in v1

No multi-model / multi-provider inference routing. One DERM configuration per service instance.
No confidence-based routing to the human-review service (which is not yet built).
No admin UI; REST endpoints exist, UI is a product-side follow-up.
No rate limiting / token bucket on DERM calls — rely on BullMQ worker concurrency.
No DERM-side webhook ingestion — DERM does not offer one; polling only.
No priority queueing.
No per-product DERM credentials.
No historical backfill CLI — ad-hoc via the REST POST /v1/ai-reviews endpoint until a standing need emerges.
No clinical-api migration to KmsKeyProvider — separate follow-up with its own per-request DEK cache design.

2. Architecture Overview

Service boundary

Concern	Owner
DERM API session, credentials, request/response	ai-review
Raw DERM response (PHI-encrypted), retained for audit	ai-review (source of truth)
`AiReview` job state, lesion projections, job history	ai-review
`skin_finding` + `diagnosis` rows (clinical projection)	clinical-api, written from the `ai_review.completed` event
Image bytes in S3 + presigned URL minting	clinical-api
`ai_analysis` consent check	clinical-api gates at event emission; ai-review re-verifies
Admin UI for retry / raw-response viewing	Out of scope v1 (REST surface exists)

Happy-path flow

Clinical-api closes a case. Based on product config (required_consent_type_codes, image_processing_policy_json) and a consent check, decides whether AI review is warranted.
If warranted, clinical-api mints short-lived presigned S3 GET URLs for each image in the case and emits ai_review.requested with the URLs in the payload.
ai-review's event consumer enqueues an ai_review.start BullMQ job keyed on (caseId, productId). If an active AiReview already exists for that pair, the handler no-ops (idempotent).
Worker picks up ai_review.start:
  Resolves a DERM session token (cached in Redis; re-logs in on miss or 401).
  Creates a DERM case (POST /api5/case).
  For each image: streams from the presigned S3 URL into a multipart POST /api5/image request to DERM.
  Calls POST /api5/case/{id}/process to kick off inference; stores the returned checksum.
  Re-enqueues itself as ai_review.poll_result with an initial delay, so the worker is not held waiting.
Worker picks up ai_review.poll_result:
  Calls GET /api5/case/{id}/result?checksum=....
  If analysisStatus === 'STATUS_PROCESSED': encrypts and persists the raw response, writes per-lesion projection rows, transitions to completed, emits ai_review.completed with a rich per-lesion payload.
  Otherwise: increments attempt count, re-enqueues with the next backoff delay; fails with derm_result_timeout once the configured attempts/wall-time budget is exhausted.
Clinical-api consumes ai_review.completed, creates one skin_finding per lesion and one diagnosis per finding with source='ai' and assessedByDermVersion recorded in actor_snapshot.

Key rationale

Event-driven trigger (vs direct REST) matches the platform spec's "case closed triggers AI review" intent and keeps clinical-api unaware of ai-review's URL on the hot path. REST POST /v1/ai-reviews exists for admin retry, backfill, and explicit programmatic triggers.
Separate ai_review.requested event (rather than direct case.closed subscription) keeps ai-review ignorant of case-closure semantics; clinical-api owns the "should AI review run?" decision.
ai-review owns the raw DERM blob under the "no shared DBs" principle; clinical-api receives a distilled per-lesion payload via event and never sees the full blob. Admin audit of the raw response is served via GET /v1/ai-reviews/:id/raw behind a restricted scope.
Presigned S3 URLs for image bridging keep clinical-api in control of the PHI access boundary (mints, audit-logs, can enforce consent at mint time) and avoid cross-service IAM sharing or proxying large image bodies through clinical-api.
Single BullMQ job with delayed re-enqueue for polling is the simplest execution model that handles DERM's process → result wait without holding a worker on a setTimeout. Same mechanism notifications already uses for the re-enqueue sweep.

3. DERM API Contract (as integrated)

DERM 5.0.0 is a session-based HTTP API. Reference: postman/DERM 5.0.0 API DEMO.postman_collection.json.

Session

Call	Shape	Returns
`POST /api5/login`	form-urlencoded: `login`, `password`	auth token in `authorization` header
`POST /api5/logout`	no body; `authorization` header	—

Session tokens are cached in Redis with a configurable TTL (default 30 min). On a 401 from any other call, the cached token is invalidated, one re-login is attempted, and the call is retried once.
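
The retry-once rule above can be sketched as a small wrapper. This is a minimal sketch under stated assumptions: the cache is an in-memory stand-in for Redis (TTL handling omitted), and `withDermSession`, `TokenCache`, `InMemoryTokenCache`, and the `login` / `call` callbacks are hypothetical names, not the service's actual API.

```typescript
// Hypothetical in-memory stand-in for the Redis-backed session cache (no TTL here).
interface TokenCache {
  get(): string | undefined;
  set(token: string): void;
  invalidate(): void;
}

interface DermResponse<T> {
  status: number;
  body?: T;
}

class InMemoryTokenCache implements TokenCache {
  private token: string | undefined;
  get(): string | undefined { return this.token; }
  set(token: string): void { this.token = token; }
  invalidate(): void { this.token = undefined; }
}

// Runs `call` with the cached token. On a 401 it invalidates the cache,
// logs in once more, and retries the call exactly once.
async function withDermSession<T>(
  cache: TokenCache,
  login: () => Promise<string>,
  call: (token: string) => Promise<DermResponse<T>>,
): Promise<DermResponse<T>> {
  let token = cache.get();
  if (token === undefined) {
    token = await login();
    cache.set(token);
  }
  const first = await call(token);
  if (first.status !== 401) return first;

  cache.invalidate();
  const fresh = await login();
  cache.set(fresh);
  // A second 401 is returned as-is; the caller treats it as a hard auth failure.
  return call(fresh);
}
```

Passing `login` and `call` as callbacks keeps the retry rule testable without a live DERM endpoint or Redis instance.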

Orchestration calls

Call	Request	Response fields used
`POST /api5/case`	form-urlencoded: `requestingSystem` (from `DERM_REQUESTING_SYSTEM` config, e.g. `Ozone`)	`caseId` (int)
`POST /api5/image` (repeated per image)	multipart: `caseId`, `image` (binary, streamed), `imageType` ∈ `DERMOSCOPIC` | `MACROSCOPIC`, optional `deviceManufacturer`, `deviceModel`, `correlationId`	`imageId` (int), `hash`
`POST /api5/case/{caseId}/process`	no body	`checksum` (string)
`GET /api5/case/{caseId}/result?checksum=…`	no body	full result payload (see "Result payload (shape summary)" below)

Result payload (shape summary)

Top-level: case metadata + version/regulatory fields (assessedByDermVersion, dermProductNumber, interfaceNumber, udi, termsOfUse, privacyPolicy, instructionsForUse, medicalLabellingGraphic, informationClassification), analysisStatus, analysisResult, analysisResultStatement, processingType, lesionCount, reportDate, sensitivitySet.

Per image (images[]): imageId, imageHash, imageType, imageQuality (e.g. IMAGE_QUALITY_SUITABLE).

Per lesion (lesions[]): lesionNumber, lesionId, coordinates ({x1,y1,x2,y2} bounding box), classification (DERM-native code such as SKIN_LESION_MELANOMA), category (Malignant/Benign/…), priority, suspectedDiagnosis (free text), suspectedDiagnosisICD10Code/Term, suspectedDiagnosisSnomedCTCode/Term, referralFlag, referralRecommendation, referralRecommendationGuidance, interpretationGuidance.

settings object (model metrics snapshot): priorities map, sensitivity map per classification. Persisted in the raw blob; not individually projected.

SNOMED already provided. DERM returns SNOMED CT codes directly on each lesion. For v1 we pass them through as diagnosis.code_system='SNOMED-CT' without going through diagnosis_code_mapping. The mapping table remains useful for future non-SNOMED models.
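
A minimal sketch of the pass-through, assuming hypothetical `code` and `display` field names on the diagnosis side; only the SNOMED lesion fields and the `'SNOMED-CT'` code-system value come from this spec.

```typescript
// Hypothetical shapes: the lesion field names and 'SNOMED-CT' are from the spec;
// `code` and `display` are assumed names for the diagnosis code columns.
interface LesionCodes {
  suspectedDiagnosisSnomedCode: string | null;
  suspectedDiagnosisSnomedTerm: string | null;
}

interface DiagnosisCode {
  code_system: 'SNOMED-CT';
  code: string;
  display: string | null;
}

// Passes DERM's SNOMED CT code straight through without consulting a mapping table.
// Returns null when DERM supplied no code, leaving the caller to decide what to record.
function toDiagnosisCode(lesion: LesionCodes): DiagnosisCode | null {
  if (!lesion.suspectedDiagnosisSnomedCode) return null;
  return {
    code_system: 'SNOMED-CT',
    code: lesion.suspectedDiagnosisSnomedCode,
    display: lesion.suspectedDiagnosisSnomedTerm,
  };
}
```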

4. Data Model

All tables in a new ai-review Prisma schema (MySQL 8, same conventions as other services: UUID v7 IDs, tenancy via orgId, soft-tenanted by auth-client middleware).

`AiReview`: orchestration job

Field	Type	Notes
`id`	UUID v7 PK
`caseId`	UUID	External ref to clinical-api
`productId`	UUID	External ref to clinical-api
`orgId`	UUID	Tenancy scope
`status`	enum	`pending` | `in_progress` | `awaiting_result` | `completed` | `failed` | `cancelled` | `superseded`
`attemptCount`	int	Poll-attempt counter
`correlationId`	string nullable	Threaded from the triggering event
`dermCaseId`	int nullable	DERM's case identifier
`dermChecksum`	string nullable	Returned from `/process`, required for `/result`
`dermVersion`	string nullable	`assessedByDermVersion` captured on success for querying
`supersededById`	UUID nullable	Points to the new review that replaced this one
`supersededAt`	datetime nullable
`error`	JSON nullable	Structured error (code, message, classification, upstreamStatus)
`errorClassification`	string nullable	One of the taxonomy values in §6
`createdAt`	datetime
`updatedAt`	datetime
`completedAt`	datetime nullable	Set on `completed` / `failed`

Indexes: (orgId, status), (caseId, productId, status). The "at most one non-superseded row per (caseId, productId)" invariant is application-enforced inside a transaction (SELECT … FOR UPDATE) rather than a DB unique constraint, because supersede means multiple rows coexist over time. Completed and failed rows count as holding the slot until explicitly superseded.
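
The `status` values and the slot-holding rule can be sketched as a transition table. This is an illustration inferred from the flows in this spec, not a confirmed design: `ALLOWED_TRANSITIONS`, `canTransition`, `holdsSlot`, and `isCancellable` are hypothetical names, and allowing `superseded` from every other status is an assumption based on the "non-superseded rows hold the slot" rule.

```typescript
type AiReviewStatus =
  | 'pending' | 'in_progress' | 'awaiting_result'
  | 'completed' | 'failed' | 'cancelled' | 'superseded';

// Assumed transition table, inferred from the orchestration, cancel, and supersede flows.
const ALLOWED_TRANSITIONS: Record<AiReviewStatus, readonly AiReviewStatus[]> = {
  pending: ['in_progress', 'failed', 'cancelled', 'superseded'],
  in_progress: ['awaiting_result', 'failed', 'cancelled', 'superseded'],
  awaiting_result: ['completed', 'failed', 'cancelled', 'superseded'],
  completed: ['superseded'],
  failed: ['superseded'],
  cancelled: ['superseded'],
  superseded: [],
};

function canTransition(from: AiReviewStatus, to: AiReviewStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Every non-superseded row holds the (caseId, productId) slot.
function holdsSlot(status: AiReviewStatus): boolean {
  return status !== 'superseded';
}

// Cancel is permitted only on the three non-terminal statuses.
function isCancellable(status: AiReviewStatus): boolean {
  return status === 'pending' || status === 'in_progress' || status === 'awaiting_result';
}
```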

`AiReviewImage`: per-image bridging record

Field	Type	Notes
`id`	UUID v7 PK
`aiReviewId`	UUID FK	`AiReview.id`
`clinicalImageId`	UUID	Clinical-api's image identifier
`imageType`	string	`DERMOSCOPIC` | `MACROSCOPIC`
`dermImageId`	int nullable	Populated after successful upload
`dermImageHash`	string nullable	Returned by DERM, retained for traceability
`imageQuality`	string nullable	Populated from `/result`
`createdAt`	datetime

Index: (aiReviewId).

`AiReviewResult`: one per successful review; audit source-of-truth

Field	Type	Notes
`id`	UUID v7 PK
`aiReviewId`	UUID FK unique	`AiReview.id`
`analysisResult`	string	DERM `analysisResult` enum
`analysisStatus`	string	DERM `analysisStatus` enum
`lesionCount`	int
`processingType`	string	`DERMO_ONLY` | `DERMO_MACRO` | future values
`sensitivitySet`	string nullable	e.g. `"B"`
`reportDate`	datetime	From DERM; the payload carries `reportDate` as Unix milliseconds, converted to a datetime on write
`rawCiphertext`	bytes	Full DERM response JSON, AES-256-GCM encrypted with service-wide DEK
`rawIv`	bytes(12)	GCM IV
`rawAuthTag`	bytes(16)	GCM auth tag
`createdAt`	datetime
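
A minimal sketch of how the three `raw*` columns relate, using Node's built-in `crypto` module. The function names and the `EncryptedBlob` shape are hypothetical; in the service the DEK would come unwrapped from the configured key provider, which is not shown here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical shape mirroring the rawCiphertext / rawIv / rawAuthTag columns.
interface EncryptedBlob {
  rawCiphertext: Buffer;
  rawIv: Buffer;      // 12 bytes
  rawAuthTag: Buffer; // 16 bytes
}

// Encrypts the full DERM response JSON under a 32-byte DEK with AES-256-GCM.
function encryptRawResponse(dek: Buffer, rawJson: string): EncryptedBlob {
  const rawIv = randomBytes(12); // fresh IV per record; never reuse an IV with the same key
  const cipher = createCipheriv('aes-256-gcm', dek, rawIv);
  const rawCiphertext = Buffer.concat([cipher.update(rawJson, 'utf8'), cipher.final()]);
  return { rawCiphertext, rawIv, rawAuthTag: cipher.getAuthTag() };
}

// Decrypts and authenticates; throws if the ciphertext, IV, or tag was altered.
function decryptRawResponse(dek: Buffer, blob: EncryptedBlob): string {
  const decipher = createDecipheriv('aes-256-gcm', dek, blob.rawIv);
  decipher.setAuthTag(blob.rawAuthTag);
  return Buffer.concat([decipher.update(blob.rawCiphertext), decipher.final()]).toString('utf8');
}
```

Storing the IV and auth tag in their own columns, as the table does, means decryption needs no parsing of a packed blob format.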

`AiReviewLesion`: per-lesion projection

Field	Type	Notes
`id`	UUID v7 PK
`aiReviewId`	UUID FK
`lesionNumber`	int	1-indexed, from DERM
`dermLesionId`	int	DERM's identifier
`coordinates`	JSON	`{x1,y1,x2,y2}`
`classification`	string	DERM code, e.g. `SKIN_LESION_MELANOMA`
`category`	string	`Malignant` | `Benign` | …
`priority`	int	DERM priority value
`suspectedDiagnosisIcd10Code`	string nullable	e.g. `C43.9`
`suspectedDiagnosisIcd10Term`	string nullable
`suspectedDiagnosisSnomedCode`	string nullable	e.g. `93655004`
`suspectedDiagnosisSnomedTerm`	string nullable
`referralFlag`	string nullable	e.g. `Urgent Refer`
`referralRecommendation`	string nullable
`referralRecommendationGuidance`	string nullable
`interpretationGuidance`	text nullable	Long template text
`createdAt`	datetime

Index: (aiReviewId).

PHI classification

AiReviewResult.rawCiphertext is the only field encrypted at rest. Under GDPR the full response is personal data (linked to a patient via caseId).
Lesion projection fields are either DERM-native codes, clinical standard codes (ICD-10/SNOMED), or template text keyed on classification. None is patient-specific. They remain plaintext, matching clinical-api's own convention for diagnosis.code_* fields.
AiReview.error is plaintext JSON. If DERM ever echoed patient-identifying data in an error response we would move this to a ciphertext field — no evidence in the sampled API that it does.

DERM session cache

Redis key, not a DB table: ai-review:derm:session:<sha256(username)> → { token, expiresAt }. TTL from DERM_SESSION_TTL_SECONDS. Ephemeral state that does not belong in the durable store.
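
A short sketch of the key derivation, assuming the digest is hex-encoded (the encoding is not stated in this spec). `dermSessionKey` is a hypothetical helper name.

```typescript
import { createHash } from 'node:crypto';

// Builds the Redis key for a DERM session. Hashing keeps the username out of key listings.
// Hex encoding of the SHA-256 digest is an assumption.
function dermSessionKey(username: string): string {
  const digest = createHash('sha256').update(username, 'utf8').digest('hex');
  return `ai-review:derm:session:${digest}`;
}
```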

5. Public Contract

Domain events

All events flow through EventModule from @sa-platform/common (Redis pub/sub today, EventBridge phase-2 platform-wide).

Consumed:

ai_review.requested — emitted by clinical-api when a case closure warrants AI review.

{
  caseId: UUID,
  productId: UUID,
  orgId: UUID,
  correlationId: string,
  requestedAt: ISO8601,
  images: [
    {
      imageId: UUID,
      imageType: "DERMOSCOPIC" | "MACROSCOPIC",
      presignedUrl: string,
      presignedUrlExpiresAt: ISO8601,
      deviceManufacturer?: string,
      deviceModel?: string
    }
  ]
}
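
The payload above can be expressed as TypeScript types. The expiry pre-check that follows is an assumption for illustration: this spec only states that an expired URL ends in `image_fetch_failed`, not that the consumer checks expiry up front. `expiredImageIds` and its `marginMs` parameter are hypothetical.

```typescript
interface RequestedImage {
  imageId: string;
  imageType: 'DERMOSCOPIC' | 'MACROSCOPIC';
  presignedUrl: string;
  presignedUrlExpiresAt: string; // ISO 8601
  deviceManufacturer?: string;
  deviceModel?: string;
}

interface AiReviewRequested {
  caseId: string;
  productId: string;
  orgId: string;
  correlationId: string;
  requestedAt: string; // ISO 8601
  images: RequestedImage[];
}

// Returns the ids of images whose presigned URL has expired, or will expire
// within `marginMs`, relative to `now`. An unparseable timestamp counts as expired.
function expiredImageIds(event: AiReviewRequested, now: Date, marginMs = 0): string[] {
  const cutoff = now.getTime() + marginMs;
  return event.images
    .filter((img) => {
      const expiresAt = Date.parse(img.presignedUrlExpiresAt);
      return Number.isNaN(expiresAt) || expiresAt <= cutoff;
    })
    .map((img) => img.imageId);
}
```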

Published:

ai_review.started — job picked up, DERM interaction begun. { aiReviewId, caseId, productId, orgId, correlationId, startedAt }.
ai_review.completed — success. Rich payload for clinical-api to build skin_finding + diagnosis rows:

{
  aiReviewId, caseId, productId, orgId, correlationId,
  assessedByDermVersion, reportDate, analysisResult,
  imageQuality: [{ imageId, quality }],
  lesions: [
    {
      lesionNumber, dermLesionId, coordinates,
      classification, category, priority,
      suspectedDiagnosisSnomedCode, suspectedDiagnosisSnomedTerm,
      suspectedDiagnosisIcd10Code, suspectedDiagnosisIcd10Term,
      referralFlag, referralRecommendation,
      referralRecommendationGuidance, interpretationGuidance
    }
  ]
}

ai_review.failed — terminal failure. { aiReviewId, caseId, productId, orgId, correlationId, errorCode, errorClassification, errorMessage, retriable }.
ai_review.superseded — an existing review has been replaced. { oldAiReviewId, newAiReviewId, caseId, productId, orgId, supersededAt, reason? }. Consumers use this to mark the corresponding AI diagnosis rows as superseded.

REST endpoints

Base path /v1. JWT validated via @sa-platform/auth-client; scopes enforced with @RequireScopes(). Errors as RFC 7807 Problem+JSON via the shared ProblemJsonFilter.

Method & path	Purpose	Scope
`POST /v1/ai-reviews`	Explicit trigger. Body: `{ caseId, productId, images:[…], supersede?: boolean, correlationId? }`. Returns `202 { aiReviewId, status }` on new; `200 { aiReviewId, status }` if idempotent hit on existing active review.	`ai-review:create`
`GET /v1/ai-reviews/:id`	Fetch review + result projections + lesions (no raw blob).	`ai-review:read`
`GET /v1/ai-reviews`	List with filters `caseId`, `productId`, `status`. Cursor-paginated.	`ai-review:read`
`GET /v1/ai-reviews/:id/raw`	Decrypt and return the raw DERM JSON. Audit only.	`ai-review:read-raw`
`POST /v1/ai-reviews/:id/supersede`	Mark existing as superseded and create replacement review. Body: `{ reason?: string, images: [...] }`. Caller must supply fresh presigned URLs.	`ai-review:supersede`
`POST /v1/ai-reviews/:id/cancel`	Cancel non-terminal review.	`ai-review:cancel`
`GET /health`, `GET /metrics`	Standard. `@Public()`.	—

Scopes to register in the auth service

ai-review:create, ai-review:read, ai-review:read-raw, ai-review:supersede, ai-review:cancel. ai-review:read-raw is granted only to admin client identities by policy (separate from routine read).

6. DERM Adapter and Orchestration

Client interface

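import type { Readable } from 'node:stream'

// DermResultPayload is the typed form of the §3 result payload (declared elsewhere).
interface DermClient {
  // POST /api5/case
  createCase(): Promise<{ dermCaseId: number }>

  // POST /api5/image; streams one image into the multipart body
  uploadImage(args: {
    dermCaseId: number,
    imageStream: Readable,
    contentLength: number,
    imageType: 'DERMOSCOPIC' | 'MACROSCOPIC',
    deviceManufacturer?: string,
    deviceModel?: string,
    correlationId: string,
  }): Promise<{ dermImageId: number, hash: string }>

  // POST /api5/case/{caseId}/process
  process(dermCaseId: number): Promise<{ checksum: string }>

  // GET /api5/case/{caseId}/result; body is expected only when status is 'processed'
  fetchResult(dermCaseId: number, checksum: string):
    Promise<{ status: 'processed' | 'not_ready', body?: DermResultPayload }>
}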

Implementation uses undici for HTTP and native FormData for multipart. Image bytes stream from the S3 presigned URL directly into the DERM multipart request without whole-image buffering.

Job orchestration

Two BullMQ job types backed by the existing @sa-platform/common Redis integration.

ai_review.start

Load AiReview; confirm pending.
Re-check consent (see §7).
Resolve DERM session via DermSessionService (Redis-cached token).
createCase() → persist dermCaseId, transition pending → in_progress.
For each AiReviewImage:
  Open a streaming S3 GET to the presigned URL.
  uploadImage() → persist dermImageId, dermImageHash.
process() → persist dermChecksum, transition in_progress → awaiting_result.
Enqueue ai_review.poll_result with DERM_RESULT_INITIAL_DELAY_MS (default 5000).

ai_review.poll_result

Load AiReview; confirm awaiting_result and not cancelled.
fetchResult(dermCaseId, dermChecksum).
If analysisStatus === 'STATUS_PROCESSED':
  Encrypt raw JSON with service-wide DEK → insert AiReviewResult.
  Insert AiReviewLesion rows, one per lesion in the response.
  Update each AiReviewImage.imageQuality from images[].
  Transition awaiting_result → completed.
  Emit ai_review.completed.
Else (not ready):
  Increment attemptCount.
  If attemptCount > DERM_RESULT_MAX_ATTEMPTS (default 20) or elapsed wall time since awaiting_result transition exceeds DERM_RESULT_TIMEOUT_MS (default 600_000) → fail with derm_result_timeout, emit ai_review.failed.
  Otherwise re-enqueue with the next delay from the configured backoff schedule (default: [5s, 15s, 30s, 60s, 120s, 300s, 300s, …]).
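
The not-ready branch can be sketched as a pure decision function. One assumption is labeled loudly: the schedule is indexed by `attemptCount`, so its first entry corresponds to the initial delay and the first re-enqueue waits 15 s. The spec does not say whether the schedule restarts after the initial delay. `decideAfterNotReady`, `PollConfig`, and `PollDecision` are hypothetical names.

```typescript
interface PollConfig {
  maxAttempts: number;                  // DERM_RESULT_MAX_ATTEMPTS
  timeoutMs: number;                    // DERM_RESULT_TIMEOUT_MS
  backoffScheduleMs: readonly number[]; // DERM_RESULT_BACKOFF_SCHEDULE_MS
}

const DEFAULT_POLL_CONFIG: PollConfig = {
  maxAttempts: 20,
  timeoutMs: 600_000,
  backoffScheduleMs: [5_000, 15_000, 30_000, 60_000, 120_000, 300_000],
};

type PollDecision =
  | { action: 'fail'; errorClassification: 'derm_result_timeout' }
  | { action: 'retry'; delayMs: number };

// `attemptCount` is the value after incrementing for the poll that just came back
// not ready; `elapsedMs` is measured from the awaiting_result transition.
function decideAfterNotReady(
  attemptCount: number,
  elapsedMs: number,
  cfg: PollConfig = DEFAULT_POLL_CONFIG,
): PollDecision {
  if (attemptCount > cfg.maxAttempts || elapsedMs > cfg.timeoutMs) {
    return { action: 'fail', errorClassification: 'derm_result_timeout' };
  }
  const schedule = cfg.backoffScheduleMs;
  const lastDelay = schedule[schedule.length - 1] ?? 0;
  // Assumed indexing: past the end of the schedule, the last value repeats.
  return { action: 'retry', delayMs: schedule[attemptCount] ?? lastDelay };
}
```

Under this indexing and the default values, polls land roughly 5, 20, 50, 110, 230, 530, and 830 seconds after the transition (ignoring queue latency). The wall-time budget of 600 seconds therefore trips on the seventh poll, well before the 20-attempt cap is reached.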

Error taxonomy

Condition	`errorClassification`	Retriable?
401 on DERM call (first)	`derm_auth_transient`	Retry once after session invalidation + re-login
401 after re-login	`derm_auth_failed`	No
5xx / network timeout during `start`	`derm_upstream_transient`	Yes, BullMQ job retry (exponential, max 3)
4xx non-401 during `start`	`derm_client_error`	No
`/result` never `STATUS_PROCESSED` within attempt or time budget	`derm_result_timeout`	No
`analysisResult !== PROCESSING_ANSWER_TYPE_SUCCESSFUL_PROCESSING`	`derm_processing_failed`	No — record outcome, emit `ai_review.failed`
`imageQuality === IMAGE_QUALITY_UNSUITABLE`	—	Not an error; emit `ai_review.completed` with quality flags
S3 presigned URL expired / 403 on fetch	`image_fetch_failed`	No — admin must re-trigger with fresh URLs
Consent withdrawn between emission and pickup	`consent_missing`	No — fail before any DERM call
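
The HTTP rows of the table can be sketched as a classifier. `classifyDermFailure` and `ClassifiedError` are hypothetical names; the classification values are the ones listed above. The table scopes the 5xx and 4xx rows to the `start` phase, so applying this function to `poll_result` calls would be an assumption.

```typescript
type DermHttpClassification =
  | 'derm_auth_transient'
  | 'derm_auth_failed'
  | 'derm_upstream_transient'
  | 'derm_client_error';

interface ClassifiedError {
  errorClassification: DermHttpClassification;
  retriable: boolean;
}

// Maps a failed DERM call to its taxonomy entry. `reloginAttempted` says whether
// the session was already invalidated and refreshed once for this call.
function classifyDermFailure(
  failure: number | 'network_timeout',
  reloginAttempted: boolean,
): ClassifiedError {
  if (failure === 'network_timeout' || failure >= 500) {
    return { errorClassification: 'derm_upstream_transient', retriable: true };
  }
  if (failure === 401) {
    return reloginAttempted
      ? { errorClassification: 'derm_auth_failed', retriable: false }
      : { errorClassification: 'derm_auth_transient', retriable: true };
  }
  if (failure >= 400) {
    return { errorClassification: 'derm_client_error', retriable: false };
  }
  throw new RangeError(`HTTP ${failure} is not a failure status`);
}
```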

7. Consent

Two layers of defense:

Clinical-api, primary gate. Before emitting ai_review.requested, clinical-api checks that the case's product lists ai_analysis in required_consent_type_codes and the patient has a granted record for that consent type. No event emitted if the check fails. Uses the existing clinical-api ↔ consent-service integration pattern.
ai-review, re-check at job pickup. When the ai_review.start worker picks up a job, it calls the consent service (GET /v1/consent-records?patientId=…&consentTypeCode=ai_analysis) to confirm granted status is still current. If the patient withdrew consent between clinical-api's check and ai-review's pickup, the job terminates with errorClassification='consent_missing'; no DERM call is made.

The consent type code used (ai_analysis — provisional name) is added to the consent service's seeded consent types if not already present. Implementation plan captures this as a seed-data task.
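
A minimal sketch of the job-pickup re-check. The consent service's response shape is not defined in this spec, so `ConsentRecord`, its `status` values, and both callbacks are hypothetical. The sketch also assumes the endpoint returns only the current record per consent type.

```typescript
// Hypothetical response item; the real consent-service schema may differ.
interface ConsentRecord {
  consentTypeCode: string;
  status: string; // assumed values such as 'granted' or 'withdrawn'
}

type StartOutcome =
  | { outcome: 'started' }
  | { outcome: 'failed'; errorClassification: 'consent_missing' };

function hasGrantedConsent(records: readonly ConsentRecord[], code = 'ai_analysis'): boolean {
  return records.some((r) => r.consentTypeCode === code && r.status === 'granted');
}

// Runs the consent check first; the DERM orchestration callback is never
// invoked when consent is missing or withdrawn.
async function startWithConsentGate(
  fetchConsentRecords: () => Promise<readonly ConsentRecord[]>,
  runDermOrchestration: () => Promise<void>,
): Promise<StartOutcome> {
  const records = await fetchConsentRecords();
  if (!hasGrantedConsent(records)) {
    return { outcome: 'failed', errorClassification: 'consent_missing' };
  }
  await runDermOrchestration();
  return { outcome: 'started' };
}
```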

8. Idempotency and Supersede

Active-row invariant

"At most one non-superseded AiReview per (caseId, productId)." Completed and failed rows count as holding the slot until an explicit supersede. Enforced in application code:

BEGIN;
SELECT id, status FROM AiReview
 WHERE caseId = ? AND productId = ? AND status <> 'superseded'
 FOR UPDATE;
-- if a non-superseded row exists and supersede=false:
--   return that row, COMMIT, do not enqueue.
-- otherwise:
--   INSERT new row, COMMIT, enqueue.

Both the ai_review.requested event handler and POST /v1/ai-reviews route go through the same creation path, so duplicate triggers are absorbed identically.
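
The decision inside that transaction can be illustrated with an in-memory stand-in. This sketch is single-threaded, so it needs no locking; the real creation path relies on `SELECT … FOR UPDATE` for that. `InMemoryReviewStore` and its sequential ids are hypothetical.

```typescript
interface ReviewRow {
  id: string;
  caseId: string;
  productId: string;
  status: string;
  supersededById?: string;
}

class InMemoryReviewStore {
  private rows: ReviewRow[] = [];
  private nextId = 1;

  // Mirrors the transaction: find the non-superseded row for the pair, then either
  // return it (idempotent hit) or insert a new row, superseding the old one if asked.
  create(caseId: string, productId: string, supersede = false): { row: ReviewRow; created: boolean } {
    const existing = this.rows.find(
      (r) => r.caseId === caseId && r.productId === productId && r.status !== 'superseded',
    );
    if (existing && !supersede) return { row: existing, created: false };

    const row: ReviewRow = { id: `review-${this.nextId++}`, caseId, productId, status: 'pending' };
    if (existing) {
      existing.status = 'superseded';
      existing.supersededById = row.id;
    }
    this.rows.push(row);
    return { row, created: true };
  }

  get(id: string): ReviewRow | undefined {
    return this.rows.find((r) => r.id === id);
  }
}
```

A caller would enqueue `ai_review.start` only when `created` is true, which is what makes duplicate triggers harmless.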

Supersede flow

POST /v1/ai-reviews/:id/supersede (or POST /v1/ai-reviews with supersede=true):

Transactionally:
  Mark the existing active row as superseded, set supersededAt.
  Insert a new AiReview row with a fresh id, pointing the old row's supersededById at the new row.
Enqueue ai_review.start for the new row.
Emit ai_review.superseded { oldAiReviewId, newAiReviewId, caseId, productId, orgId, supersededAt, reason? }.

Clinical-api consumes ai_review.superseded and marks the corresponding AI diagnosis rows as superseded — requires a small clinical-api schema addition (new diagnosis.status enum value superseded with a superseded_by_diagnosis_id nullable self-reference). Captured in the implementation plan.

Cancel

POST /v1/ai-reviews/:id/cancel is permitted only on pending, in_progress, awaiting_result. Marks the row cancelled; the next job pickup short-circuits. No cancel call is made to DERM (the API does not appear to support one; DERM's case record lingers on their side).

9. Security

Transport and auth

TLS on all inbound and outbound traffic.
Inbound: @sa-platform/auth-client JWT validation, scope enforcement.
Outbound to consent service: service-to-service JWT with appropriate scopes.
Outbound to DERM: credentials via POST /api5/login; session token applied as authorization header on subsequent calls.

Secrets

Enforced by the prod-secret pattern from PR #26. Dev-defaults rejected in production:

DERM_USERNAME, DERM_PASSWORD
AI_REVIEW_WRAPPED_DEK (service-wide DEK, wrapped by the configured key provider)
AI_REVIEW_KMS_KEY_ID, AWS_REGION (when AI_REVIEW_KEY_PROVIDER=kms)

Key provider

Adopts the KeyProvider abstraction from PR #27. AI_REVIEW_KEY_PROVIDER=local|kms (default local for dev); kms path uses KmsKeyProvider out of the box from @sa-platform/common. No new crypto code; mirrors the notifications service's adoption.

PHI

AiReviewResult.rawCiphertext is the sole PHI-at-rest field, encrypted with AES-256-GCM under the service-wide DEK. Same pattern as notifications' PHI fields.
Lesion projections, image projections, and AiReview metadata are plaintext as justified in §4.
Images flow from clinical-api S3 → ai-review (transit) → DERM (transit). Ai-review does not persist image bytes; the multipart request body is streamed and discarded.
DERM is a Skin Analytics internal product. The operator-facing implementation plan should confirm with compliance that a Data Processing Agreement covers the data flow; this spec assumes DERM is within scope of the platform's DPA.

Audit

Every AiReview state transition, every DERM API call (endpoint, status, latency, correlation id), and every REST request is logged via AuditModule with actor snapshot.
S3 object-locked audit archive follows the clinical-api pattern already documented in 2026-04-15-clinical-data-model-design.md.

10. Observability

Structured logs

Per-event fields: correlationId, aiReviewId, caseId, productId, orgId, status, phase (start | poll_result), dermEndpoint, dermLatencyMs, dermHttpStatus, attempt, errorClassification.

Metrics (Prometheus)

Metric	Type	Purpose
`ai_review_jobs_total{status,errorClass}`	counter	Throughput and failure mix
`ai_review_duration_seconds{phase}`	histogram	Latency of each job phase (`start`, `poll_result`)
`derm_api_calls_total{endpoint,httpStatus}`	counter	DERM call volume and status distribution
`derm_api_latency_seconds{endpoint}`	histogram	DERM latency
`ai_review_result_attempts`	histogram	Polling attempts until processed — tunes backoff
`ai_review_queue_depth`	gauge	BullMQ queue depth

Alerting (out of spec, in operator runbook)

ai_review_jobs_total{status="failed"} rate spike
derm_api_calls_total{httpStatus=~"5.."} sustained non-zero rate
Oldest awaiting_result row age > threshold
Queue depth growth unaccompanied by completion throughput

11. Configuration

Env var	Default	Purpose
`DERM_API_URL`	—	DERM base URL (e.g. `https://cc.derm.skin-analytics.com`)
`DERM_USERNAME`	—	Login username (prod-enforced)
`DERM_PASSWORD`	—	Login password (prod-enforced)
`DERM_REQUESTING_SYSTEM`	`Ozone`	Sent on case creation
`DERM_SESSION_TTL_SECONDS`	`1800`	Cached session TTL in Redis
`DERM_RESULT_INITIAL_DELAY_MS`	`5000`	Delay before first `/result` poll
`DERM_RESULT_MAX_ATTEMPTS`	`20`	Max `/result` polls before timeout
`DERM_RESULT_TIMEOUT_MS`	`600000`	Wall-time budget, measured from the `awaiting_result` transition (see §6)
`DERM_RESULT_BACKOFF_SCHEDULE_MS`	`5000,15000,30000,60000,120000,300000`	Comma-separated backoff sequence; last value repeats
`AI_REVIEW_KEY_PROVIDER`	`local`	`local` | `kms`
`AI_REVIEW_WRAPPED_DEK`	—	Wrapped service-wide DEK (prod-enforced)
`AI_REVIEW_KMS_KEY_ID`	—	Required when `AI_REVIEW_KEY_PROVIDER=kms`
`AWS_REGION`	`eu-west-2`	For KMS
`CONSENT_SERVICE_URL`	—	For runtime consent re-check
`BULL_REDIS_URL`	—	Shared `@sa-platform/common` Redis config
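
A minimal sketch of loading a subset of these variables with their documented defaults. `loadPollAndKeyConfig` and the result shape are hypothetical; the service's actual config module may validate differently.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

interface PollAndKeyConfig {
  resultInitialDelayMs: number;
  resultMaxAttempts: number;
  resultTimeoutMs: number;
  resultBackoffScheduleMs: number[];
  keyProvider: 'local' | 'kms';
  kmsKeyId: string | undefined;
}

function parsePositiveInt(name: string, raw: string): number {
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function parseKeyProvider(raw: string): 'local' | 'kms' {
  if (raw === 'local' || raw === 'kms') return raw;
  throw new Error(`AI_REVIEW_KEY_PROVIDER must be "local" or "kms", got "${raw}"`);
}

// Applies the defaults from the table and enforces the "required when kms" rule.
function loadPollAndKeyConfig(env: Env): PollAndKeyConfig {
  const int = (name: string, fallback: string): number =>
    parsePositiveInt(name, env[name] ?? fallback);

  const keyProvider = parseKeyProvider(env['AI_REVIEW_KEY_PROVIDER'] ?? 'local');
  const kmsKeyId = env['AI_REVIEW_KMS_KEY_ID'];
  if (keyProvider === 'kms' && !kmsKeyId) {
    throw new Error('AI_REVIEW_KMS_KEY_ID is required when AI_REVIEW_KEY_PROVIDER=kms');
  }

  const scheduleName = 'DERM_RESULT_BACKOFF_SCHEDULE_MS';
  const rawSchedule = env[scheduleName] ?? '5000,15000,30000,60000,120000,300000';

  return {
    resultInitialDelayMs: int('DERM_RESULT_INITIAL_DELAY_MS', '5000'),
    resultMaxAttempts: int('DERM_RESULT_MAX_ATTEMPTS', '20'),
    resultTimeoutMs: int('DERM_RESULT_TIMEOUT_MS', '600000'),
    resultBackoffScheduleMs: rawSchedule
      .split(',')
      .map((part) => parsePositiveInt(scheduleName, part.trim())),
    keyProvider,
    kmsKeyId,
  };
}
```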

12. Testing Strategy

Mirrors the notifications service's split. No live DERM calls in CI; live-DERM smoke is manual at deploy time against a DERM staging environment.

Unit tests (Jest)

DermClient adapter with mocked undici: status-code branching, session-refresh on 401, multipart construction, streaming upload, error classification.
DermSessionService: Redis cache behaviour, TTL, invalidation on 401.
AiReviewService (orchestration): state transitions, idempotency (active-row invariant), supersede flow, cancel flow, retry classification, backoff math for poll_result.
Consent re-check: mocked consent-service client, blocked DERM call when revoked.
Event publishers: payload shape conformance.
Event consumer (ai_review.requested): triggers orchestration, idempotent on duplicate deliveries.

Integration tests (Jest + testcontainers or equivalent local compose)

Real MySQL (tenant-scoped), real Redis, mocked DERM via msw or nock.
End-to-end:
  ai_review.requested event → start → uploads → process → result (processed) → ai_review.completed emitted with expected payload.
  Consent withdrawal path → ai_review.failed with consent_missing; no DERM interaction.
  401 refresh path → session re-login, call retried, success.
  Result-timeout path → ai_review.failed with derm_result_timeout after budget exhaustion.
  Supersede flow → old row superseded, new row processes, ai_review.superseded emitted.
  Cancel flow → in-flight row terminates on next worker pickup.
  Idempotency: duplicate ai_review.requested events yield one review.
  Image stream bridging: mocked S3 presigned URL → multipart body assembled correctly.

Manual smoke (deploy time)

A small operator script exercises POST /v1/ai-reviews against DERM staging, using a fixed validation image pair. Run per environment post-deploy to confirm live connectivity.

13. Operator Notes

Enabling ai-review in an environment

Provision DERM credentials in the target environment's secret store; populate DERM_* env vars.
Provision the service-wide DEK — generate 32 random bytes, wrap under the chosen provider (LocalKeyProvider or KmsKeyProvider), populate AI_REVIEW_WRAPPED_DEK and key-provider env vars.
Register the ai-review:* scopes in the auth service and grant to the appropriate API client identities.
Ensure the ai_analysis consent type exists in the consent service (seed on first deploy).
Deploy the service, run manual smoke, confirm metrics/logs flowing.
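
The "generate 32 random bytes" part of the DEK step can be sketched with Node's built-in `crypto` module. Wrapping under the chosen key provider is provider-specific and not shown; `generateDek` is a hypothetical helper, and base64 is an assumed encoding for handing the key to the wrap step.

```typescript
import { randomBytes } from 'node:crypto';

// 32 random bytes give a 256-bit key, the size AES-256-GCM expects.
function generateDek(): Buffer {
  return randomBytes(32);
}

const dek = generateDek();
// Assumed transport encoding for the wrap step. The plaintext key should never
// be logged or stored; only the wrapped form belongs in AI_REVIEW_WRAPPED_DEK.
const dekBase64 = dek.toString('base64');
```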

Forward-only PHI stance (inherited)

Consistent with PRs #24, #27: there is no migration of pre-existing rows from a different crypto configuration. The service begins storing data under whatever key provider is configured at first deploy.

14. Out-of-Spec Follow-ups

Clinical-api → KmsKeyProvider migration — separate follow-up requiring per-request DEK cache design.
Admin UI — list, inspect, retry, view raw DERM response. Product-side.
Human-review service — the pair to this service; confidence-based escalation hooks remain TODO until human-review is built.
DERM-side webhooks — should DERM publish a push-completion webhook in future, replace polling with the inbound endpoint.
Per-product DERM configuration / multi-provider routing — when a second inference backend lands.
Rate limiting / token bucket — if DERM publishes hard quotas.
Historical backfill CLI — if backfilling becomes a recurring need.
FHIR translation — diagnosis rows written from ai_review.completed are already modelled FHIR-compatible; the translation layer itself is deferred platform-wide.