
AI Review Service — Design Spec

Date: 2026-04-23
Status: Draft, pending review
Build order position: Follows the core platform services (auth → user-management → consent → notifications); first of the "everything else" tier.
Related:

  • docs/superpowers/specs/2026-04-19-sa-platform-design.md
  • docs/superpowers/specs/2026-04-15-clinical-data-model-design.md
  • docs/superpowers/specs/2026-04-22-notifications-phi-at-rest-design.md
  • docs/superpowers/specs/2026-04-23-kms-key-provider-design.md


1. Scope, Goals, Non-Goals

What this service is

services/ai-review/ — a standalone NestJS service that orchestrates AI inference against Skin Analytics' internal DERM 5.0.0 API on behalf of clinical-api. Owns the orchestration job lifecycle, the raw DERM response (PHI-encrypted, retained for audit), and per-lesion projections for querying and event fan-out. Writes nothing directly to clinical-api; results reach the clinical data model via domain events.

v1 shape — thin, product-agnostic orchestrator

  • Thin: one inference backend (DERM), one orchestration flow, minimum viable feature surface. Multi-model routing, confidence-based escalation to human-review, per-product credentials, and rate limiting are all explicit non-goals for v1.
  • Product-agnostic: no product-specific code paths. productId is a request parameter; products are distinguished only by the fields they carry (consent type codes, image processing policy) already owned by clinical-api.

What it does

  • Subscribes to ai_review.requested events emitted by clinical-api.
  • Bridges S3-stored images to DERM's multipart upload via short-lived presigned URLs.
  • Orchestrates DERM's multi-step stateful API (login → case → image uploads → process → polled result fetch) as a small BullMQ-driven state machine.
  • Persists the raw DERM response (encrypted at rest with the service-wide DEK) as the compliance source-of-truth and denormalises per-lesion data for queryable projections.
  • Emits ai_review.started, ai_review.completed, ai_review.failed, ai_review.superseded domain events.
  • Exposes a small REST surface for admin retry, cancel, supersede, and audit-read of raw DERM responses.
  • Performs a defense-in-depth consent check via the consent service before any DERM call.

What it deliberately does not do in v1

  • No multi-model / multi-provider inference routing. One DERM configuration per service instance.
  • No confidence-based routing to the human-review service (which is not yet built).
  • No admin UI; REST endpoints exist, UI is a product-side follow-up.
  • No rate limiting / token bucket on DERM calls — rely on BullMQ worker concurrency.
  • No DERM-side webhook ingestion — DERM does not offer one; polling only.
  • No priority queueing.
  • No per-product DERM credentials.
  • No historical backfill CLI — ad-hoc via the REST POST /v1/ai-reviews endpoint until a standing need emerges.
  • No clinical-api migration to KmsKeyProvider — separate follow-up with its own per-request DEK cache design.

2. Architecture Overview

Service boundary

| Concern | Owner |
|---|---|
| DERM API session, credentials, request/response | ai-review |
| Raw DERM response (PHI-encrypted), retained for audit | ai-review (source of truth) |
| AiReview job state, lesion projections, job history | ai-review |
| skin_finding + diagnosis rows (clinical projection) | clinical-api, written from the ai_review.completed event |
| Image bytes in S3 + presigned URL minting | clinical-api |
| ai_analysis consent check | clinical-api gates at event emission; ai-review re-verifies |
| Admin UI for retry / raw-response viewing | Out of scope v1 (REST surface exists) |

Happy-path flow

  1. Clinical-api closes a case. Based on product config (required_consent_type_codes, image_processing_policy_json) and a consent check, it decides whether AI review is warranted.
  2. If warranted, clinical-api mints short-lived presigned S3 GET URLs for each image in the case and emits ai_review.requested with the URLs in the payload.
  3. ai-review's event consumer enqueues an ai_review.start BullMQ job keyed on (caseId, productId). If an active AiReview already exists for that pair, the handler no-ops (idempotent).
  4. Worker picks up ai_review.start:
     a. Resolves a DERM session token (cached in Redis; re-logs in on miss or 401).
     b. Creates a DERM case (POST /api5/case).
     c. For each image: streams from the presigned S3 URL into a multipart POST /api5/image request to DERM.
     d. Calls POST /api5/case/{id}/process to kick off inference; stores the returned checksum.
     e. Re-enqueues itself as ai_review.poll_result with an initial delay; the worker is not held waiting.
  5. Worker picks up ai_review.poll_result:
     a. Calls GET /api5/case/{id}/result?checksum=....
     b. If analysisStatus === 'STATUS_PROCESSED': encrypts and persists the raw response, writes per-lesion projection rows, transitions to completed, emits ai_review.completed with a rich per-lesion payload.
     c. Otherwise: increments attempt count, re-enqueues with the next backoff delay; fails with derm_result_timeout once the configured attempts/wall-time budget is exhausted.
  6. Clinical-api consumes ai_review.completed, creates one skin_finding per lesion and one diagnosis per finding with source='ai' and assessedByDermVersion recorded in actor_snapshot.

Key rationale

  • Event-driven trigger (vs direct REST) matches the platform spec's "case closed triggers AI review" intent and keeps clinical-api unaware of ai-review's URL on the hot path. REST POST /v1/ai-reviews exists for admin retry, backfill, and explicit programmatic triggers.
  • Separate ai_review.requested event (rather than direct case.closed subscription) keeps ai-review ignorant of case-closure semantics; clinical-api owns the "should AI review run?" decision.
  • ai-review owns the raw DERM blob under the "no shared DBs" principle; clinical-api receives a distilled per-lesion payload via event and never sees the full blob. Admin audit of the raw response is served via GET /v1/ai-reviews/:id/raw behind a restricted scope.
  • Presigned S3 URLs for image bridging keep clinical-api in control of the PHI access boundary (mints, audit-logs, can enforce consent at mint time) and avoid cross-service IAM sharing or proxying large image bodies through clinical-api.
  • Single BullMQ job with delayed re-enqueue for polling is the simplest execution model that handles DERM's process → result wait without holding a worker on a setTimeout. Same mechanism notifications already uses for the re-enqueue sweep.

3. DERM API Contract (as integrated)

DERM 5.0.0 is a session-based HTTP API. Reference: postman/DERM 5.0.0 API DEMO.postman_collection.json.

Session

| Call | Shape | Returns |
|---|---|---|
| POST /api5/login | form-urlencoded: login, password | auth token in authorization header |
| POST /api5/logout | no body; authorization header | |

Session tokens are cached in Redis with a configurable TTL (default 30 min). On a 401 from any other call, the cached token is invalidated, one re-login is attempted, and the call is retried once.
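
The invalidate, re-login, retry-once rule can be sketched as a small wrapper. This is a minimal illustration, not the service's actual code: TokenCache, DermResponse, SessionResult, and withDermSession are hypothetical names, and the cache and login callbacks stand in for the Redis cache and POST /api5/login.

```typescript
// Hypothetical abstraction over the Redis-cached session token.
interface TokenCache {
  get(): Promise<string | null>;
  set(token: string): Promise<void>;
  invalidate(): Promise<void>;
}

// Hypothetical minimal response shape for any DERM call.
interface DermResponse<T> {
  status: number;
  body?: T;
}

type SessionResult<T> =
  | { ok: true; response: DermResponse<T> }
  | { ok: false; errorClassification: "derm_auth_failed" };

async function withDermSession<T>(
  cache: TokenCache,
  login: () => Promise<string>, // stands in for POST /api5/login
  call: (token: string) => Promise<DermResponse<T>>,
): Promise<SessionResult<T>> {
  // Cache miss: log in once and store the token.
  let token = await cache.get();
  if (token === null) {
    token = await login();
    await cache.set(token);
  }

  const first = await call(token);
  if (first.status !== 401) return { ok: true, response: first };

  // First 401: invalidate, re-login once, retry the call once.
  await cache.invalidate();
  const fresh = await login();
  await cache.set(fresh);
  const second = await call(fresh);

  // A second 401 is terminal; no further login attempts are made.
  if (second.status === 401) return { ok: false, errorClassification: "derm_auth_failed" };
  return { ok: true, response: second };
}
```

Returning a classified result rather than throwing keeps the terminal case aligned with the derm_auth_failed value used elsewhere in this spec; that choice is an assumption of the sketch.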

Orchestration calls

| Call | Request | Response fields used |
|---|---|---|
| POST /api5/case | form-urlencoded: requestingSystem (from DERM_REQUESTING_SYSTEM config, e.g. Ozone) | caseId (int) |
| POST /api5/image (repeated per image) | multipart: caseId, image (binary, streamed), imageType (DERMOSCOPIC or MACROSCOPIC), optional deviceManufacturer, deviceModel, correlationId | imageId (int), hash |
| POST /api5/case/{caseId}/process | no body | checksum (string) |
| GET /api5/case/{caseId}/result?checksum=… | no body | full result payload (see "Result payload" below; persistence in §4) |

Result payload (shape summary)

Top-level: case metadata + version/regulatory fields (assessedByDermVersion, dermProductNumber, interfaceNumber, udi, termsOfUse, privacyPolicy, instructionsForUse, medicalLabellingGraphic, informationClassification), analysisStatus, analysisResult, analysisResultStatement, processingType, lesionCount, reportDate, sensitivitySet.

Per image (images[]): imageId, imageHash, imageType, imageQuality (e.g. IMAGE_QUALITY_SUITABLE).

Per lesion (lesions[]): lesionNumber, lesionId, coordinates ({x1,y1,x2,y2} bounding box), classification (DERM-native code such as SKIN_LESION_MELANOMA), category (Malignant/Benign/…), priority, suspectedDiagnosis (free text), suspectedDiagnosisICD10Code/Term, suspectedDiagnosisSnomedCTCode/Term, referralFlag, referralRecommendation, referralRecommendationGuidance, interpretationGuidance.

settings object (model metrics snapshot): priorities map, sensitivity map per classification. Persisted in the raw blob; not individually projected.

SNOMED already provided. DERM returns SNOMED CT codes directly on each lesion. For v1 we pass them through as diagnosis.code_system='SNOMED-CT' without going through diagnosis_code_mapping. The mapping table remains useful for future non-SNOMED models.


4. Data Model

All tables in a new ai-review Prisma schema (MySQL 8, same conventions as other services: UUID v7 IDs, tenancy via orgId, soft-tenanted by auth-client middleware).

AiReview — orchestration job

| Field | Type | Notes |
|---|---|---|
| id | UUID v7 PK | |
| caseId | UUID | External ref to clinical-api |
| productId | UUID | External ref to clinical-api |
| orgId | UUID | Tenancy scope |
| status | enum | One of: pending, in_progress, awaiting_result, completed, failed, cancelled, superseded |
| attemptCount | int | Poll-attempt counter |
| correlationId | string nullable | Threaded from the triggering event |
| dermCaseId | int nullable | DERM's case identifier |
| dermChecksum | string nullable | Returned from /process, required for /result |
| dermVersion | string nullable | assessedByDermVersion captured on success for querying |
| supersededById | UUID nullable | Points to the new review that replaced this one |
| supersededAt | datetime nullable | |
| error | JSON nullable | Structured error (code, message, classification, upstreamStatus) |
| errorClassification | string nullable | One of the taxonomy values in §6 |
| createdAt | datetime | |
| updatedAt | datetime | |
| completedAt | datetime nullable | Set on completed / failed |

Indexes: (orgId, status), (caseId, productId, status). The "at most one non-superseded row per (caseId, productId)" invariant is application-enforced inside a transaction (SELECT … FOR UPDATE) rather than a DB unique constraint, because supersede means multiple rows coexist over time. Completed and failed rows count as holding the slot until explicitly superseded.

AiReviewImage — per-image bridging record

| Field | Type | Notes |
|---|---|---|
| id | UUID v7 PK | |
| aiReviewId | UUID FK | AiReview.id |
| clinicalImageId | UUID | Clinical-api's image identifier |
| imageType | string | DERMOSCOPIC or MACROSCOPIC |
| dermImageId | int nullable | Populated after successful upload |
| dermImageHash | string nullable | Returned by DERM, retained for traceability |
| imageQuality | string nullable | Populated from /result |
| createdAt | datetime | |

Index: (aiReviewId).

AiReviewResult — one per successful review; audit source-of-truth

| Field | Type | Notes |
|---|---|---|
| id | UUID v7 PK | |
| aiReviewId | UUID FK unique | AiReview.id |
| analysisResult | string | DERM analysisResult enum |
| analysisStatus | string | DERM analysisStatus enum |
| lesionCount | int | |
| processingType | string | DERMO_ONLY, DERMO_MACRO, or future values |
| sensitivitySet | string nullable | e.g. "B" |
| reportDate | datetime | From DERM (reportDate is Unix millis in the payload) |
| rawCiphertext | bytes | Full DERM response JSON, AES-256-GCM encrypted with service-wide DEK |
| rawIv | bytes(12) | GCM IV |
| rawAuthTag | bytes(16) | GCM auth tag |
| createdAt | datetime | |
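
The three raw* columns map directly onto the outputs of an AES-256-GCM encryption. The following is a minimal sketch using Node's built-in crypto module; encryptRaw, decryptRaw, and EncryptedRaw are hypothetical names, and the DEK is assumed to be already unwrapped to 32 raw bytes by the key provider.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Mirrors the rawCiphertext / rawIv / rawAuthTag columns.
interface EncryptedRaw {
  rawCiphertext: Buffer;
  rawIv: Buffer; // 12 bytes
  rawAuthTag: Buffer; // 16 bytes
}

// Encrypts the full DERM response JSON under the (already unwrapped) DEK.
function encryptRaw(dek: Buffer, payload: unknown): EncryptedRaw {
  const rawIv = randomBytes(12); // fresh IV per record; never reuse with the same key
  const cipher = createCipheriv("aes-256-gcm", dek, rawIv);
  const rawCiphertext = Buffer.concat([
    cipher.update(JSON.stringify(payload), "utf8"),
    cipher.final(),
  ]);
  return { rawCiphertext, rawIv, rawAuthTag: cipher.getAuthTag() };
}

// Decrypts and parses; final() throws if the ciphertext or tag was altered.
function decryptRaw(dek: Buffer, enc: EncryptedRaw): unknown {
  const decipher = createDecipheriv("aes-256-gcm", dek, enc.rawIv);
  decipher.setAuthTag(enc.rawAuthTag);
  const plaintext = Buffer.concat([
    decipher.update(enc.rawCiphertext),
    decipher.final(),
  ]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

The audit-read endpoint would be the natural caller of something like decryptRaw; a failed tag check surfaces as an exception rather than as silently corrupted JSON.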

AiReviewLesion — per-lesion projection

| Field | Type | Notes |
|---|---|---|
| id | UUID v7 PK | |
| aiReviewId | UUID FK | |
| lesionNumber | int | 1-indexed, from DERM |
| dermLesionId | int | DERM's identifier |
| coordinates | JSON | {x1,y1,x2,y2} |
| classification | string | DERM code, e.g. SKIN_LESION_MELANOMA |
| category | string | Malignant, Benign, … |
| priority | int | DERM priority value |
| suspectedDiagnosisIcd10Code | string nullable | e.g. C43.9 |
| suspectedDiagnosisIcd10Term | string nullable | |
| suspectedDiagnosisSnomedCode | string nullable | e.g. 93655004 |
| suspectedDiagnosisSnomedTerm | string nullable | |
| referralFlag | string nullable | e.g. Urgent Refer |
| referralRecommendation | string nullable | |
| referralRecommendationGuidance | string nullable | |
| interpretationGuidance | text nullable | Long template text |
| createdAt | datetime | |

Index: (aiReviewId).
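
The projection renames a few DERM fields (lesionId becomes dermLesionId, the ICD10 and SnomedCT names become Icd10 and Snomed). A sketch of that mapping follows. DermLesion and projectLesion are hypothetical names, and the exact spelling of the DERM Term fields (suspectedDiagnosisICD10Term, suspectedDiagnosisSnomedCTTerm) is an assumption expanded from the "Code/Term" shorthand in §3.

```typescript
// Assumed shape of one entry in the DERM result's lesions[] array.
interface DermLesion {
  lesionNumber: number;
  lesionId: number;
  coordinates: { x1: number; y1: number; x2: number; y2: number };
  classification: string;
  category: string;
  priority: number;
  suspectedDiagnosisICD10Code?: string;
  suspectedDiagnosisICD10Term?: string;
  suspectedDiagnosisSnomedCTCode?: string;
  suspectedDiagnosisSnomedCTTerm?: string;
  referralFlag?: string;
  referralRecommendation?: string;
  referralRecommendationGuidance?: string;
  interpretationGuidance?: string;
}

// Maps one DERM lesion onto the AiReviewLesion columns (id and createdAt are
// left to the persistence layer). Absent optional fields become null.
function projectLesion(aiReviewId: string, l: DermLesion) {
  return {
    aiReviewId,
    lesionNumber: l.lesionNumber,
    dermLesionId: l.lesionId,
    coordinates: l.coordinates,
    classification: l.classification,
    category: l.category,
    priority: l.priority,
    suspectedDiagnosisIcd10Code: l.suspectedDiagnosisICD10Code ?? null,
    suspectedDiagnosisIcd10Term: l.suspectedDiagnosisICD10Term ?? null,
    suspectedDiagnosisSnomedCode: l.suspectedDiagnosisSnomedCTCode ?? null,
    suspectedDiagnosisSnomedTerm: l.suspectedDiagnosisSnomedCTTerm ?? null,
    referralFlag: l.referralFlag ?? null,
    referralRecommendation: l.referralRecommendation ?? null,
    referralRecommendationGuidance: l.referralRecommendationGuidance ?? null,
    interpretationGuidance: l.interpretationGuidance ?? null,
  };
}
```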

PHI classification

  • AiReviewResult.rawCiphertext is the only field encrypted at rest. Under GDPR the full response is personal data (linked to a patient via caseId).
  • Lesion projection fields are either DERM-native codes, clinical standard codes (ICD-10/SNOMED), or template text keyed on classification. None is patient-specific. They remain plaintext, matching clinical-api's own convention for diagnosis.code_* fields.
  • AiReview.error is plaintext JSON. If DERM ever echoed patient-identifying data in an error response we would move this to a ciphertext field — no evidence in the sampled API that it does.

DERM session cache

Redis key, not a DB table: ai-review:derm:session:<sha256(username)> → { token, expiresAt }. TTL from DERM_SESSION_TTL_SECONDS. Ephemeral state that does not belong in the durable store.
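
A sketch of how the cache key and value might be derived. dermSessionKey and buildSessionEntry are hypothetical helpers, and the hex encoding of the digest and the ISO 8601 format of expiresAt are assumptions; the spec fixes only the key prefix and the use of SHA-256 over the username.

```typescript
import { createHash } from "node:crypto";

// Builds the Redis key; hashing keeps the raw username out of the keyspace.
function dermSessionKey(username: string): string {
  const digest = createHash("sha256").update(username, "utf8").digest("hex");
  return `ai-review:derm:session:${digest}`;
}

// Builds the cached value. ttlSeconds would come from DERM_SESSION_TTL_SECONDS.
function buildSessionEntry(token: string, ttlSeconds: number, nowMs: number = Date.now()) {
  return {
    token,
    expiresAt: new Date(nowMs + ttlSeconds * 1000).toISOString(),
  };
}
```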


5. Public Contract

Domain events

All events flow through EventModule from @sa-platform/common (Redis pub/sub today, EventBridge phase-2 platform-wide).

Consumed:

  • ai_review.requested — emitted by clinical-api when a case closure warrants AI review.
{
  caseId: UUID,
  productId: UUID,
  orgId: UUID,
  correlationId: string,
  requestedAt: ISO8601,
  images: [
    {
      imageId: UUID,
      imageType: "DERMOSCOPIC" | "MACROSCOPIC",
      presignedUrl: string,
      presignedUrlExpiresAt: ISO8601,
      deviceManufacturer?: string,
      deviceModel?: string
    }
  ]
}
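
As a typed reading of the payload above, the sketch below declares the event shape and adds a pre-flight check for presigned URLs that have already expired. The check itself is an illustrative assumption, not something this spec mandates; the spec only says that an expired URL surfaces as image_fetch_failed. AiReviewRequested, RequestedImage, and expiredImageIds are hypothetical names.

```typescript
interface RequestedImage {
  imageId: string;
  imageType: "DERMOSCOPIC" | "MACROSCOPIC";
  presignedUrl: string;
  presignedUrlExpiresAt: string; // ISO 8601
  deviceManufacturer?: string;
  deviceModel?: string;
}

interface AiReviewRequested {
  caseId: string;
  productId: string;
  orgId: string;
  correlationId: string;
  requestedAt: string; // ISO 8601
  images: RequestedImage[];
}

// Returns the imageIds whose URL is expired (or unparseable) at `now`.
// marginMs lets the caller demand some remaining lifetime for the upload.
function expiredImageIds(event: AiReviewRequested, now: Date, marginMs = 0): string[] {
  return event.images
    .filter((img) => {
      const expiresAt = Date.parse(img.presignedUrlExpiresAt);
      return Number.isNaN(expiresAt) || expiresAt - marginMs <= now.getTime();
    })
    .map((img) => img.imageId);
}
```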

Published:

  • ai_review.started — job picked up, DERM interaction begun. { aiReviewId, caseId, productId, orgId, correlationId, startedAt }.
  • ai_review.completed — success. Rich payload for clinical-api to build skin_finding + diagnosis rows:
{
  aiReviewId, caseId, productId, orgId, correlationId,
  assessedByDermVersion, reportDate, analysisResult,
  imageQuality: [{ imageId, quality }],
  lesions: [
    {
      lesionNumber, dermLesionId, coordinates,
      classification, category, priority,
      suspectedDiagnosisSnomedCode, suspectedDiagnosisSnomedTerm,
      suspectedDiagnosisIcd10Code, suspectedDiagnosisIcd10Term,
      referralFlag, referralRecommendation,
      referralRecommendationGuidance, interpretationGuidance
    }
  ]
}
  • ai_review.failed — terminal failure. { aiReviewId, caseId, productId, orgId, correlationId, errorCode, errorClassification, errorMessage, retriable }.
  • ai_review.superseded — an existing review has been replaced. { oldAiReviewId, newAiReviewId, caseId, productId, orgId, supersededAt, reason? }. Consumers use this to mark the corresponding AI diagnosis rows as superseded.

REST endpoints

Base path /v1. JWT validated via @sa-platform/auth-client; scopes enforced with @RequireScopes(). Errors as RFC 7807 Problem+JSON via the shared ProblemJsonFilter.

| Method & path | Purpose | Scope |
|---|---|---|
| POST /v1/ai-reviews | Explicit trigger. Body: { caseId, productId, images:[…], supersede?: boolean, correlationId? }. Returns 202 { aiReviewId, status } on new; 200 { aiReviewId, status } if idempotent hit on existing active review. | ai-review:create |
| GET /v1/ai-reviews/:id | Fetch review + result projections + lesions (no raw blob). | ai-review:read |
| GET /v1/ai-reviews | List with filters caseId, productId, status. Cursor-paginated. | ai-review:read |
| GET /v1/ai-reviews/:id/raw | Decrypt and return the raw DERM JSON. Audit only. | ai-review:read-raw |
| POST /v1/ai-reviews/:id/supersede | Mark existing as superseded and create replacement review. Body: { reason?: string, images: [...] }. Caller must supply fresh presigned URLs. | ai-review:supersede |
| POST /v1/ai-reviews/:id/cancel | Cancel non-terminal review. | ai-review:cancel |
| GET /health, GET /metrics | Standard. | @Public() |

Scopes to register in the auth service

ai-review:create, ai-review:read, ai-review:read-raw, ai-review:supersede, ai-review:cancel. ai-review:read-raw is granted only to admin client identities by policy (separate from routine read).


6. DERM Adapter and Orchestration

Client interface

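import type { Readable } from 'node:stream'

// DermResultPayload is the typed body of GET /api5/case/{caseId}/result (§3),
// declared alongside this interface.
interface DermClient {
  // POST /api5/case
  createCase(): Promise<{ dermCaseId: number }>

  // POST /api5/image, called once per image; imageStream is piped, not buffered.
  uploadImage(args: {
    dermCaseId: number,
    imageStream: Readable,
    contentLength: number,
    imageType: 'DERMOSCOPIC' | 'MACROSCOPIC',
    deviceManufacturer?: string,
    deviceModel?: string,
    correlationId: string,
  }): Promise<{ dermImageId: number, hash: string }>

  // POST /api5/case/{caseId}/process; the checksum is required by fetchResult.
  process(dermCaseId: number): Promise<{ checksum: string }>

  // GET /api5/case/{caseId}/result; body is present only when status is 'processed'.
  fetchResult(dermCaseId: number, checksum: string):
    Promise<{ status: 'processed' | 'not_ready', body?: DermResultPayload }>
}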

Implementation uses undici for HTTP and native FormData for multipart. Image bytes stream from the S3 presigned URL directly into the DERM multipart request without whole-image buffering.

Job orchestration

Two BullMQ job types backed by the existing @sa-platform/common Redis integration.

ai_review.start

  1. Load AiReview; confirm pending.
  2. Re-check consent (see §7).
  3. Resolve DERM session via DermSessionService (Redis-cached token).
  4. createCase() → persist dermCaseId, transition pending → in_progress.
  5. For each AiReviewImage:
     a. Open a streaming S3 GET to the presigned URL.
     b. uploadImage() → persist dermImageId, dermImageHash.
  6. process() → persist dermChecksum, transition in_progress → awaiting_result.
  7. Enqueue ai_review.poll_result with DERM_RESULT_INITIAL_DELAY_MS (default 5000).

ai_review.poll_result

  1. Load AiReview; confirm awaiting_result and not cancelled.
  2. fetchResult(dermCaseId, dermChecksum).
  3. If analysisStatus === 'STATUS_PROCESSED':
     a. Encrypt raw JSON with service-wide DEK → insert AiReviewResult.
     b. Insert AiReviewLesion rows, one per lesion in the response.
     c. Update each AiReviewImage.imageQuality from images[].
     d. Transition awaiting_result → completed.
     e. Emit ai_review.completed.
  4. Else (not ready):
     a. Increment attemptCount.
     b. If attemptCount > DERM_RESULT_MAX_ATTEMPTS (default 20) or elapsed wall time since awaiting_result transition exceeds DERM_RESULT_TIMEOUT_MS (default 600_000) → fail with derm_result_timeout, emit ai_review.failed.
     c. Otherwise re-enqueue with the next delay from the configured backoff schedule (default: [5s, 15s, 30s, 60s, 120s, 300s, 300s, …]).
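
The not-ready branch reduces to a pure decision over the attempt counter, the elapsed time, and the schedule. The sketch below is one possible reading: decideNextPoll and PollConfig are hypothetical names, and indexing the schedule by attemptCount - 1 (so the first not-ready poll waits for the first schedule entry, independently of the initial delay) is an assumption, since the spec does not say how the two settings interact.

```typescript
interface PollConfig {
  maxAttempts: number; // DERM_RESULT_MAX_ATTEMPTS
  timeoutMs: number; // DERM_RESULT_TIMEOUT_MS
  backoffScheduleMs: number[]; // DERM_RESULT_BACKOFF_SCHEDULE_MS; last value repeats
}

type PollDecision =
  | { action: "fail"; errorClassification: "derm_result_timeout" }
  | { action: "retry"; delayMs: number };

// attemptCount is the value after incrementing for the poll that just returned
// not-ready; elapsedMs is measured from the awaiting_result transition.
function decideNextPoll(attemptCount: number, elapsedMs: number, cfg: PollConfig): PollDecision {
  if (attemptCount > cfg.maxAttempts || elapsedMs > cfg.timeoutMs) {
    return { action: "fail", errorClassification: "derm_result_timeout" };
  }
  // Clamp to the last entry so the final delay repeats indefinitely.
  const idx = Math.max(0, Math.min(attemptCount - 1, cfg.backoffScheduleMs.length - 1));
  const delayMs = cfg.backoffScheduleMs[idx];
  if (delayMs === undefined) throw new Error("backoff schedule is empty");
  return { action: "retry", delayMs };
}
```

With the default values the six distinct delays sum to 530 s, so under this reading the 600 s wall-time budget would normally trip well before the 20-attempt cap.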

Error taxonomy

| Condition | errorClassification | Retriable? |
|---|---|---|
| 401 on DERM call (first) | derm_auth_transient | Retry once after session invalidation + re-login |
| 401 after re-login | derm_auth_failed | No |
| 5xx / network timeout during start | derm_upstream_transient | Yes, BullMQ job retry (exponential, max 3) |
| 4xx non-401 during start | derm_client_error | No |
| /result never STATUS_PROCESSED within attempt or time budget | derm_result_timeout | No |
| analysisResult !== PROCESSING_ANSWER_TYPE_SUCCESSFUL_PROCESSING | derm_processing_failed | No; record outcome, emit ai_review.failed |
| imageQuality === IMAGE_QUALITY_UNSUITABLE | | Not an error; emit ai_review.completed with quality flags |
| S3 presigned URL expired / 403 on fetch | image_fetch_failed | No; admin must re-trigger with fresh URLs |
| Consent withdrawn between emission and pickup | consent_missing | No; fail before any DERM call |
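
The HTTP-driven rows of the taxonomy (failures during start) can be expressed as a small classifier. This is a sketch under assumptions: classifyStartFailure and StartFailure are hypothetical names, and a missing httpStatus is taken to mean a network error or timeout.

```typescript
type StartErrorClassification =
  | "derm_auth_transient"
  | "derm_auth_failed"
  | "derm_upstream_transient"
  | "derm_client_error";

interface StartFailure {
  httpStatus?: number; // undefined = no response (network error or timeout)
  afterRelogin?: boolean; // true if this call was already retried with a fresh session
}

function classifyStartFailure(f: StartFailure): {
  errorClassification: StartErrorClassification;
  retriable: boolean;
} {
  // No response or 5xx: transient upstream problem, left to BullMQ job retry.
  if (f.httpStatus === undefined || f.httpStatus >= 500) {
    return { errorClassification: "derm_upstream_transient", retriable: true };
  }
  // 401: retriable exactly once, via session invalidation and re-login.
  if (f.httpStatus === 401) {
    return f.afterRelogin
      ? { errorClassification: "derm_auth_failed", retriable: false }
      : { errorClassification: "derm_auth_transient", retriable: true };
  }
  // Any other 4xx: the request itself is wrong; retrying will not help.
  if (f.httpStatus >= 400) {
    return { errorClassification: "derm_client_error", retriable: false };
  }
  throw new Error(`status ${f.httpStatus} is not a failure`);
}
```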

7. Consent

Two layers of defense:

  1. Clinical-api, primary gate. Before emitting ai_review.requested, clinical-api checks that the case's product lists ai_analysis in required_consent_type_codes and the patient has a granted record for that consent type. No event emitted if the check fails. Uses the existing clinical-api ↔ consent-service integration pattern.

  2. ai-review, re-check at job pickup. When the ai_review.start worker picks up a job, it calls the consent service (GET /v1/consent-records?patientId=…&consentTypeCode=ai_analysis) to confirm granted status is still current. If the patient withdrew consent between clinical-api's check and ai-review's pickup, the job terminates with errorClassification='consent_missing'; no DERM call is made.

The consent type code used (ai_analysis — provisional name) is added to the consent service's seeded consent types if not already present. Implementation plan captures this as a seed-data task.
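
The second layer amounts to a gate in front of the first DERM call. A minimal sketch follows, with several assumptions: startWithConsentGate and ConsentLookup are hypothetical names, the consent service is assumed to return a list of records carrying a status field whose granted value is the string 'granted', and how the worker obtains patientId is outside what this section specifies.

```typescript
// Assumed shape of one record from GET /v1/consent-records.
interface ConsentRecord {
  status: string;
}

// Stands in for the HTTP call to the consent service.
type ConsentLookup = (patientId: string, consentTypeCode: string) => Promise<ConsentRecord[]>;

type StartOutcome =
  | { proceeded: true }
  | { proceeded: false; errorClassification: "consent_missing" };

async function startWithConsentGate(
  patientId: string,
  lookup: ConsentLookup,
  runDermFlow: () => Promise<void>, // createCase, uploads, process
): Promise<StartOutcome> {
  const records = await lookup(patientId, "ai_analysis");
  const granted = records.some((r) => r.status === "granted");
  if (!granted) {
    // Terminate before any DERM call is made.
    return { proceeded: false, errorClassification: "consent_missing" };
  }
  await runDermFlow();
  return { proceeded: true };
}
```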


8. Idempotency and Supersede

Active-row invariant

"At most one non-superseded AiReview per (caseId, productId)." Completed and failed rows count as holding the slot until an explicit supersede. Enforced in application code:

BEGIN;
SELECT id, status FROM AiReview
 WHERE caseId = ? AND productId = ? AND status <> 'superseded'
 FOR UPDATE;
-- if a non-superseded row exists and supersede=false:
--   return that row, COMMIT, do not enqueue.
-- otherwise:
--   INSERT new row, COMMIT, enqueue.

Both the ai_review.requested event handler and POST /v1/ai-reviews route go through the same creation path, so duplicate triggers are absorbed identically.
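
The branch taken after the locking SELECT can be sketched as a pure function over the rows it returns. decideCreate, ExistingReview, and CreateDecision are hypothetical names; the 200/202 mapping follows the POST /v1/ai-reviews description in §5, and treating a completed or failed slot-holder as an idempotent hit is an inference from the invariant above.

```typescript
type ReviewStatus =
  | "pending" | "in_progress" | "awaiting_result"
  | "completed" | "failed" | "cancelled" | "superseded";

interface ExistingReview {
  id: string;
  status: ReviewStatus;
}

type CreateDecision =
  | { action: "reuse"; aiReviewId: string; httpStatus: 200 }
  | { action: "insert"; supersedes: string | null; httpStatus: 202 };

// rows: every AiReview for the (caseId, productId) pair, read under FOR UPDATE.
function decideCreate(rows: ExistingReview[], supersede: boolean): CreateDecision {
  const holder = rows.find((r) => r.status !== "superseded");
  if (holder !== undefined && !supersede) {
    // Slot is held: return the existing row and do not enqueue.
    return { action: "reuse", aiReviewId: holder.id, httpStatus: 200 };
  }
  // Slot is free, or the caller asked to supersede the current holder.
  return { action: "insert", supersedes: holder !== undefined ? holder.id : null, httpStatus: 202 };
}
```

Keeping the decision free of I/O makes the duplicate-trigger behaviour easy to unit test separately from the transaction that surrounds it.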

Supersede flow

POST /v1/ai-reviews/:id/supersede (or POST /v1/ai-reviews with supersede=true):

  1. Transactionally:
     a. Mark the existing active row as superseded, set supersededAt.
     b. Insert a new AiReview row with a fresh id, pointing the old row's supersededById at the new row.
  2. Enqueue ai_review.start for the new row.
  3. Emit ai_review.superseded { oldAiReviewId, newAiReviewId, caseId, productId, orgId, supersededAt, reason? }.

Clinical-api consumes ai_review.superseded and marks the corresponding AI diagnosis rows as superseded — requires a small clinical-api schema addition (new diagnosis.status enum value superseded with a superseded_by_diagnosis_id nullable self-reference). Captured in the implementation plan.

Cancel

POST /v1/ai-reviews/:id/cancel is permitted only on pending, in_progress, awaiting_result. Marks the row cancelled; the next job pickup short-circuits. No cancel call is made to DERM (the API does not appear to support one; DERM's case record lingers on their side).
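
The cancel rule and the worker-side short-circuit can be sketched as two small guards. cancelReview and shouldRunJob are hypothetical names, and raising an error for a non-cancellable status is an assumption; the REST layer would presumably translate it into a Problem+JSON response.

```typescript
type ReviewStatus =
  | "pending" | "in_progress" | "awaiting_result"
  | "completed" | "failed" | "cancelled" | "superseded";

// The only statuses from which cancel is permitted.
const CANCELLABLE: ReadonlySet<ReviewStatus> = new Set<ReviewStatus>([
  "pending",
  "in_progress",
  "awaiting_result",
]);

// Returns the new status, or throws if the review is already terminal.
function cancelReview(current: ReviewStatus): ReviewStatus {
  if (!CANCELLABLE.has(current)) {
    throw new Error(`cannot cancel a review in status ${current}`);
  }
  return "cancelled";
}

// Worker-side check at job pickup: run only if the row is still in the status
// the job expects (pending for start, awaiting_result for poll_result).
function shouldRunJob(current: ReviewStatus, expected: ReviewStatus): boolean {
  return current === expected;
}
```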


9. Security

Transport and auth

  • TLS on all inbound and outbound traffic.
  • Inbound: @sa-platform/auth-client JWT validation, scope enforcement.
  • Outbound to consent service: service-to-service JWT with appropriate scopes.
  • Outbound to DERM: credentials via POST /api5/login; session token applied as authorization header on subsequent calls.

Secrets

Enforced by the prod-secret pattern from PR #26. Dev-defaults rejected in production:

  • DERM_USERNAME, DERM_PASSWORD
  • AI_REVIEW_WRAPPED_DEK (service-wide DEK, wrapped by the configured key provider)
  • AI_REVIEW_KMS_KEY_ID, AWS_REGION (when AI_REVIEW_KEY_PROVIDER=kms)

Key provider

Adopts the KeyProvider abstraction from PR #27. AI_REVIEW_KEY_PROVIDER=local|kms (default local for dev); kms path uses KmsKeyProvider out of the box from @sa-platform/common. No new crypto code; mirrors the notifications service's adoption.

PHI

  • AiReviewResult.rawCiphertext is the sole PHI-at-rest field, encrypted with AES-256-GCM under the service-wide DEK. Same pattern as notifications' PHI fields.
  • Lesion projections, image projections, and AiReview metadata are plaintext as justified in §4.
  • Images flow from clinical-api S3 → ai-review (transit) → DERM (transit). Ai-review does not persist image bytes; the multipart request body is streamed and discarded.
  • DERM is a Skin Analytics internal product. The operator-facing implementation plan should confirm with compliance that a Data Processing Agreement covers the data flow; this spec assumes DERM is within scope of the platform's DPA.

Audit

  • Every AiReview state transition, every DERM API call (endpoint, status, latency, correlation id), and every REST request is logged via AuditModule with actor snapshot.
  • S3 object-locked audit archive follows the clinical-api pattern already documented in 2026-04-15-clinical-data-model-design.md.

10. Observability

Structured logs

Per-event fields: correlationId, aiReviewId, caseId, productId, orgId, status, phase (start | poll_result), dermEndpoint, dermLatencyMs, dermHttpStatus, attempt, errorClassification.

Metrics (Prometheus)

| Metric | Type | Purpose |
|---|---|---|
| ai_review_jobs_total{status,errorClass} | counter | Throughput and failure mix |
| ai_review_duration_seconds{phase} | histogram | End-to-end latency of start and poll_result |
| derm_api_calls_total{endpoint,httpStatus} | counter | DERM call volume and status distribution |
| derm_api_latency_seconds{endpoint} | histogram | DERM latency |
| ai_review_result_attempts | histogram | Polling attempts until processed; tunes backoff |
| ai_review_queue_depth | gauge | BullMQ queue depth |

Alerting (out of spec, in operator runbook)

  • ai_review_jobs_total{status="failed"} rate spike
  • derm_api_calls_total{httpStatus=~"5.."} sustained non-zero rate
  • Oldest awaiting_result row age > threshold
  • Queue depth growth unaccompanied by completion throughput

11. Configuration

| Env var | Default | Purpose |
|---|---|---|
| DERM_API_URL | | DERM base URL (e.g. https://cc.derm.skin-analytics.com) |
| DERM_USERNAME | | Login username (prod-enforced) |
| DERM_PASSWORD | | Login password (prod-enforced) |
| DERM_REQUESTING_SYSTEM | Ozone | Sent on case creation |
| DERM_SESSION_TTL_SECONDS | 1800 | Cached session TTL in Redis |
| DERM_RESULT_INITIAL_DELAY_MS | 5000 | Delay before first /result poll |
| DERM_RESULT_MAX_ATTEMPTS | 20 | Max /result polls before timeout |
| DERM_RESULT_TIMEOUT_MS | 600000 | Wall-time budget from first poll |
| DERM_RESULT_BACKOFF_SCHEDULE_MS | 5000,15000,30000,60000,120000,300000 | Comma-separated backoff sequence; last value repeats |
| AI_REVIEW_KEY_PROVIDER | local | local or kms |
| AI_REVIEW_WRAPPED_DEK | | Wrapped service-wide DEK (prod-enforced) |
| AI_REVIEW_KMS_KEY_ID | | Required when AI_REVIEW_KEY_PROVIDER=kms |
| AWS_REGION | eu-west-2 | For KMS |
| CONSENT_SERVICE_URL | | For runtime consent re-check |
| BULL_REDIS_URL | | Shared @sa-platform/common Redis config |
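
A sketch of how the polling settings might be read from the environment, applying the defaults in the table. parseBackoffSchedule and loadPollSettings are hypothetical helpers; rejecting non-integer or non-positive entries is an assumption about what the service would consider valid.

```typescript
// Parses "5000,15000,..." into a list of positive integer millisecond delays.
function parseBackoffSchedule(raw: string): number[] {
  const values = raw.split(",").map((part) => Number(part.trim()));
  if (values.some((v) => !Number.isInteger(v) || v <= 0)) {
    throw new Error(`invalid DERM_RESULT_BACKOFF_SCHEDULE_MS: "${raw}"`);
  }
  return values;
}

// Reads the polling settings, falling back to the documented defaults.
function loadPollSettings(env: Record<string, string | undefined>) {
  return {
    initialDelayMs: Number(env["DERM_RESULT_INITIAL_DELAY_MS"] ?? "5000"),
    maxAttempts: Number(env["DERM_RESULT_MAX_ATTEMPTS"] ?? "20"),
    timeoutMs: Number(env["DERM_RESULT_TIMEOUT_MS"] ?? "600000"),
    backoffScheduleMs: parseBackoffSchedule(
      env["DERM_RESULT_BACKOFF_SCHEDULE_MS"] ?? "5000,15000,30000,60000,120000,300000",
    ),
  };
}
```

In the service itself this would more likely sit behind whatever configuration module the other platform services already use; the plain env record here just keeps the example self-contained.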

12. Testing Strategy

Mirrors the notifications service's split. No live DERM calls in CI; live-DERM smoke is manual at deploy time against a DERM staging environment.

Unit tests (Jest)

  • DermClient adapter with mocked undici: status-code branching, session-refresh on 401, multipart construction, streaming upload, error classification.
  • DermSessionService: Redis cache behaviour, TTL, invalidation on 401.
  • AiReviewService (orchestration): state transitions, idempotency (active-row invariant), supersede flow, cancel flow, retry classification, backoff math for poll_result.
  • Consent re-check: mocked consent-service client, blocked DERM call when revoked.
  • Event publishers: payload shape conformance.
  • Event consumer (ai_review.requested): triggers orchestration, idempotent on duplicate deliveries.

Integration tests (Jest + testcontainers or equivalent local compose)

  • Real MySQL (tenant-scoped), real Redis, mocked DERM via msw or nock.
  • End-to-end:
      • ai_review.requested event → start → uploads → process → result (processed) → ai_review.completed emitted with expected payload.
      • Consent withdrawal path → ai_review.failed with consent_missing; no DERM interaction.
      • 401 refresh path → session re-login, call retried, success.
      • Result-timeout path → ai_review.failed with derm_result_timeout after budget exhaustion.
      • Supersede flow → old row superseded, new row processes, ai_review.superseded emitted.
      • Cancel flow → in-flight row terminates on next worker pickup.
      • Idempotency: duplicate ai_review.requested events yield one review.
      • Image stream bridging: mocked S3 presigned URL → multipart body assembled correctly.

Manual smoke (deploy time)

A small operator script exercises POST /v1/ai-reviews against DERM staging, using a fixed validation image pair. Run per environment post-deploy to confirm live connectivity.


13. Operator Notes

Enabling ai-review in an environment

  1. Provision DERM credentials in the target environment's secret store; populate DERM_* env vars.
  2. Provision the service-wide DEK — generate 32 random bytes, wrap under the chosen provider (LocalKeyProvider or KmsKeyProvider), populate AI_REVIEW_WRAPPED_DEK and key-provider env vars.
  3. Register the ai-review:* scopes in the auth service and grant to the appropriate API client identities.
  4. Ensure the ai_analysis consent type exists in the consent service (seed on first deploy).
  5. Deploy the service, run manual smoke, confirm metrics/logs flowing.

Forward-only PHI stance (inherited)

Consistent with PRs #24, #27: there is no migration of pre-existing rows from a different crypto configuration. The service begins storing data under whatever key provider is configured at first deploy.


14. Out-of-Spec Follow-ups

  • Clinical-api → KmsKeyProvider migration — separate follow-up requiring per-request DEK cache design.
  • Admin UI — list, inspect, retry, view raw DERM response. Product-side.
  • Human-review service — the pair to this service; confidence-based escalation hooks remain TODO until human-review is built.
  • DERM-side webhooks — should DERM publish a push-completion webhook in future, replace polling with the inbound endpoint.
  • Per-product DERM configuration / multi-provider routing — when a second inference backend lands.
  • Rate limiting / token bucket — if DERM publishes hard quotas.
  • Historical backfill CLI — if backfilling becomes a recurring need.
  • FHIR translation — diagnosis rows written from ai_review.completed are already modelled FHIR-compatible; the translation layer itself is deferred platform-wide.