
Architecture

This document describes the deployment topology, data flows, and external data-sharing boundaries for the Skin Analytics clinical-data-model platform.

Note on topology accuracy: The network topology described below and shown in the diagram is the planned production topology. It reflects the intended AWS deployment architecture as captured in the codebase configuration. Confirm with the infrastructure team before treating it as a verified description of the live environment.


1. Deployment Topology

The platform is designed for deployment on AWS within a single VPC containing three subnet tiers: public, private (services), and storage.

See docs/diagrams/network-topology.d2 for the D2 source diagram. Render locally with bash scripts/docs/diagrams.sh after installing D2 (brew install d2).

Public subnet

  • Load Balancer — receives HTTPS traffic from the internet. TLS termination occurs here. Forwards HTTP to services in the private subnet.

Private subnet (services)

All application services run as ECS tasks in the private subnet. They are not directly accessible from the internet. The services are:

Service           Role
auth              OAuth 2.0 token issuance, JWKS, client management
clinical-api      Core PHI store: patients, cases, findings, diagnoses, images
orchestrator      Workflow definition and execution engine
ai-review         AI triage: calls DERM AI, stores results
human-review      Human reviewer queue and decision capture
consent           Patient consent record management
notifications     Email and Slack notification dispatch
user-management   User accounts, org memberships, role/permission management

Storage subnet

  • RDS MySQL — each service has its own isolated MySQL database (separate schemas or separate RDS instances). No service reads another service's database directly.
  • ElastiCache Redis — used for inter-service event pub/sub, JWKS caching, and (in some services) distributed locking.

S3

  • Image bucket — clinical images and histology report files are stored in S3. Metadata (bucket name, key) is recorded in the clinical-api MySQL database. Image bytes never transit through the application tier; pre-signed URLs are used for upload and download.

2. Data Flow

The primary data flow for a clinical case is:

Client (browser/mobile)
  → HTTPS → Load Balancer
  → clinical-api
      → writes Patient/Case/SkinFinding/Image rows to MySQL
      → issues a pre-signed URL; the client uploads image bytes directly to S3
      → on case completion: publishes ai_review.requested event to Redis
  → orchestrator (subscribed via Redis)
      → creates WorkflowInstance, drives step execution
  → ai-review (triggered by orchestrator step)
      → calls DERM AI (external)
      → stores AiReview/AiReviewResult/AiReviewLesion to MySQL
      → publishes ai_review.completed event to Redis
  → clinical-api (subscribed)
      → writes Diagnosis rows from AI review lesion data
  → human-review (triggered by orchestrator step if required)
      → Review queued; clinician claims and submits decision
      → ReviewAuditLog written on each state transition
  → notifications
      → dispatches email (AWS SES) or Slack message on key events
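
The event hand-offs in this flow can be sketched as follows. This is a minimal illustration, assuming an in-memory EventEmitter as a stand-in for Redis pub/sub. The EventBus class, the payload shapes, and the trace array are hypothetical and are not taken from the codebase; only the two event names come from the flow above.

```typescript
import { EventEmitter } from "node:events";

// Event names come from the data flow above; payload shapes are assumptions.
type PlatformEvents = {
  "ai_review.requested": { caseId: string };
  "ai_review.completed": { caseId: string; lesionCount: number };
};

// In-memory stand-in for the Redis pub/sub channel (illustration only).
class EventBus {
  private readonly emitter = new EventEmitter();

  publish<K extends keyof PlatformEvents>(name: K, payload: PlatformEvents[K]): void {
    this.emitter.emit(name, payload);
  }

  subscribe<K extends keyof PlatformEvents>(
    name: K,
    handler: (payload: PlatformEvents[K]) => void,
  ): void {
    this.emitter.on(name, handler);
  }
}

// Records the order in which each service acts, so the flow can be inspected.
const trace: string[] = [];
const bus = new EventBus();

// orchestrator: reacts to the request and triggers the ai-review step.
bus.subscribe("ai_review.requested", ({ caseId }) => {
  trace.push(`orchestrator:start:${caseId}`);
  runAiReview(caseId);
});

// ai-review: would call DERM AI here; the result is hard-coded for the sketch.
function runAiReview(caseId: string): void {
  trace.push(`ai-review:analyse:${caseId}`);
  bus.publish("ai_review.completed", { caseId, lesionCount: 2 });
}

// clinical-api: writes Diagnosis rows when the review completes.
bus.subscribe("ai_review.completed", ({ caseId, lesionCount }) => {
  trace.push(`clinical-api:diagnoses:${caseId}:${lesionCount}`);
});

// clinical-api: case completion publishes the request event.
bus.publish("ai_review.requested", { caseId: "case-1" });
```

Typing the event map keeps publishers and subscribers in agreement about payload fields. A Redis-backed bus would additionally need serialisation and delivery handling, which this sketch omits.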

A higher-level view of the architecture is shown in the architecture glance diagram (rendered inline for the SMT audience).


3. Region and Jurisdiction

TBD — verify with infra team.

The codebase configures an S3 region via the S3_REGION environment variable (default: us-east-1 in development). The KMS key ARN is passed via KMS_CMK_ARN. The actual production AWS region and data residency jurisdiction must be confirmed with the infrastructure team before any commitment is made to regulators.

The Organisation.region field (services/clinical-api/prisma/schema.prisma) exists to support future multi-region deployments but is not used for data routing in v1.
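
One way to keep the development default from reaching production unnoticed is to fail fast when the variables are missing. The sketch below is illustrative only: loadStorageConfig, StorageConfig, and the use of NODE_ENV to detect production are assumptions, not the codebase's actual configuration loader. Only the names S3_REGION and KMS_CMK_ARN and the us-east-1 development default come from the text above.

```typescript
// Hypothetical config shape; only the two variable names come from the text.
interface StorageConfig {
  s3Region: string;
  kmsCmkArn: string | undefined;
}

type Env = Record<string, string | undefined>;

// Reads S3_REGION and KMS_CMK_ARN. In production both must be set explicitly,
// so the development default (us-east-1) is never applied silently.
function loadStorageConfig(env: Env): StorageConfig {
  // Assumption: production is signalled by NODE_ENV=production.
  const isProduction = env["NODE_ENV"] === "production";
  const s3Region = env["S3_REGION"];
  const kmsCmkArn = env["KMS_CMK_ARN"];

  if (isProduction) {
    if (!s3Region) throw new Error("S3_REGION must be set in production");
    if (!kmsCmkArn) throw new Error("KMS_CMK_ARN must be set in production");
    return { s3Region, kmsCmkArn };
  }

  // Development: fall back to the documented default region.
  return { s3Region: s3Region || "us-east-1", kmsCmkArn };
}
```

A service would typically call loadStorageConfig(process.env) once at startup, so a misconfigured deployment stops before handling any data.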


4. External Data Flows

The following data flows cross the platform boundary (i.e. data leaves the AWS VPC):

DERM AI (third-party AI provider)

  • Direction: Outbound from ai-review service.
  • Data transmitted: Clinical images (via DERM image upload API) and case metadata sufficient for AI analysis. The DERM case ID, image hashes, and analysis results are stored back in AiReview / AiReviewResult / AiReviewLesion tables.
  • Citation: services/ai-review/ — the DERM client module handles the integration.
  • Data protection: Verify DERM's data processing agreement and sub-processor status with the legal/DPO team. This is the primary third-party PHI transmission in the platform.

AWS SES (email notifications)

  • Direction: Outbound from notifications service.
  • Data transmitted: Notification content rendered from templates. Templates with phi = true in the Template model may include PHI fields. Recipient email addresses are resolved from user-management service.
  • Citation: services/notifications/prisma/schema.prisma, Template.phi flag.

Slack Webhooks

  • Direction: Outbound from notifications service.
  • Data transmitted: Slack notification payloads carrying operational and clinical alerts. Any PHI content in Slack messages should be reviewed against your Slack DPA.
  • Citation: services/notifications/prisma/schema.prisma, Channel.slack enum value.
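
The phi flag on templates could support a dispatch-time guard for both notification channels. The sketch below is a hypothetical illustration, not the notifications service's actual behaviour. The canDispatch function, the PhiPolicy type, and the TemplateInfo shape are assumptions, and the policy values would be set by the legal/DPO review rather than in code.

```typescript
// Channel names mirror the two outbound notification paths described above.
type Channel = "email" | "slack";

// Minimal view of a template; only the phi flag comes from the source.
interface TemplateInfo {
  name: string;
  phi: boolean;
}

// Hypothetical policy: which channels have been approved to carry PHI.
type PhiPolicy = Readonly<Record<Channel, boolean>>;

// Non-PHI templates may go to any channel; PHI templates need an approved channel.
function canDispatch(template: TemplateInfo, channel: Channel, policy: PhiPolicy): boolean {
  return !template.phi || policy[channel];
}

// Placeholder values for illustration, not a statement of platform policy.
const examplePolicy: PhiPolicy = { email: true, slack: false };
```

Under examplePolicy, a template with phi = true would be allowed on email and refused on Slack, while a template with phi = false would be allowed on both.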

AWS KMS

  • Direction: Outbound from clinical-api and ai-review services (production only).
  • Data transmitted: DEK bytes (32-byte random keys) are transmitted to KMS for wrapping/unwrapping. No PHI content is sent to KMS. KMS calls are within the AWS network boundary.
  • Citation: packages/common/src/crypto/kms-key-provider.ts.
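
The envelope pattern described here, where only DEK bytes reach the key service, can be sketched as below. This is a minimal illustration under stated assumptions and does not reproduce kms-key-provider.ts. The KeyProvider interface, the LocalKeyProvider stand-in, the Envelope shape, and the choice of AES-256-GCM are all hypothetical. The interface is synchronous for brevity; a KMS-backed provider would make network calls and be asynchronous.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical interface: wraps and unwraps a DEK. In production this role is
// played by KMS; only DEK bytes cross this boundary, never PHI.
interface KeyProvider {
  wrap(dek: Buffer): Buffer;
  unwrap(wrapped: Buffer): Buffer;
}

// Local stand-in for KMS, for illustration only. It wraps the DEK with
// AES-256-GCM under an in-process key and returns iv + tag + ciphertext.
class LocalKeyProvider implements KeyProvider {
  private readonly kek = randomBytes(32);

  wrap(dek: Buffer): Buffer {
    const iv = randomBytes(12);
    const cipher = createCipheriv("aes-256-gcm", this.kek, iv);
    const body = Buffer.concat([cipher.update(dek), cipher.final()]);
    return Buffer.concat([iv, cipher.getAuthTag(), body]);
  }

  unwrap(wrapped: Buffer): Buffer {
    const decipher = createDecipheriv("aes-256-gcm", this.kek, wrapped.subarray(0, 12));
    decipher.setAuthTag(wrapped.subarray(12, 28));
    return Buffer.concat([decipher.update(wrapped.subarray(28)), decipher.final()]);
  }
}

interface Envelope {
  wrappedDek: Buffer;
  iv: Buffer;
  authTag: Buffer;
  ciphertext: Buffer;
}

// Encrypts locally with a fresh 32-byte DEK; the plaintext never reaches the provider.
function encryptField(plaintext: string, keys: KeyProvider): Envelope {
  const dek = randomBytes(32);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", dek, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { wrappedDek: keys.wrap(dek), iv, authTag: cipher.getAuthTag(), ciphertext };
}

// Unwraps the DEK via the provider, then decrypts and authenticates locally.
function decryptField(envelope: Envelope, keys: KeyProvider): string {
  const dek = keys.unwrap(envelope.wrappedDek);
  const decipher = createDecipheriv("aes-256-gcm", dek, envelope.iv);
  decipher.setAuthTag(envelope.authTag);
  return Buffer.concat([decipher.update(envelope.ciphertext), decipher.final()]).toString("utf8");
}
```

Because encryption and decryption of the field happen inside the service, the key provider only ever sees the 32-byte DEK, which matches the statement that no PHI content is sent to KMS.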

Customer Webhooks

  • Direction: Outbound from clinical-api.
  • Data transmitted: Event payloads to customer-configured webhook URLs (WebhookSubscription model). The Event.payload field content depends on the event type; some event types may include case-level identifiers.
  • Citation: services/clinical-api/prisma/schema.prisma, WebhookSubscription and WebhookDelivery models.
  • Note: Customer webhook endpoints are outside the platform boundary. Payloads are HMAC-signed using the secret field on WebhookSubscription, but TLS verification of the customer's endpoint is an operational concern.
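
HMAC signing of webhook payloads can be sketched as below, covering both the sender and the receiver side. The source states only that HMAC signing is used with the subscription secret. The SHA-256 digest, the hex encoding, and the function names signPayload and verifySignature are assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes a hex HMAC over the exact request body.
// Assumption: SHA-256 with hex output; the source does not name the algorithm.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Receiver-side check using a constant-time comparison.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A customer endpoint would recompute the signature over the raw request body before parsing it, and reject the delivery when verifySignature returns false. Signing proves origin and integrity only; confidentiality still depends on TLS to the customer's endpoint.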