Retention + GDPR Erasure
This document describes the data retention policy mechanism, the crypto-shredding erasure approach, the GDPR Article 17 right-to-erasure flow, legal holds, and data subject access requests (DSARs). All code citations are verified against the source.
1. Retention Policy Mechanism
Configuration model
Retention is governed by RetentionPolicy rows in the clinical-api database
(services/clinical-api/prisma/schema.prisma):
RetentionPolicy {
  organisationId   // tenant
  entityType       // e.g. "Patient", "Case"
  retentionDays    // how long to keep after the relevant date
  action           // default: "soft_delete"
  active
}
Each Organisation can have one RetentionPolicy per entity type (unique constraint on
[organisationId, entityType]).
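As a rough illustration of what the composite unique constraint implies, the sketch below keys an in-memory store on the (organisationId, entityType) pair. `PolicyStore` and `policyKey` are hypothetical names, and treating a second write as a replacement (upsert) is an assumption; the text above does not say whether the real endpoint upserts or rejects a duplicate.

```typescript
// Hypothetical in-memory model; field names follow the RetentionPolicy summary above.
interface RetentionPolicy {
  organisationId: string;
  entityType: string;
  retentionDays: number;
  action: string;
  active: boolean;
}

// Composite key mirroring the unique constraint on [organisationId, entityType].
function policyKey(organisationId: string, entityType: string): string {
  return `${organisationId}::${entityType}`;
}

class PolicyStore {
  private readonly byKey = new Map<string, RetentionPolicy>();

  // Assumption: a second write for the same org and entity type replaces the first.
  upsert(policy: RetentionPolicy): void {
    this.byKey.set(policyKey(policy.organisationId, policy.entityType), policy);
  }

  find(organisationId: string, entityType: string): RetentionPolicy | undefined {
    return this.byKey.get(policyKey(organisationId, entityType));
  }

  count(): number {
    return this.byKey.size;
  }
}
```

Whatever the write semantics turn out to be, the constraint guarantees at most one policy per entity type within an organisation.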
Enforcement
The retention module is located at
services/clinical-api/src/retention/retention.service.ts and is exposed via the
RetentionController (services/clinical-api/src/retention/retention.controller.ts).
Planned: automated nightly cron execution of retention policies is not shipped in v1. The data model, API endpoints, and service logic exist, but no scheduler or cron job invokes the retention scan automatically. Enforcement must currently be triggered out-of-band (e.g. via direct API call or an external scheduler).
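The scan logic itself is not reproduced here. As a minimal sketch of how an external scheduler might decide which records are due, assuming retentionDays is counted in whole 24-hour days from a record's relevant date, a cutoff can be computed as follows. `retentionCutoff` and `isPastRetention` are illustrative names, not functions from retention.service.ts.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the instant before which records fall outside the retention window.
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// A record is due for the policy action once its relevant date is older than the cutoff.
function isPastRetention(relevantDate: Date, now: Date, retentionDays: number): boolean {
  return relevantDate.getTime() < retentionCutoff(now, retentionDays).getTime();
}
```

Which timestamp counts as the "relevant date" for each entity type is not specified in this document and would need to be confirmed against the service.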
Retention API endpoints (all require the events:read scope in v1; this scope assignment
may be reviewed in a future plan):
GET /v1/clinical-api/retention/policies
POST /v1/clinical-api/retention/policies
POST /v1/clinical-api/retention/legal-holds
DELETE /v1/clinical-api/retention/legal-holds/:id
POST /v1/clinical-api/retention/erase
2. Crypto-Shredding
Crypto-shredding is the mechanism used to make encrypted patient data unreadable without physically deleting every encrypted column. The process is:
- The patient's encryptedDek column in the Patient row is set to null.
- Without the DEK, every AES-256-GCM ciphertext encrypted under that key becomes computationally unrecoverable.
- The patient row itself is soft-deleted (deletedAt = now()).
Implementation in services/clinical-api/src/retention/erasure.service.ts:
// 1. Crypto-shred: null out the DEK
await this.prisma.patient.update({
  where: { id: patientId },
  data: { encryptedDek: null, deletedAt: new Date() },
});
The DekResolver (packages/common/src/crypto/dek-resolver.ts) reads encryptedDek on
every request; once it is null, all decrypt calls for that patient will throw
"No encryption key found for patient: <id>", making the data inaccessible at the application
layer.
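The lookup behaviour described above can be sketched without any real cryptography. In the example below, `SketchDekResolver` and `PatientKeyRow` are hypothetical stand-ins and the Map replaces the database; only the encryptedDek field and the error message are taken from the text. It is not the actual dek-resolver.ts implementation.

```typescript
// Hypothetical row shape; only encryptedDek comes from the description above.
interface PatientKeyRow {
  id: string;
  encryptedDek: string | null;
}

class SketchDekResolver {
  private readonly rows: Map<string, PatientKeyRow>;

  constructor(rows: Map<string, PatientKeyRow>) {
    this.rows = rows;
  }

  // Reads the wrapped DEK on every call, so a shredded key takes effect immediately.
  resolve(patientId: string): string {
    const row = this.rows.get(patientId);
    if (!row || row.encryptedDek === null) {
      throw new Error(`No encryption key found for patient: ${patientId}`);
    }
    return row.encryptedDek;
  }
}
```

Because nothing is cached in this sketch, nulling the column is enough to make the next decrypt attempt fail; whether the real resolver caches keys is not stated here.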
The underlying key wrapping is provided by LocalKeyProvider (dev) or KmsKeyProvider
(prod) — see Security Model §3.
3. GDPR Article 17 Erasure (Right to Erasure)
The erasure flow is implemented in
services/clinical-api/src/retention/erasure.service.ts —
ErasureService.erasePatient(patientId).
The flow proceeds as follows:
- Tenant verification: The requesting client's organisationId (from the OAuth token) must match the patient's organisationId. Cross-tenant erasure is rejected.
- Legal hold check: If any LegalHold row exists for the patient with releasedAt = null, erasure is blocked and the reason is returned:
  { erased: false, reason: "Patient is under legal hold: <reason>" }
- Crypto-shred: encryptedDek is nulled and deletedAt is set (see §2 above).
- S3 object deletion: The service enumerates all cases for the patient, then all findings per case, then all images per finding, and deletes each S3 object (StorageService.deleteObject(bucket, key)). S3 deletion errors are swallowed so that a partial storage failure does not block the erasure (the DEK null in particular).
- Histology report file deletion: S3 objects for all ReportFile records attached to the patient's cases are also deleted.
The erasure endpoint:
POST /v1/clinical-api/retention/erase
Body: { "patient_id": "<uuid>" }
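The ordering of the erasure steps can be sketched with in-memory stand-ins. Everything below is illustrative: `erasePatientSketch`, the interfaces, and the injected `deleteObject` callback are hypothetical, the success result `{ erased: true }` is an assumption (only the legal-hold rejection shape is documented), and throwing on a tenant mismatch is one possible way to "reject" that the text does not confirm.

```typescript
// Hypothetical in-memory stand-ins; the real service uses Prisma and StorageService.
interface Patient {
  id: string;
  organisationId: string;
  encryptedDek: string | null;
  deletedAt: Date | null;
}
interface LegalHold {
  patientId: string;
  reason: string;
  releasedAt: Date | null;
}
interface StoredObject {
  bucket: string;
  key: string;
}
interface ErasureResult {
  erased: boolean;
  reason?: string;
}

function erasePatientSketch(
  callerOrganisationId: string,
  patient: Patient,
  holds: LegalHold[],
  objects: StoredObject[],
  deleteObject: (bucket: string, key: string) => void,
): ErasureResult {
  // Tenant verification (assumption: rejection is modelled as a thrown error).
  if (patient.organisationId !== callerOrganisationId) {
    throw new Error("Cross-tenant erasure rejected");
  }

  // Legal hold check: any unreleased hold blocks erasure.
  const active = holds.find((h) => h.patientId === patient.id && h.releasedAt === null);
  if (active) {
    return { erased: false, reason: `Patient is under legal hold: ${active.reason}` };
  }

  // Crypto-shred: null the DEK and soft-delete the row.
  patient.encryptedDek = null;
  patient.deletedAt = new Date();

  // Object deletion (images and report files): errors are swallowed per object.
  for (const obj of objects) {
    try {
      deleteObject(obj.bucket, obj.key);
    } catch {
      // Swallowed so one failed delete does not stop the remaining deletes.
    }
  }

  return { erased: true };
}
```

In this sketch the DEK is nulled before any storage call, so a storage outage leaves orphaned ciphertext objects rather than readable data.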
Synthetic-org bulk purge
A separate POST /v1/clinical-api/admin/organisations/:id/purge-data endpoint (admin-secret bearer)
clears every clinical record and the org's S3 image, image-derivative, and report-file
objects in one call. It refuses non-synthetic orgs with 403. It returns purgedCounts, including
s3Objects (deleted) and s3ObjectsFailed (per-key errors that were logged but did not
fail the transaction). The simulator workflow uses it to reset the synthetic tenant
between load runs without leaking 35 GiB of orphan MinIO objects.
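The per-key counting behaviour can be illustrated as below. This is a sketch under assumptions: `purgeObjectsSketch`, the `isSynthetic` flag, and the 200 success status are hypothetical, and only the 403 refusal and the s3Objects / s3ObjectsFailed counters come from the description above.

```typescript
interface PurgeOutcome {
  status: number;
  purgedCounts?: { s3Objects: number; s3ObjectsFailed: number };
}

function purgeObjectsSketch(
  isSynthetic: boolean,
  keys: string[],
  deleteKey: (key: string) => void,
): PurgeOutcome {
  // Non-synthetic organisations are refused before anything is deleted.
  if (!isSynthetic) {
    return { status: 403 };
  }

  let s3Objects = 0;
  let s3ObjectsFailed = 0;
  for (const key of keys) {
    try {
      deleteKey(key);
      s3Objects += 1;
    } catch (err) {
      // Per-key errors are logged and counted, not rethrown.
      console.warn(`purge: failed to delete ${key}`, err);
      s3ObjectsFailed += 1;
    }
  }

  // Assumption: 200 on success; the real status code is not documented here.
  return { status: 200, purgedCounts: { s3Objects, s3ObjectsFailed } };
}
```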
What is retained after a patient erasure:
- Patient row (soft-deleted, encryptedDek = null): required for audit linkage.
- AuditLog rows: audit records are not deleted, to preserve the audit trail.
- LegalHold rows: remain for compliance history.
- PatientMergeHistory: merge history is retained for regulatory purposes.
- Database rows for Case, SkinFinding, Diagnosis, etc.: row shells remain but their encrypted content fields are rendered unreadable.
Limitation in v1: Encrypted content in ai-review (AiReviewResult.rawCiphertext) is not
deleted by the erasure service. The raw ciphertext uses the patient's DEK indirectly (it is
encrypted with a DEK fetched for the case), but the ErasureService does not explicitly
null or delete ai_review_result rows. The DEK nulling makes the ciphertext unreadable, but
the ciphertext bytes remain on disk. This should be addressed in a future plan.
4. Legal Holds
A LegalHold record prevents erasure for patients subject to litigation, regulatory
investigation, or other legal requirements. The model is defined in
services/clinical-api/prisma/schema.prisma:
LegalHold {
  id              PK
  organisationId  FK → Organisation
  patientId       FK → Patient
  reason          VARCHAR(255)
  holdSince       default now()
  releasedAt      DateTime? // null = active hold
}
A patient may have multiple holds. An erasure request will be rejected if any hold has
releasedAt = null.
Holds are managed via:
POST /v1/clinical-api/retention/legal-holds { patient_id, reason }
DELETE /v1/clinical-api/retention/legal-holds/:id (sets releasedAt = now())
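The "any active hold blocks erasure" rule, with multiple holds per patient, can be sketched as follows. `activeHolds` and `releaseHold` are hypothetical helpers operating on an in-memory array; the field names follow the LegalHold model above.

```typescript
interface LegalHold {
  id: string;
  patientId: string;
  reason: string;
  holdSince: Date;
  releasedAt: Date | null; // null = active hold
}

// Holds that still block erasure for the given patient.
function activeHolds(holds: LegalHold[], patientId: string): LegalHold[] {
  return holds.filter((h) => h.patientId === patientId && h.releasedAt === null);
}

// Mirrors the DELETE endpoint's effect: the row is kept and releasedAt is stamped.
// Returns false if the hold is unknown or already released.
function releaseHold(holds: LegalHold[], id: string, now: Date): boolean {
  const hold = holds.find((h) => h.id === id);
  if (!hold || hold.releasedAt !== null) {
    return false;
  }
  hold.releasedAt = now;
  return true;
}
```

Erasure becomes possible only once `activeHolds` returns an empty array, that is, after every hold on the patient has been released.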
5. Data Subject Access Requests (DSAR)
Planned: there is no dedicated DSAR export endpoint in v1.
A data subject's data can be retrieved by combining the existing read endpoints:
- GET /v1/clinical-api/patients/:id (patient record)
- GET /v1/clinical-api/patients/:id/cases (cases)
- GET /v1/clinical-api/cases/:id/findings (skin findings and diagnoses)
- GET /v1/clinical-api/patients/:id/consents (via consent service)
This process is currently manual. An automated DSAR export endpoint (collating all data for a patient into a structured package) is flagged as future work.
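Until such an endpoint exists, an operator script could start by enumerating the read calls to make. The helper below is a hypothetical sketch: `dsarReadPaths` is not part of the codebase, and it assumes the case ids have already been obtained from the cases listing.

```typescript
// Builds the list of GET paths to fetch for one data subject.
// Paths follow the read endpoints listed above; the consents path is
// reproduced as listed, although it is served via the consent service.
function dsarReadPaths(patientId: string, caseIds: string[]): string[] {
  const base = "/v1/clinical-api";
  const pid = encodeURIComponent(patientId);
  return [
    `${base}/patients/${pid}`,
    `${base}/patients/${pid}/cases`,
    ...caseIds.map((id) => `${base}/cases/${encodeURIComponent(id)}/findings`),
    `${base}/patients/${pid}/consents`,
  ];
}
```

The responses would still need to be fetched and collated by hand or by a wrapper script; this only fixes the set of calls.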