Mapping personal data flows across a fictional SaaS organisation, assessing compliance of Records of Processing Activities under GDPR Article 30, and identifying four gap categories with structured remediation guidance.
A simulated data flow audit was conducted for Verada Health Tech (fictional SaaS organisation processing patient-adjacent data). Analysis of five core processing activities revealed that two lacked a documented lawful basis, one transferred personal data to a third-country processor without adequate safeguards, and none had retention periods formally defined in the ROPA. Risk is rated High for the third-country transfer and Medium for the remaining gaps. Four structured recommendations are provided with GDPR article references and ISO 27701 mapping.
Organisation: Verada Health Tech (fictional) — a B2B SaaS company providing workflow tools to physiotherapy clinics across the EU. The platform collects appointment data, practitioner notes, and basic patient identifiers on behalf of clinic customers, who act as data controllers. Verada is a data processor under GDPR Article 4(8).
The audit objective was to reconstruct the organisation's data flows from first principles, verify them against the existing ROPA (which had not been updated since 2022), identify gaps, and produce an updated gap register with remediation priorities.
Methodology followed a three-stage approach used in real DPO engagements: (1) data flow discovery via system inventory and interview simulation, (2) ROPA comparison against documented flows, (3) gap classification and framework mapping.
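To make stage (3) concrete, a gap-register entry can be modelled as a small structured record before it ever reaches a spreadsheet. This is a sketch only: the class, field names, and example values below are illustrative, not drawn from a real register, and the ISO 27701 control reference should be verified against the standard text.

```python
from dataclasses import dataclass

# Hypothetical gap-register record. Field names are illustrative;
# a real register would follow the organisation's GRC tooling.
@dataclass
class GapRecord:
    processing_activity: str
    gap_category: str       # e.g. 'lawful_basis', 'retention', 'transfer', 'documentation'
    gdpr_article: str       # primary GDPR article reference
    iso_27701_control: str  # mapped ISO/IEC 27701 control (verify against the standard)
    risk: str               # 'High' / 'Medium' / 'Low'
    remediation: str

# Example entry for the transfer gap discussed later in this audit
gap = GapRecord(
    processing_activity='Usage analytics',
    gap_category='transfer',
    gdpr_article='Art. 46',
    iso_27701_control='7.5.1',
    risk='High',
    remediation='Execute SCCs with the processor or suspend the transfer',
)
print(gap.gap_category, gap.risk)
```

Structuring gaps this way means the register can be sorted, filtered, and exported programmatically in stage (3) rather than maintained by hand.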
The following diagram reconstructs personal data flows across Verada's platform architecture. Five processing activities were identified. Third-party processors are highlighted where data leaves the EU/EEA.
GDPR Article 30(2) requires processors to maintain a Record of Processing Activities documenting, at minimum: the name and contact details of the processor and of each controller on whose behalf it acts, the categories of processing carried out for each controller, transfers to third countries together with the safeguards in place, and, where possible, a general description of security measures. Retention periods are strictly an Article 30(1)(f) controller element, but they were assessed here as well because the Verada ROPA template includes a retention field. The existing Verada ROPA was assessed against each element.
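For reference during the assessment, the Article 30(2) sub-points can be mapped to the ROPA fields this audit checks. A sketch, with one assumption flagged: the field names are this audit's own naming, and the paraphrased sub-points are shortened from the regulation text and should be read against it.

```python
# Illustrative mapping of ROPA fields to Article 30(2) sub-points.
# Sub-point wording is paraphrased, not quoted, from the GDPR.
ART_30_2 = {
    'processor_contact':  'Art. 30(2)(a): name and contact details of processor and controllers',
    'data_categories':    'Art. 30(2)(b): categories of processing per controller',
    'third_country':      'Art. 30(2)(c): third-country transfers, incl. identification of the country',
    'transfer_safeguard': 'Art. 30(2)(c): documentation of suitable safeguards',
    'security_measures':  'Art. 30(2)(d): general description of security measures (where possible)',
}

for field, requirement in ART_30_2.items():
    print(f'{field:20} -> {requirement}')
```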
| Processing activity | Data categories | Lawful basis | Retention | 3rd country | DPA clause | Status |
|---|---|---|---|---|---|---|
| Appointment scheduling | Name, DOB, contact | Art. 6(1)(b) | Missing | No | Present | Partial |
| Email notifications | Email address, appt ref | Art. 6(1)(b) | Missing | USA (SendGrid) | SCCs signed | Partial |
| Usage analytics | User ID, click events | Undocumented | Missing | USA (Mixpanel) | No SCCs | Non-compliant |
| Support tickets | Name, issue description | Assumed Art. 6(1)(f) | Missing | EU (Zendesk) | SCCs signed | Partial |
| Staff access logs | Employee ID, timestamps | Art. 6(1)(c) | Missing | No | N/A | Partial |
The following script was used to load the ROPA as a structured dataset, calculate a completeness score per processing activity, and flag records below the compliance threshold. This approach demonstrates how data engineering skills apply directly to GRC evidence work.
```python
import pandas as pd

# Required ROPA fields under GDPR Art. 30(2)
REQUIRED_FIELDS = [
    'processing_activity', 'data_categories', 'lawful_basis',
    'retention_period', 'third_country', 'transfer_safeguard',
    'dpa_clause_ref', 'processor_contact',
]

COMPLIANCE_THRESHOLD = 80  # percent; records scoring below this are flagged

df = pd.read_csv('ropa_verada_2022.csv')

# Score completeness: 1 point per non-null, non-empty required field
def completeness(row):
    filled = sum(
        1 for f in REQUIRED_FIELDS
        if pd.notna(row.get(f)) and str(row.get(f)).strip() != ''
    )
    return round(filled / len(REQUIRED_FIELDS) * 100, 1)

df['score'] = df.apply(completeness, axis=1)

# Flag critical gaps: a genuine third-country transfer with no safeguard.
# Literal 'No' / 'N/A' entries are excluded so EU-only activities are not
# mistaken for transfers.
has_transfer = ~df['third_country'].fillna('').str.strip().str.lower().isin(['', 'no', 'n/a'])
no_safeguard = df['transfer_safeguard'].fillna('').str.strip().eq('')
df['critical_gap'] = has_transfer & no_safeguard

gaps = (
    df[df['score'] < COMPLIANCE_THRESHOLD]
    [['processing_activity', 'score', 'critical_gap']]
    .sort_values('score')
)
print(gaps.to_string(index=False))

for _, row in df[df['critical_gap']].iterrows():
    print(f"FLAG: {row['processing_activity']} — 3rd country transfer "
          f"({row['third_country']}) with no Standard Contractual Clauses "
          f"on file. GDPR Art. 46 breach. Immediate action required.")

##    processing_activity  score  critical_gap
##        Usage analytics   37.5          True   ← HIGH RISK
##        Support tickets   62.5         False
## Appointment scheduling   75.0         False
##    Email notifications   75.0         False
## FLAG: Usage analytics — 3rd country transfer (USA (Mixpanel)) with no
## Standard Contractual Clauses on file. GDPR Art. 46 breach.
## Immediate action required.
```
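The scored records then feed the gap register's risk ratings described in the summary: High where a third-country transfer lacks safeguards, Medium for the remaining sub-threshold gaps. A minimal sketch of that step follows; the rows are hard-coded to mirror the script's output, since the underlying CSV is not reproduced here.

```python
import pandas as pd

# Hard-coded audit results mirroring the completeness script's output;
# a real run would take the scored DataFrame directly.
records = [
    {'processing_activity': 'Appointment scheduling', 'score': 75.0, 'critical_gap': False},
    {'processing_activity': 'Email notifications',    'score': 75.0, 'critical_gap': False},
    {'processing_activity': 'Usage analytics',        'score': 37.5, 'critical_gap': True},
    {'processing_activity': 'Support tickets',        'score': 62.5, 'critical_gap': False},
]
df = pd.DataFrame(records)

# High where a third-country transfer lacks safeguards (Art. 46 exposure);
# Medium for every other sub-threshold record, per the audit summary.
df['risk'] = df['critical_gap'].map({True: 'High', False: 'Medium'})

# Sort High first, then by ascending completeness score
gap_register = df.sort_values(['risk', 'score']).reset_index(drop=True)
print(gap_register.to_string(index=False))
```

Keeping the rating rule in code rather than applied by hand means the register can be regenerated whenever the ROPA is re-scored, which is the point of treating compliance evidence as a dataset.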
The most significant finding in this audit — the Mixpanel transfer without SCCs — was not visible in the ROPA at all. It was discovered by reconstructing the data flow from the system architecture and then checking each third-party integration against the documented record. This is where a background in data engineering changes the quality of a GDPR audit: rather than relying entirely on self-reported processing inventories, an auditor who understands how data pipelines actually work can identify what the inventory is missing.
The gap between what an organisation believes it does with data and what it actually does with data is where most regulatory risk lives. The ROPA audit is the mechanism for closing that gap — but only if the person conducting it knows what to look for at the pipeline level, not just the policy level.