Public Health Engagement & Data Portal - System Design

This document presents a system design for a Public Health Engagement & Data Portal built to address the critical challenge of unreliable connectivity in African public health programs. The system enables community health workers to record patient encounters, log intervention activities, and coordinate with teams regardless of network availability, with intelligent synchronization when connectivity returns.

System Architecture Diagram

Executive Summary

The platform treats offline operation as the default, not an edge case. Field workers can operate fully offline, with intelligent conflict resolution and synchronization when connectivity returns. The system integrates with national health information systems like DHIS2 and supports multiple organizations (ministries of health, NGOs, research institutions) from shared infrastructure while maintaining strict data isolation.

The design addresses real-world constraints: inconsistent power, shared devices, varying technical literacy, and the need to integrate with legacy EMR systems. The architecture follows FHIR R4 standards for health data interoperability, ensuring compatibility with modern health information systems.

Technology Stack

| Layer | Technology | Rationale |
|---|---|---|
| Frontend | React + React Native | Single codebase expertise, strong PWA/offline support |
| Backend | Node.js | Team familiarity, good async I/O for sync operations |
| Primary Database | PostgreSQL | Mature, PostGIS for mapping, row-level security for tenants |
| Sync Queue | DynamoDB | Handles burst writes during mass sync events |
| Event Streaming | Kafka | Reliable delivery for health data—guaranteed ordering |
| Cloud Platform | AWS | Best regional presence near Africa (eu-west-1, af-south-1) |

Key Architectural Decisions

Dedicated Sync Service

Synchronization is the most complex part of the system. Conflict resolution, queue management, and retry logic are isolated in a dedicated service that can scale independently. When 500 field workers come back online after a regional network outage, the Sync Service handles the burst load without impacting other services.
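One way to absorb that reconnection burst is to spread retries out in time. The sketch below (an illustration, not the actual implementation; the function name and constants are assumptions) shows exponential backoff with full jitter, so 500 clients retrying a failed sync do not hit the Sync Service in lockstep:

```typescript
// Sketch of a retry policy the Sync Service could apply to failed sync
// attempts. Exponential backoff caps the wait; "full jitter" randomizes
// it so simultaneous reconnects spread their load over the window.
function retryDelayMs(attempt: number, baseMs = 1_000, capMs = 5 * 60_000): number {
  // Exponential backoff: baseMs * 2^attempt, capped at capMs.
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick a uniform delay in [0, exp) to de-synchronize clients.
  return Math.floor(Math.random() * exp);
}
```

Because each client draws an independent random delay, the burst arrives as a smear of requests rather than a spike.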

Kafka for Event Streaming

Health data updates require guaranteed delivery with ordering. If a patient record is created and then updated, those events must be processed in sequence. Kafka's partition-based ordering provides this guarantee. While operationally complex, the data integrity requirements justify the choice over simpler alternatives like SQS or Redis pub/sub.
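Concretely, per-patient ordering falls out of keying every event by patient ID: Kafka guarantees order only within a partition, and all messages with the same key land on the same partition. The hash below is a simplified stand-in for the producer's key-to-partition mapping (Kafka's default partitioner actually uses murmur2):

```typescript
// Illustrative only: events for the same patient must share a partition
// so Kafka's per-partition ordering covers them. Keying by patient ID
// achieves this; this simple rolling hash mimics the key -> partition
// mapping (the real default partitioner uses murmur2).
function partitionFor(patientId: string, numPartitions: number): number {
  let hash = 0;
  for (const ch of patientId) {
    hash = (hash * 31 + ch.charCodeAt(0)) | 0; // 32-bit rolling hash
  }
  return Math.abs(hash) % numPartitions;
}
```

A "create" followed by an "update" for the same patient therefore always reaches the same partition, in order, regardless of how many partitions the topic has.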

DocumentDB for FHIR Resources

FHIR resources are deeply nested JSON documents that don't map cleanly to relational tables. The system stores canonical FHIR JSON in DocumentDB while maintaining searchable indexes in PostgreSQL. This dual-storage approach avoids constant transformation between FHIR JSON and relational schemas.
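The projection from canonical FHIR JSON to a searchable row might look like the sketch below (the row shape and field choices are assumptions for illustration; the production index schema will differ):

```typescript
// Sketch of the dual-storage projection: the FHIR resource is stored
// verbatim in DocumentDB, and a few fields are extracted into a
// PostgreSQL row used for search. Shapes here are illustrative.
interface PatientIndexRow {
  fhirId: string;
  familyName: string;
  givenName: string;
  birthDate: string;
}

function indexRowFor(patient: any): PatientIndexRow {
  // FHIR Patient.name is an array of HumanName; take the first entry.
  const name = (patient.name ?? [])[0] ?? {};
  return {
    fhirId: patient.id,
    familyName: name.family ?? "",
    givenName: (name.given ?? []).join(" "),
    birthDate: patient.birthDate ?? "",
  };
}
```

Searches hit the flat PostgreSQL row; once a match is found, the untouched canonical JSON is fetched from DocumentDB, so no lossy round-trip through a relational schema ever occurs.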

Offline-First Design

The system is designed assuming the client will be offline most of the time. This is the defining technical challenge:

  • Every feature works without connectivity - If a feature requires real-time server communication, it's either redesigned or explicitly marked as online-only
  • Local storage is the source of truth - When offline, the app reads from and writes to IndexedDB. Users don't see spinners or error states
  • Background incremental sync - When connectivity appears, the app syncs in the background without interrupting workflow, only syncing what changed
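The three principles above can be sketched as a small local-first store. An in-memory Map stands in for IndexedDB here (the real client uses the IndexedDB API); the class names are illustrative, not the actual implementation:

```typescript
// Minimal sketch of the offline write path: writes always land locally
// first and are queued; reads never touch the network; sync() drains
// only the queued (changed) records — the "incremental" part.
type LocalRecord = { id: string; data: unknown; updatedAt: number };

class OfflineStore {
  private local = new Map<string, LocalRecord>(); // stand-in for IndexedDB
  private pending: string[] = [];                 // ids awaiting upload

  write(id: string, data: unknown): void {
    this.local.set(id, { id, data, updatedAt: Date.now() });
    if (!this.pending.includes(id)) this.pending.push(id);
  }

  read(id: string): LocalRecord | undefined {
    return this.local.get(id); // local read: no spinner, no error state
  }

  // Called when connectivity appears; uploads only what changed.
  sync(upload: (r: LocalRecord) => void): number {
    const drained = this.pending.splice(0);
    for (const id of drained) upload(this.local.get(id)!);
    return drained.length;
  }
}
```

Note that a record written twice while offline is uploaded once, with its latest value: the pending queue tracks ids, not individual writes.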

Conflict Resolution Strategy

Conflicts are inevitable when multiple users edit the same record while offline. The system uses a tiered resolution approach:

Automatic resolution (most cases):

  • Different fields edited → merge both changes
  • Same field, same value → no conflict
  • Metadata changes → server always wins
  • User role hierarchy → clinician > nurse > field officer > data clerk

User resolution (when automatic resolution fails):

  • Same field, different values, same role level
  • Deletions conflicting with edits
  • Any conflict involving diagnosis or treatment data
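The tiers above can be expressed as a resolution function. The sketch below treats each offline edit as a set of changed fields and applies the automatic rules in order, escalating to user resolution when none applies (field and role names here are assumptions for illustration):

```typescript
// Illustrative tiered conflict resolution: merge disjoint field edits,
// accept identical values, break ties by role hierarchy, and escalate
// same-role disagreements to a human.
const ROLE_RANK: Record<string, number> = {
  clinician: 3, nurse: 2, field_officer: 1, data_clerk: 0,
};

type Edit = { fields: Record<string, string>; role: string };
type Resolution =
  | { status: "merged"; fields: Record<string, string> }
  | { status: "escalate"; field: string };

function resolve(base: Record<string, string>, a: Edit, b: Edit): Resolution {
  const merged = { ...base };
  for (const key of new Set([...Object.keys(a.fields), ...Object.keys(b.fields)])) {
    const av = a.fields[key], bv = b.fields[key];
    if (av !== undefined && bv === undefined) { merged[key] = av; continue; } // only A edited
    if (bv !== undefined && av === undefined) { merged[key] = bv; continue; } // only B edited
    if (av === bv) { merged[key] = av!; continue; }          // same field, same value
    const ra = ROLE_RANK[a.role] ?? -1, rb = ROLE_RANK[b.role] ?? -1;
    if (ra !== rb) { merged[key] = ra > rb ? av! : bv!; continue; } // role hierarchy
    return { status: "escalate", field: key };               // same role level: ask a human
  }
  return { status: "merged", fields: merged };
}
```

A real implementation would also short-circuit to user resolution for deletions and for any diagnosis or treatment field, per the rules above.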

Multi-Tenancy Architecture

The system supports multiple organizations on shared infrastructure with strict data isolation. A ministry of health must never see an NGO's data, even if both operate in the same country.

Approach: Separate Schemas

  • Each tenant gets their own PostgreSQL schema with identical table structures
  • Strong isolation - a bug in a query can't accidentally leak data across schemas
  • Operational costs remain reasonable compared to separate databases per tenant
  • For large tenants or strict data residency requirements, dedicated databases can be provisioned

Authentication flow extracts tenant_id from JWT tokens, and all database connections are scoped to that tenant's schema. Row-level security policies act as a defense-in-depth safety net.
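The tenant-scoping step might look like the sketch below (the claim name `tenant_id` comes from the flow above; everything else is an assumption). A JWT's payload is base64url-encoded JSON; signature verification is omitted here and must of course happen before any claim is trusted:

```typescript
// Sketch: derive the PostgreSQL schema for a request from its JWT.
// Every pooled connection for the request is then scoped with
//   SET search_path TO tenant_<id>;
// so queries can only see that tenant's tables.
function tenantSchemaFromJwt(token: string): string {
  const payloadB64 = token.split(".")[1];
  const claims = JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8"));
  const tenantId: string = claims.tenant_id;
  // Whitelist the identifier — schema names are interpolated into SQL.
  if (!/^[a-z0-9_]+$/.test(tenantId)) throw new Error("invalid tenant_id");
  return `tenant_${tenantId}`;
}
```

With row-level security policies underneath, a request that somehow escapes its schema scope still cannot read another tenant's rows.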

External Integrations

DHIS2 Integration

Most African ministries of health use DHIS2 for health information management. The integration:

  • Pushes aggregate data (e.g., "47 children vaccinated in District X this week")
  • Pulls reference data (facility lists, organization units, data element definitions)
  • Handles the reality that every DHIS2 instance is configured differently

An abstraction layer with configurable field mappings per tenant allows manual mapping during onboarding, as DHIS2's flexibility makes full automation impractical.
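A per-tenant mapping might be applied as in the sketch below. The internal indicator names and mapping shape are assumptions; the output follows the DHIS2 Web API's `dataValueSets` payload (`dataElement`, `orgUnit`, `period`, `value`):

```typescript
// Sketch of the configurable mapping layer: translate this system's
// internal aggregate indicators into the tenant's DHIS2 data element
// UIDs before pushing a dataValueSet.
type Dhis2Mapping = Record<string, string>; // internal indicator -> DHIS2 dataElement UID

function toDataValueSet(
  mapping: Dhis2Mapping,
  orgUnit: string,
  period: string, // e.g. "2024W15" for a weekly period
  aggregates: Record<string, number>,
) {
  return {
    dataValues: Object.entries(aggregates)
      .filter(([name]) => mapping[name] !== undefined) // unmapped indicators are skipped
      .map(([name, value]) => ({
        dataElement: mapping[name],
        orgUnit,
        period,
        value: String(value), // DHIS2 expects string values
      })),
  };
}
```

Because the mapping is plain per-tenant configuration, onboarding a new DHIS2 instance means filling in UIDs, not writing code.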

EMR Systems

The system supports multiple EMR integration patterns:

  • FHIR R4 APIs for modern EMRs that expose them
  • HL7 v2 messages for legacy systems (via translation layer)
  • Flat file imports as a last resort

The system acts as a consumer, not a producer. The EMR remains the source of truth for clinical data.
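As a toy illustration of the HL7 v2 translation layer, the sketch below pulls a patient's name (PID-5) and birth date (PID-7) out of a PID segment and emits a minimal FHIR Patient. A real message needs a proper parser (escape sequences, repetitions, configurable encoding characters); this handles only the default delimiters:

```typescript
// Toy HL7 v2 -> FHIR translation: parse a PID segment with default
// delimiters ('|' fields, '^' components) into a minimal FHIR Patient.
function pidToFhirPatient(pidSegment: string) {
  const f = pidSegment.split("|");                             // field separator
  const [family = "", given = ""] = (f[5] ?? "").split("^");   // PID-5: patient name (XPN)
  const dob = f[7] ?? "";                                      // PID-7: date of birth, YYYYMMDD
  return {
    resourceType: "Patient",
    name: [{ family, given: given ? [given] : [] }],
    birthDate: dob ? `${dob.slice(0, 4)}-${dob.slice(4, 6)}-${dob.slice(6, 8)}` : undefined,
  };
}
```

Either way the translated resource is treated as read-only input: the EMR remains the source of truth.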

Security & Compliance

Health data security is designed as a core concern, not an afterthought:

  • Encryption everywhere: AWS KMS-managed keys for databases and S3, TLS 1.3 for all connections, field-level encryption for patient identifiers
  • Comprehensive audit logging: Every data access logged (who, what, when, where) with 7-year retention in immutable CloudWatch Logs
  • Multi-factor authentication: Required for admin and clinician roles
  • Compliance support: HIPAA (US-funded programs), GDPR (EU-funded programs), and local data protection laws with per-tenant configuration
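The field-level encryption of patient identifiers could be sketched as below with AES-256-GCM from Node's built-in `crypto` module. In production the data key would come from AWS KMS (e.g. via GenerateDataKey) rather than living in application memory, and key rotation would be layered on top:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Sketch of field-level encryption for patient identifiers. GCM gives
// authenticated encryption: tampering with the stored value makes
// decryption fail rather than return garbage.
function encryptField(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, unique per value
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store iv + auth tag + ciphertext together, base64-encoded.
  return Buffer.concat([iv, cipher.getAuthTag(), ct]).toString("base64");
}

function decryptField(encoded: string, key: Buffer): string {
  const buf = Buffer.from(encoded, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28); // GCM auth tag is 16 bytes
  const ct = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}
```

Encrypting identifiers at the field level means even a dump of the database or a misconfigured replica exposes only ciphertext for the most sensitive columns.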

Performance & Scalability

The system is designed to scale from pilot deployments (50 users) to national deployments (100,000+ users):

  • Auto-scaling: AWS ECS with automatic capacity increases during load spikes (e.g., Monday mornings when field workers sync weekend data)
  • Database scaling: PostgreSQL read replicas, DynamoDB on-demand capacity, Redis cluster mode
  • Performance targets: API response time (p95) <500ms, sync initiation <2 seconds, dashboard load <3 seconds

Implementation Approach

The system will be built in five phases, each delivering usable functionality:

  1. Phase 1: Foundation - Core infrastructure, authentication, basic health data storage
  2. Phase 2: Field Operations - Event tracking, GIS, notifications, web frontend
  3. Phase 3: Offline & Mobile - Sync service, mobile app, offline support (the hardest phase)
  4. Phase 4: Integrations - DHIS2, EMR connections
  5. Phase 5: Polish & Scale - Analytics, performance optimization, security hardening

Total effort is estimated at 50-65 person-months, with a team of 4-5 engineers for core development, plus domain expertise and compliance support.

Complete System Design Document

The full system design document (18 sections, covering all aspects from architecture to operations) is embedded below.
