USSD at Scale: Powering Healthcare Access Across Africa

USSD (Unstructured Supplementary Service Data) is one of the most practical ways to deliver digital healthcare services in places where smartphones, mobile data, and reliable internet access aren’t always available. Across Africa, it serves as the backbone for everyday healthcare interactions on simple feature phones: booking appointments, receiving follow-ups, checking insurance eligibility, and completing basic health screenings.

From the user’s side, USSD feels straightforward, but behind the scenes, building systems that handle these interactions at scale is anything but simple. Sessions expire quickly, responses must be almost immediate, and the system has to deal with unreliable network behavior coming from telecom operators. A small architectural mistake can easily translate into dropped sessions, incomplete flows, or a system that works fine in testing but struggles when traffic increases.

This article focuses on the architectural thinking behind building high-volume USSD systems for healthcare scenarios, and on the practical engineering patterns that make these systems more resilient and easier to maintain in production.

Understanding the Core Constraints of USSD

Before discussing architecture, it’s important to understand the constraints that shape every technical decision in USSD systems:

Session-based interactions: USSD sessions are stateful but short-lived, often expiring within seconds if responses are delayed.
Strict latency requirements: Any delay in backend responses can terminate an active session.
High concurrency: National or regional rollouts can result in tens or hundreds of thousands of concurrent sessions.
Unreliable networks: Network interruptions, retries, and duplicate requests are common.
No client-side persistence: Feature phones do not store application state, so every piece of session data must be tracked on the server side.

These constraints demand an architecture that prioritizes speed, fault tolerance, and horizontal scalability, and that keeps slow, blocking backend calls out of the path between a user’s input and the next screen.

High-Level Architecture Overview

At a high level, the system is composed of:

USSD Gateway Layer
Flow Orchestration Layer
Application Backend
Session Manager
Event & State Layer
Observability & Error Tracking

Each component has a clearly defined responsibility. This prevents tight coupling and allows different layers of the system to scale independently.

Figure: High-level architecture of a scalable, event-driven USSD platform

USSD Gateway Layer

The gateway acts as the entry point into the system. It manages session lifecycle events, translates telecom protocols into something the backend can understand, and sends responses back to the user.

From an architectural standpoint, the backend never assumes delivery guarantees from the gateway. Every request is treated as potentially duplicated, delayed, or retried.

This assumption drives downstream design decisions such as:

Idempotent handlers
Session reconciliation
Defensive state management
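
As an illustration of the first item, here is a minimal sketch of an idempotent handler. It assumes the gateway supplies a session ID and a per-session sequence number with each request; real gateways differ, and the class name, the key shape, and the in-memory dictionary (standing in for a shared store) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class IdempotentHandler:
    """Runs the side effects for each (session_id, sequence) pair at most once.

    The dict below stands in for a shared external store; it is an
    assumption made so the sketch runs on its own.
    """

    _responses: Dict[Tuple[str, int], str] = field(default_factory=dict)
    processed_count: int = 0

    def handle(self, session_id: str, sequence: int, user_input: str) -> str:
        key = (session_id, sequence)
        # A duplicated or retried request gets the cached response back
        # instead of triggering the business logic a second time.
        if key in self._responses:
            return self._responses[key]
        self.processed_count += 1
        response = f"You entered: {user_input}"
        self._responses[key] = response
        return response
```

Calling `handle("s1", 1, "2")` twice returns the same response and increments `processed_count` only once, which is the property that makes gateway retries safe.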

Flow Orchestration Layer

In scalable USSD platforms, interaction flows should not be tightly coupled to backend application logic. Instead of hard-coding menus and decision paths directly into services, a dedicated flow orchestration layer is introduced to manage user interaction sequencing.

The orchestration engine is responsible for:

Service selection
Question and menu sequencing
Conditional branching and navigation
Input validation and session progression
Overall control of the user journey

Separating interaction flows from backend business logic becomes increasingly important as deployments grow in size and complexity. This architectural boundary allows product and operations teams to modify user journeys without requiring changes to core application services. It also prevents complex backend processes from introducing latency into active USSD sessions.
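
One way to picture this boundary is a flow that lives in data rather than in service code. The sketch below is an assumption-laden example, not a prescribed format: the `FLOW` structure, the step names, and the `advance` function are invented for illustration.

```python
from typing import Tuple

# Hypothetical flow definition. Menu steps map valid inputs to the next
# step; free-text steps name a single "next" step.
FLOW = {
    "start": {
        "prompt": "1. Book appointment\n2. Check insurance",
        "options": {"1": "booking_date", "2": "insurance_id"},
    },
    "booking_date": {"prompt": "Enter preferred date (DDMM):", "next": "done"},
    "insurance_id": {"prompt": "Enter member number:", "next": "done"},
    "done": {"prompt": "Thank you. Your request has been received.", "next": None},
}


def advance(step: str, user_input: str) -> Tuple[str, str]:
    """Return (next_step, prompt_to_show) for the given step and input."""
    node = FLOW[step]
    text = user_input.strip()
    if "options" in node:
        next_step = node["options"].get(text)
        if next_step is None:
            # Invalid menu choice: stay on the same step and re-prompt.
            return step, "Invalid choice.\n" + node["prompt"]
    else:
        next_step = node["next"]
        if next_step is None or not text:
            # Terminal step or empty input: repeat the current prompt.
            return step, node["prompt"]
    return next_step, FLOW[next_step]["prompt"]
```

Because `advance` only reads `FLOW`, a menu can be reordered or a branch added by editing the definition, without touching the services that process the collected answers.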

When backend processing or data persistence is required, the orchestration layer emits events to downstream services asynchronously rather than performing blocking synchronous calls during the USSD session.

This design approach delivers several operational benefits:

Faster USSD response times
Minimal synchronous dependencies during sessions
Reduced coupling between interaction flows and backend services
Improved system maintainability and operational flexibility

While this separation introduces additional architectural components, the long-term gains in scalability, reliability, and maintainability typically outweigh the added complexity in high-volume USSD environments.

Event & State Layer

One of the most effective ways to keep USSD systems responsive is to avoid performing heavy business logic during active sessions. Instead, user interactions can trigger asynchronous events that downstream services process independently.

In this model, the system publishes interaction events to a messaging layer. Multiple domain services subscribe and process them. Shared state is updated asynchronously while the user receives a minimal response that keeps the session alive.
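
The following sketch shows the shape of that model with an in-process stand-in for the messaging layer. `EventBus`, the topic name `screening.answered`, and `handle_screening_answer` are hypothetical names; a real deployment would use an actual broker and a separate worker process instead of the `drain` call.

```python
import queue
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """In-process stand-in for a messaging layer (not a real broker)."""

    def __init__(self) -> None:
        self._queue = queue.Queue()
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Enqueue only: no subscriber code runs on the caller's path.
        self._queue.put((topic, event))

    def drain(self) -> int:
        """Deliver all queued events and return the number of deliveries."""
        delivered = 0
        while True:
            try:
                topic, event = self._queue.get_nowait()
            except queue.Empty:
                return delivered
            for handler in self._subscribers[topic]:
                handler(event)
                delivered += 1


def handle_screening_answer(bus: EventBus, session_id: str, answer: str) -> str:
    """Publish the interaction and return a minimal response right away."""
    bus.publish("screening.answered", {"session_id": session_id, "answer": answer})
    return "Answer recorded."
```

The user-facing function returns as soon as the event is queued; subscribers see the event only when the worker side drains the queue, so slow downstream processing never sits between the user's input and the next screen.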

Different technologies can support this approach. Some teams use in-memory messaging layers for speed, while others rely on distributed queues or streaming platforms. Each option has trade-offs around durability, ordering guarantees, and operational complexity.

Despite these trade-offs, event-driven processing makes it easier to scale horizontally and prevents individual services from becoming bottlenecks.

Backend Services

Backend services are designed to be stateless and horizontally scalable:

No session data is stored in memory
All session state is persisted externally in Redis
Handlers are idempotent to safely process duplicate events
Timeouts are handled explicitly

This makes it possible to:

Scale services during traffic spikes
Redeploy or replace instances without losing session continuity
Reliably handle millions of USSD interactions
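
Explicit timeout handling can be as simple as giving every backend call a deadline and a fallback response. The sketch below is one possible approach using Python's standard `concurrent.futures`; the function name `call_with_deadline`, the pool size, and the fallback text are assumptions for illustration.

```python
import concurrent.futures
import time

# Shared worker pool; the size of 4 is arbitrary for this sketch.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_deadline(func, deadline_seconds: float, fallback: str) -> str:
    """Run func in a worker thread and wait at most deadline_seconds.

    If the deadline passes, return the fallback so the USSD session still
    gets a timely reply. The worker thread is not cancelled; only the wait
    is abandoned. Exceptions raised by func propagate to the caller.
    """
    future = _executor.submit(func)
    try:
        return future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError:
        return fallback
```

For example, wrapping a slow eligibility lookup in `call_with_deadline(lookup, 0.5, "Please try again shortly.")` bounds how long the session waits, at the cost of sometimes showing the fallback while the lookup finishes in the background.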

Handling Scale, Timeouts, and Failures

Session Timeouts

USSD sessions frequently expire mid-flow. To handle this:

Partial progress is saved in Redis after each interaction
Users can resume from their last completed step
Backend logic tolerates incomplete or missing data
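
The resume behavior can be sketched as follows. An in-memory class stands in for the external store with per-key expiry, and progress is keyed by the caller's phone number on the assumption that a new USSD session arrives with a new session ID. `SessionStore`, `entry_step`, and the ten-minute window in the usage note are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory stand-in for an external store with per-key expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[float, dict]] = {}

    def save_progress(self, msisdn: str, step: str, answers: dict) -> None:
        # Each save overwrites earlier progress and restarts the expiry window.
        expires_at = self._clock() + self._ttl
        self._data[msisdn] = (expires_at, {"step": step, "answers": dict(answers)})

    def load_progress(self, msisdn: str) -> Optional[dict]:
        entry = self._data.get(msisdn)
        if entry is None:
            return None
        expires_at, progress = entry
        if self._clock() >= expires_at:
            # Progress is too old to resume; drop it.
            del self._data[msisdn]
            return None
        return progress


def entry_step(store: SessionStore, msisdn: str) -> str:
    """Choose the step a new USSD session should open on for this caller."""
    progress = store.load_progress(msisdn)
    return progress["step"] if progress else "start"
```

With `SessionStore(ttl_seconds=600)`, a caller whose session dropped on the date prompt and who dials back within ten minutes lands on that prompt again; after the window closes, the flow starts over.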

High Concurrency

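Redis Pub/Sub lets the system fan out thousands of concurrent events with minimal overhead, while backend consumers scale horizontally based on demand. Because Pub/Sub delivery is at-most-once (a message published while a subscriber is disconnected is not redelivered), events that must not be lost need a durable channel, such as Redis Streams or a persistent queue.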

Fault Isolation

Failures in one downstream service (for example, appointment booking) do not collapse the entire flow. The flow orchestrator continues managing the session while backend services retry or flag failures asynchronously.
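
A minimal sketch of the consumer side of this pattern is shown below. The function name `process_with_retry` and the plain list used as a dead-letter store are assumptions; a production consumer would also add backoff between attempts, which is omitted here to keep the example short.

```python
from typing import Callable, List


def process_with_retry(event: dict,
                       handler: Callable[[dict], None],
                       max_attempts: int,
                       dead_letters: List[dict]) -> bool:
    """Try the handler up to max_attempts times; flag the event on failure.

    Returns True on success. On repeated failure the event is recorded in
    dead_letters for later inspection instead of raising, so one failing
    service does not interrupt the rest of the pipeline.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad on purpose: isolate any handler fault
            last_error = exc
    dead_letters.append(
        {"event": event, "error": repr(last_error), "attempts": max_attempts}
    )
    return False
```

The orchestrator never calls this function directly; it runs in the backend consumer, which is why a booking outage shows up as flagged events rather than as a broken session.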

Observability & Error Tracking

In distributed, high-volume systems, failures are inevitable. What matters is how quickly they are detected and diagnosed, so engineers need visibility into latency trends, session drop rates, and asynchronous processing failures.

Structured logging helps trace issues across distributed services. Distributed tracing allows teams to follow a user interaction through multiple asynchronous boundaries. Real-time alerts provide early warnings when performance starts to degrade.

Together, these practices help teams identify systemic issues early and improve reliability over time.
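
As a small illustration of structured logging, the sketch below emits one JSON object per line and always includes the session ID, so that log lines from different services can be joined on it. The function name `log_event` and the field names are hypothetical, and printing to standard output is a simplification.

```python
import json
import time


def log_event(name: str, session_id: str, **fields) -> str:
    """Emit one JSON log line carrying the session ID as a correlation key."""
    record = {"event": name, "session_id": session_id, "ts": time.time(), **fields}
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line
```

A call such as `log_event("ussd.response_sent", "s1", step="start", latency_ms=42)` produces a line that a log pipeline can filter by `session_id` or aggregate by `latency_ms` without parsing free text.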

Why This Architecture Scales

This architecture scales by combining:

Asynchronous, event-driven processing
Stateless backend services
Loosely coupled flow orchestration
Explicit failure and retry handling
Built-in operational visibility

Rather than treating USSD as a simple menu-driven channel, this design approaches it as a distributed systems problem, subject to the same rigor as high-volume web or fintech platforms.

Closing Thoughts

USSD continues to play an important role in expanding access to healthcare services in low-connectivity regions. Building reliable platforms for this channel requires more than basic menu design. It involves thoughtful system architecture, careful handling of session state, and a realistic understanding of telecom infrastructure constraints.

The patterns described here are technology-agnostic and can be implemented with different tools depending on team preferences and operational context. What matters most is designing systems that remain responsive under load and continue working even when networks behave unpredictably.

In many ways, USSD reminds engineers that impactful technology is not always about cutting-edge tools. Sometimes it is about building dependable systems that meet people where they are and continue working under real-world conditions.