USSD (Unstructured Supplementary Service Data) is one of the most practical ways to deliver digital healthcare services in places where smartphones, mobile data, and reliable internet access aren’t always readily available. Across Africa, it serves as the backbone for everyday healthcare interactions: booking appointments, receiving follow-ups, checking insurance eligibility, and completing basic health screenings on simple feature phones.
From the user’s side, USSD feels straightforward, but behind the scenes, building systems that handle these interactions at scale is anything but simple. Sessions expire quickly, responses must be almost immediate, and the system has to deal with unreliable network behavior coming from telecom operators. A small architectural mistake can easily translate into dropped sessions, incomplete flows, or a system that works fine in testing but struggles when traffic increases.
This article focuses on the architectural thinking behind building high-volume USSD systems for healthcare scenarios, and on the practical engineering patterns that make these systems more resilient and easier to maintain in production.
Understanding the Core Constraints of USSD
Before discussing architecture, it’s important to understand the constraints that shape every technical decision in USSD systems:
- Session-based interactions: USSD sessions are stateful but short-lived, often expiring within seconds if responses are delayed.
- Strict latency requirements: Any delay in backend responses can terminate an active session.
- High concurrency: National or regional rollouts can result in tens or hundreds of thousands of concurrent sessions.
- Unreliable networks: Network interruptions, retries, and duplicate requests are common.
- No client-side persistence: Feature phones do not store application state, so every piece of session data must be tracked on the server side.
These constraints demand an architecture that prioritizes speed, fault tolerance, and horizontal scalability rather than traditional synchronous request-response patterns.
High-Level Architecture Overview
At a high level, the system is composed of:
- USSD Gateway Layer
- Flow Orchestration Layer
- Application Backend
- Session Manager
- Event & State Layer
- Observability & Error Tracking
Each component has a clearly defined responsibility. This prevents tight coupling and allows different layers of the system to scale independently.

USSD Gateway Layer
The gateway acts as the entry point into the system. It manages session lifecycle events, translates telecom protocols into something the backend can understand, and sends responses back to the user.
From an architectural standpoint, the backend never assumes delivery guarantees from the gateway. Every request is treated as potentially duplicated, delayed, or retried.
This assumption drives downstream design decisions such as:
- Idempotent handlers
- Session reconciliation
- Defensive state management
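To make the idea of an idempotent handler concrete, here is a minimal sketch. It assumes the gateway supplies a session ID and a per-request sequence number; both field names are hypothetical, and real gateway payloads differ between operators and aggregators. An in-memory dictionary stands in for the external store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UssdRequest:
    session_id: str   # hypothetical: session identifier from the gateway
    sequence: int     # hypothetical: position of this request in the session
    user_input: str


class IdempotentHandler:
    """Replays the cached response when the same request arrives twice."""

    def __init__(self):
        # Stand-in for an external store, keyed by (session_id, sequence).
        self._responses = {}
        self.processed_count = 0

    def handle(self, request: UssdRequest) -> str:
        key = (request.session_id, request.sequence)
        if key in self._responses:
            # Duplicate or retried delivery: return the earlier answer
            # without re-running any side effects.
            return self._responses[key]
        self.processed_count += 1
        response = f"You entered {request.user_input}"
        self._responses[key] = response
        return response
```

With this shape, a retried request produces the same reply as the first delivery, and the work behind it runs only once.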
Flow Orchestration Layer
In scalable USSD platforms, interaction flows should not be tightly coupled to backend application logic. Instead of hard-coding menus and decision paths directly into services, a dedicated flow orchestration layer is introduced to manage user interaction sequencing.
The orchestration engine is responsible for:
- Service selection
- Question and menu sequencing
- Conditional branching and navigation
- Input validation and session progression
- Overall control of the user journey
Separating interaction flows from backend business logic becomes increasingly important as deployments grow in size and complexity. This architectural boundary allows product and operations teams to modify user journeys without requiring changes to core application services. It also prevents complex backend processes from introducing latency into active USSD sessions.
When backend processing or data persistence is required, the orchestration layer emits events to downstream services asynchronously rather than performing blocking synchronous calls during the USSD session.
This design approach delivers several operational benefits:
- Faster USSD response times
- Minimal synchronous dependencies during sessions
- Reduced coupling between interaction flows and backend services
- Improved system maintainability and operational flexibility
While this separation introduces additional architectural components, the long-term gains in scalability, reliability, and maintainability typically outweigh the added complexity in high-volume USSD environments.
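One way to picture this separation is a flow expressed as data rather than code. The sketch below is illustrative only: the step names, prompts, and the `advance` function are invented for this example, and a production engine would typically load the flow from configuration rather than a module-level dictionary.

```python
# Each step has a prompt and a mapping from user input to the next step.
# Steps with no options end the flow.
FLOW = {
    "start": {
        "prompt": "1. Book appointment\n2. Check eligibility",
        "options": {"1": "pick_day", "2": "eligibility"},
    },
    "pick_day": {
        "prompt": "1. Today\n2. Tomorrow",
        "options": {"1": "confirm", "2": "confirm"},
    },
    "eligibility": {"prompt": "Your request was received.", "options": {}},
    "confirm": {"prompt": "Your request was received.", "options": {}},
}


def advance(step: str, user_input: str) -> tuple[str, str]:
    """Return (next_step, prompt). Invalid input re-shows the current prompt."""
    node = FLOW[step]
    next_step = node["options"].get(user_input.strip())
    if next_step is None:
        return step, "Invalid choice.\n" + node["prompt"]
    return next_step, FLOW[next_step]["prompt"]
```

Because the journey lives in `FLOW`, changing a menu or adding a branch means editing data, not the services that book appointments or check eligibility.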
Event & State Layer
One of the most effective ways to keep USSD systems responsive is to avoid performing heavy business logic during active sessions. Instead, user interactions can trigger asynchronous events that downstream services process independently.
In this model, the system publishes interaction events to a messaging layer. Multiple domain services subscribe and process them. Shared state is updated asynchronously while the user receives a minimal response that keeps the session alive.
Different technologies can support this approach. Some teams use in-memory messaging layers for speed, while others rely on distributed queues or streaming platforms. Each option has trade-offs around durability, ordering guarantees, and operational complexity.
Despite these trade-offs, event-driven processing makes it easier to scale horizontally and prevents individual services from becoming bottlenecks.
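The publish-and-subscribe model described above can be sketched in a few lines. This is an in-process stand-in, not a real messaging layer: the `EventBus` class and the topic name in the usage note are assumptions for illustration, and it offers no durability or ordering guarantees beyond a single queue.

```python
from collections import defaultdict
from queue import Queue


class EventBus:
    """Tiny in-process bus: publishing only enqueues, delivery happens later."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._pending = Queue()

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # Enqueue only, so the caller can answer the user immediately.
        self._pending.put((topic, payload))

    def drain(self):
        """Deliver queued events to every subscriber of their topic.

        A background worker would call this outside the session path.
        Returns the number of handler invocations.
        """
        delivered = 0
        while not self._pending.empty():
            topic, payload = self._pending.get()
            for handler in self._subscribers[topic]:
                handler(payload)
                delivered += 1
        return delivered
```

For example, two services could both subscribe to a hypothetical `"screening.answered"` topic; the session handler calls `publish` and returns the next menu, while `drain` runs separately and feeds both subscribers.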
Backend Services
Backend services are designed to be stateless and horizontally scalable:
- No session data is stored in memory
- All session state is persisted in an external store such as Redis
- Handlers are idempotent to safely process duplicate events
- Timeouts are handled explicitly
This makes it possible to:
- Scale services during traffic spikes
- Redeploy or replace instances without losing session continuity
- Reliably handle millions of USSD interactions
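Handling timeouts explicitly can be as simple as refusing to wait on a dependency for longer than the session can tolerate. The sketch below assumes a thread pool and a hypothetical `respond_within` helper; the deadline values and the fallback wording are placeholders, since real session limits vary by operator.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
import time

# Shared pool; the size is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def respond_within(deadline_seconds, work, fallback_message):
    """Run work() but never wait longer than deadline_seconds for it."""
    future = _pool.submit(work)
    try:
        return future.result(timeout=deadline_seconds)
    except FutureTimeout:
        # The work keeps running in the background; the user still gets
        # a reply before the session can expire.
        return fallback_message


def slow_lookup():
    time.sleep(0.5)  # simulates a slow dependency
    return "Eligible"
```

Calling `respond_within(0.05, slow_lookup, "Please try again shortly.")` returns the fallback text instead of holding the session open while the lookup finishes.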
Handling Scale, Timeouts, and Failures
Session Timeouts
USSD sessions frequently expire mid-flow. To handle this:
- Partial progress is saved in Redis after each interaction
- Users can resume from their last completed step
- Backend logic tolerates incomplete or missing data
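The save-and-resume behavior above can be sketched as follows. An in-memory class stands in for Redis, and progress is keyed by a user identifier rather than the session ID, on the assumption that a new dial produces a new session. The class name, method names, and TTL value are all illustrative.

```python
import time


class SessionStore:
    """In-memory stand-in for an external store with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}      # user_id -> (expires_at, progress dict)

    def save_step(self, user_id, step, answers):
        """Record the last completed step and the answers collected so far."""
        expires_at = self._clock() + self._ttl
        self._data[user_id] = (expires_at, {"step": step, "answers": dict(answers)})

    def resume(self, user_id):
        """Return saved progress, or None if nothing is saved or it has expired."""
        entry = self._data.get(user_id)
        if entry is None:
            return None
        expires_at, progress = entry
        if self._clock() >= expires_at:
            del self._data[user_id]
            return None
        return progress
```

When a user dials in again, the orchestrator can call `resume` first and either continue from the saved step or start the flow from the beginning if it returns `None`.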
High Concurrency
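A lightweight messaging layer such as Redis Pub/Sub allows the system to process thousands of concurrent events with minimal overhead, while backend consumers scale horizontally based on demand. Because Redis Pub/Sub delivers messages at most once and does not persist them, events that must not be lost are better routed through a durable queue or stream.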
Fault Isolation
Failures in one downstream service (for example, appointment booking) do not collapse the entire flow. The flow orchestrator continues managing the session while backend services retry or flag failures asynchronously.
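A minimal sketch of the retry-or-flag behavior, assuming a hypothetical `process_with_retry` helper and a plain list standing in for a failed-events queue. Backoff between attempts is omitted to keep the example short; a real consumer would normally wait between retries.

```python
def process_with_retry(event, handler, max_attempts, failed_events):
    """Try handler up to max_attempts; park the event if every attempt fails.

    Returns True on success, False if the event was flagged as failed.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = repr(exc)
    # Flag the failure for later inspection instead of raising into the flow.
    failed_events.append(
        {"event": event, "error": last_error, "attempts": max_attempts}
    )
    return False
```

The session path never sees the exception: the consumer either succeeds on a later attempt or records the event for follow-up.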
Observability & Error Tracking
In distributed, high-volume systems, failures are inevitable. What matters is how quickly they are detected and diagnosed. Engineers therefore need visibility into latency trends, session drop rates, and asynchronous processing failures.
Structured logging helps trace issues across distributed services. Distributed tracing allows teams to follow a user interaction through multiple asynchronous boundaries. Real-time alerts provide early warnings when performance starts to degrade.
Together, these practices help teams identify systemic issues early and improve reliability over time.
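As an illustration of structured logging, the sketch below emits one JSON object per log line using Python's standard `logging` module. The field names (`session_id`, `step`, `latency_ms`) and the logger name are assumptions for this example; the point is that every line carries the same identifiers, so entries from different services can be joined on the session.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Values passed via `extra=` become attributes on the record.
            "session_id": getattr(record, "session_id", None),
            "step": getattr(record, "step", None),
            "latency_ms": getattr(record, "latency_ms", None),
        }
        return json.dumps(entry)


def build_logger(stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("ussd.sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

A call such as `logger.info("menu rendered", extra={"session_id": "s1", "step": "start", "latency_ms": 42})` then produces a line that log tooling can filter by session or aggregate by latency.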
Why This Architecture Scales
This architecture scales by combining:
- Asynchronous, event-driven processing
- Stateless backend services
- Loosely coupled flow orchestration
- Explicit failure and retry handling
- Built-in operational visibility
Rather than treating USSD as a simple menu-driven channel, this approach treats it as a distributed systems problem, subject to the same rigor as high-volume web or fintech platforms.
Closing Thoughts
USSD continues to play an important role in expanding access to healthcare services in low-connectivity regions. Building reliable platforms for this channel requires more than basic menu design. It involves thoughtful system architecture, careful handling of session state, and a realistic understanding of telecom infrastructure constraints.
The patterns described here are technology-agnostic and can be implemented with different tools depending on team preferences and operational context. What matters most is designing systems that remain responsive under load and continue working even when networks behave unpredictably.
In many ways, USSD reminds engineers that impactful technology is not always about cutting-edge tools. Sometimes it is about building dependable systems that meet people where they are and continue working under real-world conditions.