What Firmware Execution Patterns Reveal: Detecting Anomalies in EDK2 Using Runtime Heatmaps

Firmware execution is one of the least observable parts of modern computing systems. In UEFI-based systems, particularly in the Pre-EFI Initialization (PEI) and Driver Execution Environment (DXE) phases, critical system components are initialized and executed, yet developers rely on logs and intuition to understand what’s happening during these early stages.

This article explores a different approach: instrumenting EDK2 modules at runtime and visualizing their execution patterns across multiple runs. By constructing heat maps of module activity, it becomes possible to identify anomalies and inefficiencies that would otherwise remain hidden.

Introduction: Firmware Is the Least Observable Layer

Modern systems are built on layers of abstraction, each one adding functionality, flexibility, and complexity. By the time an operating system boots, most developers feel they have the tools and visibility needed to understand system behavior.

However, long before that point, a critical phase of execution has already taken place, one that is largely invisible and often overlooked. Firmware, particularly in UEFI-based systems, operates at one of the most privileged and least observable layers of the stack. During early boot phases such as Pre-EFI Initialization (PEI) and Driver Execution Environment (DXE), foundational system components are initialized, memory is configured, and execution paths are established. Despite its criticality, firmware execution is rarely analyzed beyond basic logging and debugging.

This article explores an alternate approach to understanding firmware behavior. By instrumenting EDK2 modules during execution and gathering runtime data over multiple runs, it becomes possible to build a structured view of system behavior. Visualizing this data as heat maps makes it easier to identify patterns and, more importantly, anomalous execution behavior.

The Observability Problem in UEFI Firmware

Observability in firmware is fundamentally different from observability in higher layers of the software stack. In application and cloud environments, for example, developers have access to rich debugging mechanisms such as distributed tracing, metrics pipelines, structured logging, and real-time monitoring, which make it possible to analyze behavior across systems with a high degree of precision.

In contrast, firmware, particularly UEFI firmware built with EDK2, operates in a constrained and opaque environment. During the PEI and DXE phases, execution is tightly coupled to hardware initialization, memory is limited, and traditional debugging tools are either unavailable or intrusive. As a result, developers often resort to serial logs, debug prints, manual inspection, and intuition to identify what is happening during boot.

While these techniques are useful, they have significant limitations:

  • They provide a linear, single-run view of the execution
  • They are difficult to compare across multiple runs
  • They do not easily surface subtle inconsistencies or non-deterministic behavior

This becomes particularly problematic when dealing with issues that do not manifest as clear failures. Small variations in module execution order, timing differences, or intermittent behavior can accumulate into larger system-level problems, yet remain difficult to detect through conventional means. In practice, firmware debugging often becomes reactive – an issue is identified only after it becomes visible at a higher layer.

What is missing is a way to observe firmware execution as a system across multiple runs, over time, and with enough structure to identify patterns rather than isolated events.

A Different Approach: Instrumenting EDK2 Execution

To make firmware execution observable, it is necessary to move beyond a static understanding of the boot flow and instead capture how it behaves at runtime.

The EDK2 boot process spans multiple phases, including SEC (Security), PEI, DXE, and BDS (Boot Device Selection). While each plays a role in system initialization, the PEI and DXE stages are where the majority of platform setup occurs. During these phases, firmware modules are loaded and executed to initialize hardware components and prepare the system before control is transferred to the operating system.

EDK2 Execution Flow


Despite their importance, these stages are also among the least observable. Much of the execution is abstracted behind firmware interfaces, making it difficult to track how these modules behave in practice. A key observation is that both PEI and DXE operate by loading and executing modules in sequence. Each module completes its initialization before the next is loaded, creating a natural boundary between execution steps and a convenient point at which to introduce lightweight instrumentation.

One practical approach to introducing this instrumentation is to use the serial (COM) port, which is already commonly available in firmware debugging environments. By emitting the name of each module as it is loaded and executed, it becomes possible to reconstruct the sequence of execution externally on a debugging system.
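A minimal sketch of the host side of this mechanism might look as follows. The function name, the line format (one module name per line), and the sample module names are illustrative; a real setup would read lines from an actual serial port (e.g. via pyserial) rather than from an in-memory list.

```python
import time

def capture_trace(line_source, clock=time.monotonic):
    """Pair each module name read from the serial (COM) stream with a
    host-side timestamp relative to the start of capture.
    `line_source` is any iterable of text lines, e.g. a serial port
    wrapped in a line reader (hypothetical setup)."""
    t0 = clock()
    trace = []
    for line in line_source:
        name = line.strip()
        if name:  # skip blank or noisy lines
            trace.append((name, clock() - t0))
    return trace

# Simulated stream standing in for real COM output:
events = capture_trace(["PcdPeim\n", "ReportStatusCodeRouterPei\n"])
```

Because the host attaches the timestamps, no extra timing code is needed inside the firmware itself.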


Mechanism To Log The Modules Being Loaded


Because modules execute sequentially, this approach also allows for indirect measurement of execution characteristics. The time between successive outputs reflects how long a given module takes to complete its initialization, providing insight into execution variability without requiring intrusive instrumentation.
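Since modules run back to back, a rough per-module duration falls out of consecutive timestamps. A sketch of that derivation, with illustrative trace values:

```python
def module_durations(trace):
    """Approximate each module's initialization time as the gap between
    its log line and the next one (modules execute sequentially)."""
    return [(name, trace[i + 1][1] - t)
            for i, (name, t) in enumerate(trace[:-1])]

# Illustrative trace of (module, seconds-since-capture-start) pairs;
# the last module has no successor, so its duration is unknown here.
trace = [("PcdPeim", 0.00), ("DxeIpl", 0.05), ("DxeCore", 0.30)]
```

This gives PcdPeim roughly 0.05 s and DxeIpl roughly 0.25 s under the sample values, without any timing code in the firmware itself.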

However, a single execution trace offers only a limited view. In order to extract meaningful insights, the system must be observed across multiple runs under consistent conditions. By repeating the boot process over many iterations and recording the execution data each time, it becomes possible to build a dataset that captures both consistency and variation in module behavior.

Turning Execution Data into Heat Maps

Once execution data has been collected across multiple runs, the next step is to transform it into a format that can be analyzed effectively. By capturing module names from the COM interface along with their corresponding timestamps, it becomes possible to reconstruct a timeline of firmware execution.

The Expected Graph Obtained By Plotting The COM Data With a Timestamp


Each run produces a sequence of timestamped events, representing the order and timing of module initialization during boot. When viewed individually, these timelines provide a view of how the system initializes over time. However, when execution data from multiple iterations is overlaid, a pattern begins to emerge – most modules consistently appear within a predictable time window. This behavior can be considered as a baseline for normal system execution.
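The predictable window each module occupies can be summarized directly from the overlaid data. A sketch, assuming the same (module, timestamp) trace format as above:

```python
def baseline_windows(runs):
    """For each module, return the (earliest, latest) time at which it
    was observed across runs -- the predictable window described above."""
    windows = {}
    for trace in runs:
        for name, t in trace:
            lo, hi = windows.get(name, (t, t))
            windows[name] = (min(lo, t), max(hi, t))
    return windows
```

A narrow window for a module across many runs is exactly the kind of consistency that the baseline captures.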

Overlaid Plot Of Execution Data Across Multiple Runs


The challenge then is to represent this baseline in a way that captures both consistency and variation across runs. This can be addressed by aggregating the data into a heat map representation. In this model, the vertical axis represents individual modules, while the horizontal axis represents time intervals during execution. Each cell in the grid corresponds to the presence of a module within a given time range. By shading each cell with a gradient, where lighter regions indicate lower frequency and darker regions indicate higher frequency, it becomes possible to visualize how consistently each module appears at a given point in time.
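The aggregation step can be sketched as a frequency grid keyed by (module, time bin). The bin width and trace format are assumptions, and a real implementation would render the counts as color intensity rather than leaving them as raw numbers:

```python
from collections import Counter

def build_heatmap(runs, bin_width=0.1):
    """Aggregate runs into a (module, time-bin) -> frequency grid.
    Higher counts correspond to darker cells in the rendered map."""
    grid = Counter()
    for trace in runs:
        for name, t in trace:
            grid[(name, int(t // bin_width))] += 1
    return grid
```

Each grid entry is one cell of the heat map: how many runs saw that module in that time slice.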

The Aggregated Heat Map


This approach condenses multiple execution traces into a single visual representation, effectively creating a “signature” of the expected system behavior.

What Firmware Execution Patterns Reveal

With a baseline “signature” of expected behavior established, it becomes possible to evaluate how consistently firmware execution aligns with this model. This can be verified by capturing execution data from an independent run, reconstructing its timeline, and overlaying it onto the heat map. In a stable system, the observed execution should align closely with regions of higher intensity within the heat map.

The Independent Run (in Black) Overlaid on the Heat Map


These regions represent the most frequently observed execution patterns across runs, effectively defining the expected behavior of the system. When a new execution trace falls within these regions, it indicates that module initialization is occurring within predictable time windows and in a consistent order. This alignment is a strong indicator of deterministic behavior during early boot phases.

However, the value of this approach becomes more apparent when deviations occur. Once a baseline has been established, any divergence from these high density regions can signal inconsistencies in execution behavior. The next step is therefore to understand how these deviations can be identified and interpreted as anomalies.
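One way to formalize this check is to flag events from the new run that land in low-frequency cells of the baseline grid. The 50% threshold below is an arbitrary illustration, not a recommendation from the original analysis:

```python
def deviation_score(grid, trace, n_runs, bin_width=0.1, threshold=0.5):
    """Return events from a new run that fall in heat-map cells seen in
    fewer than `threshold` of the baseline runs."""
    anomalies = []
    for name, t in trace:
        seen = grid.get((name, int(t // bin_width)), 0)
        if seen / n_runs < threshold:
            anomalies.append((name, t))
    return anomalies
```

An empty result means the run stayed inside the high-density regions; a non-empty result points at specific modules and times worth investigating.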

Detecting Anomalies Across Runs

The effectiveness of this approach becomes most apparent when deviations from baseline behavior are observed. When a new execution trace is overlaid onto the heat map, certain regions may exhibit consistent patterns that deviate from an otherwise uniform progression. Rather than appearing as isolated anomalies, these deviations are often reflected both in individual runs and within the aggregated heat map itself.

In one such case, the execution timeline initially progresses with a relatively uniform slope, indicating consistent module initialization behavior. However, beyond a certain point, the slope begins to decrease, reflecting an increase in execution time. Ideally, this slope would remain consistent throughout, representing stable and predictable initialization.
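A slope break of this kind can also be detected mechanically, for example by flagging the first event whose gap to its successor far exceeds the median of the gaps before it. The factor of 3 and the placeholder module names in the test trace are illustrative:

```python
def find_slope_break(trace, factor=3.0):
    """Return the first module after which the gap to the next event
    exceeds `factor` times the median of all preceding gaps."""
    gaps = [(trace[i + 1][1] - t, name)
            for i, (name, t) in enumerate(trace[:-1])]
    for i in range(1, len(gaps)):
        prior = sorted(g for g, _ in gaps[:i])
        median = prior[len(prior) // 2]
        if median > 0 and gaps[i][0] > factor * median:
            return gaps[i][1]
    return None
```

Applied to a trace where gaps jump from ~0.1 s to ~1 s at a particular module, this returns that module as the point where the slope changes.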

The Decreased Slope Anomaly Detected From the Heat Map Data


What makes this observation significant is that this change in slope is not unique to a single run but is also present in the heat map, indicating that the behavior is consistent across multiple executions. Further analysis reveals that this transition aligns with the introduction of the Tcg2Dxe module. At this point the execution becomes less uniform, suggesting that the module introduces additional latency into the system.

This provides a clear direction for investigation. Instead of treating the behavior as an outlier, it can be understood as a repeatable characteristic of the system, allowing developers to focus on a specific component and its associated impact. A deeper investigation of the code and the EDK2 build configuration leads to the root cause: the introduction of the Tcg2Dxe module re-enables a previously disabled Platform Configuration Database (PCD) setting, specifically the variable measurement PCD. This setting adds processing overhead for every module that follows, increasing latency and producing the observed change in the execution slope.

Engineering Insights from Real-World Firmware Behavior

Analyzing execution behavior across multiple runs reveals an important characteristic of firmware systems: they are often assumed to be deterministic, but in practice their behavior can vary in subtle and non-obvious ways.

One of the key observations from this analysis is that not all deviations in execution are anomalies. Some variations, when consistently observed across runs, represent inherent characteristics of the system rather than isolated issues. These patterns often emerge from implicit dependencies, configuration parameters, or initialization overhead that are not immediately visible through traditional debugging methods.

In the case examined earlier, the introduction of the Tcg2Dxe module consistently resulted in a change in execution slope, indicating increased latency during that phase of initialization. Because this behavior was reflected both in individual runs and in the aggregated heat map, it was identified not as a transient anomaly, but as a repeatable system characteristic.

Further investigation revealed that this behavior was caused by the re-enabling of a previously disabled Platform Configuration Database (PCD) setting, specifically the variable measurement PCD. This change introduced additional processing overhead, leading to increased execution time for that portion of the boot sequence. By isolating this behavior and modifying the configuration, it was possible to directly observe its impact on system performance. Disabling the variable measurement PCD resulted in a measurable reduction in boot time (from 17.5 seconds down to 2 seconds) and restored a more uniform execution profile across runs.

After Disabling the Variable Measurement PCD - The Black Graph Now Follows a More Uniform Slope With a Reduction In Boot Time


This highlights a broader insight: firmware performance is often influenced by configuration-level decisions that may not be immediately apparent during development. Without structured observability, such effects can remain hidden within individual execution traces.

More importantly, this approach demonstrates how firmware behavior can be analyzed not just for correctness, but for consistency and efficiency. By treating execution as data and examining it across runs, developers gain a clearer understanding of how system components interact, where variability is introduced and how performance can be systematically improved. In many cases, the most significant performance issues are not caused by failures, but by consistent inefficiencies that remain hidden without the right form of analysis.

Beyond Debugging: Broader Applications

While the initial motivation for this approach was to improve firmware debugging, its usefulness extends beyond identifying individual issues.

One immediate application is regression detection. By comparing execution heat maps across different firmware builds, it becomes possible to identify changes in behavior that may not surface as functional failures. Even when a system appears to operate correctly, shifts in execution patterns such as increased latency in specific modules or changes in initialization order can indicate unintended side effects introduced during development.
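Comparing builds then reduces to diffing their frequency grids. A sketch, assuming the same (module, time-bin) keying as the heat map itself:

```python
def heatmap_diff(base, new):
    """Cells whose frequency changed between two builds' heat maps.
    Positive values mean the cell became more common in the new build;
    persistent shifts here can flag behavioral changes that never
    show up as functional failures."""
    cells = set(base) | set(new)
    return {c: new.get(c, 0) - base.get(c, 0)
            for c in cells
            if new.get(c, 0) != base.get(c, 0)}
```

A module whose activity moves from one time bin to another shows up as a paired negative/positive entry, even when the boot still succeeds.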

Another important application is performance analysis and optimization. By examining execution patterns across runs, developers can identify modules that consistently introduce delays or variability. Instead of relying on coarse-grained profiling, this approach provides a fine-grained view of how individual components contribute to overall boot time. As demonstrated earlier, this can directly lead to targeted optimizations and measurable improvements in system performance.

There are also potential security implications. Firmware execution is a critical part of the system’s trust chain, and unexpected behavior during early boot phases can signal a misconfiguration or an unintended code path. By establishing a baseline of expected execution behavior, deviations, whether transient or consistent, can serve as indicators for further investigation. While this approach is not a substitute for formal security analysis, it provides an additional layer of visibility into system behavior.

Finally, this methodology opens the door to continuous validation of firmware systems. By integrating execution data collection into automated testing workflows, it becomes possible to monitor changes in behavior across builds and environments. Over time, this can evolve into a form of behavioral profiling, where expected execution patterns are treated as part of the system’s validation criteria.

Taken together, these applications highlight a broader shift – firmware execution can be treated not just as a sequence of events to debug, but as structured data that can be analyzed, compared, and validated over time.

Rethinking Firmware Observability

Firmware has traditionally been treated as one of the least observable layers of the system stack. While higher layers benefit from mature tooling such as tracing, metrics, and structured logging, firmware debugging still relies heavily on linear logs and manual inspection, which creates a gap between what the system is doing and what developers are able to observe.

The approach explored here suggests a shift in perspective. Instead of treating firmware execution as a sequence of events to be inspected after the fact, it can be viewed as structured data that can be collected, aggregated, and analyzed across runs. By focusing on patterns rather than individual traces, it becomes possible to reason about firmware behavior in terms of consistency, variability, and performance characteristics, which enables a more proactive approach to debugging and validation, where potential issues can be identified before they manifest as failures.

More importantly, this shifts firmware from being an opaque layer to one that can be systematically understood. As systems continue to increase in complexity, improving observability at the firmware level will be essential for ensuring reliability, performance and security.

Conclusion: From Traces to Insight

Firmware execution is often treated as a black box, something that initializes the system and is rarely examined beyond basic logging. As a result, many aspects of its behavior remain hidden, surfacing only when they lead to visible failures or performance issues.

In this work, we explored an alternative approach – instrumenting firmware execution, collecting data across multiple runs, and transforming that data into a structured representation using heat maps. This makes it possible to move beyond individual traces and instead analyze execution behavior in terms of patterns, consistency, and variation.

Through this process, it becomes possible not only to identify anomalies but also to uncover consistent inefficiencies and hidden dependencies within the system. As demonstrated, even configuration-level changes, such as the reactivation of a platform configuration parameter, can have a measurable impact on execution behavior and overall system performance.

More importantly, this approach shifts the role of observability in firmware development. Rather than relying solely on reactive debugging, developers can begin to reason about execution behavior proactively, using data-driven methods to guide investigation and optimization.

Firmware is not simply a layer beneath the operating system; it is a system in its own right. By treating its execution as something that can be measured and analyzed, we move from isolated traces to actionable insight, enabling a deeper understanding of how systems behave during the earliest stages.
