Reflection-Driven Development in Pure C: Eliminating Boilerplate at Scale

Nobody starts writing a game engine in pure C without a good reason. It gives you total control over memory management, predictable performance, fast build times, and clear abstractions that are often missing in C++ projects with too many templates. But this control comes at a high cost: your own time, spent writing a great deal of boilerplate for both the engine and the gameplay code.

The moment your engine needs an Entity Component System (ECS), networking code, UI inspectors, and scene serialization, your daily workflow turns into a repetitive loop. Whenever you create a new component structure, you have to do all of this:

  1. Write a specific my_component_serialize function.
  2. Write a matching my_component_deserialize function.
  3. Write the necessary GUI rendering code to display these fields in a UI inspector.
  4. Manually register the size, memory alignment, and unique ID of the component within the ECS factory.

Did you add just one new field to your component? Now you have to go and update the code across four completely different files. This breaks the core DRY (Don’t Repeat Yourself) principle, which quickly leads to simple copy-paste mistakes, mismatched data, and annoying bugs.

If you write in C# or Java, runtime reflection solves this out of the box. Rust developers use procedural macros and the serde framework to generate the necessary code at compile time. C, however, provides absolutely no compile-time type information, making native runtime reflection impossible. But if the compiler won’t extract this metadata for us, we can generate it ourselves. This is the core idea behind what I call Reflection Driven Development (RDD).

The concept is strictly practical. We place custom macro tags on our regular C structs. Before the main compiler even starts, a standalone tool reads those source files. It parses our tags and automatically writes out all the repetitive C code and runtime metadata structures that we normally write by hand.

In this post, I will walk through building this exact setup. We are going to write a custom parser in C using libclang. This gives us full access to the Abstract Syntax Tree (AST) without dragging in massive C++ dependencies. Once this pipeline is in place, adding new engine features stops being a copy-paste nightmare.


1. Syntax: Creating a Custom Macro DSL

I started building this engine using the C11 standard. While the new C23 standard finally introduces native attribute syntax, adopting it right now is simply not practical: my goal is to support the widest possible range of platforms and older compilers, so I cannot rely on bleeding-edge compiler features just yet. This means there are no native attributes available, and I am strictly limited to standard preprocessor macros. But since I am using libclang to power my custom parser, I can take advantage of a highly useful compiler extension: __attribute__((annotate("..."))).

This built-in feature is exactly what I need. It makes it possible to attach arbitrary string data directly to a struct or a variable node inside the Abstract Syntax Tree (AST). By wrapping this attribute inside a standard #define macro, I can keep the codebase completely valid. When compiling the actual game using GCC, MSVC, or standard Clang, I simply define the macro as empty, and the compiler ignores it entirely. However, when the standalone code generator scans the file, it reads the macro and captures the full string annotation.

Using this exact mechanism, I put together a simple, macro-based DSL (Domain Specific Language). This provides a clean syntax to tag engine structures and describe the runtime properties of their internal fields:

#ifdef REFLECTION_PASS
    #define ecs_component            __attribute__((annotate("ecs_component")))
    #define serializable             __attribute__((annotate("serializable")))
    #define non_serialized           __attribute__((annotate("non_serialized")))
    #define serialize_as(T)          __attribute__((annotate("@serialize_as " #T)))
    #define ui_limit(x, y)           __attribute__((annotate("@ui_limit " #x "," #y)))
    #define ui_slider(x, y)          __attribute__((annotate("@ui_slider " #x "," #y)))
    #define ui_tooltip(text)         __attribute__((annotate("@ui_tooltip " #text)))
    // ... 
#else
    #define ecs_component
    // ... and so on for the rest of the attributes
#endif

The core trick here relies on the standard C preprocessor stringification operator #. This operator takes any raw argument passed to the macro and converts it directly into a string literal. So, if I write ui_slider(1, 8), inside a struct, the preprocessor evaluates it and outputs __attribute__((annotate("@ui_slider 1,8"))).

I specifically added the @ symbol as a custom prefix marker. When my standalone parser scans the AST later, this character helps it quickly separate complex parameterized annotations from basic flags.

Thanks to this macro setup, adding new features to the engine is now an entirely declarative process. This approach goes far beyond just defining data for ECS components. It works just as well for UI logic, such as automatically binding specific C functions directly to elements in the editor’s context menu. Instead of writing manual registration boilerplate for every single system, I simply tag the relevant structs or functions, and it looks like this:

typedef struct ecs_component PhysicsSample {
    ui_slider(1, 8) int bounciness_level;
    float friction;

    non_serialized Manifold* collision_manifold;
    serialize_as(float) double mass;
} PhysicsSample;

context_menu_item("Create/Game Object") static void
spawn_new_game_object(EngineContext* ctx)
{
    // ...
}

2. Tooling: Why libclang Instead of a Compiler Plugin?

To extract reliable AST (Abstract Syntax Tree) data from the engine’s source files, I needed a proper, standard-compliant C frontend. At first, I took what seemed like the most elegant route: I wrote a custom Clang plugin in C++. The theory was great. The parsing and code generation would happen directly inside the main compilation pipeline, meaning I wouldn’t need to manage any external build steps.

However, tying my entire code generation process to a specific compiler completely contradicted my core goal of keeping the engine platform-agnostic. While Clang plugins are relatively easy to set up on Linux or macOS, dealing with them on Windows is incredibly frustrating. To build the plugin for Windows, I had to download and compile the entire LLVM source tree from scratch. If I kept this architecture, anyone trying to build the engine on Windows would have to go through the same massive and time-consuming process.

Furthermore, relying on a Clang plugin meant I was essentially forcing the use of Clang as the primary compiler everywhere, effectively abandoning native GCC and MSVC support. On top of these deployment issues, the internal LLVM C++ API does not guarantee any backward compatibility. Every time I updated my local compiler version, the internal API changed, the plugin stopped compiling, and I had to rewrite the integration.

I eventually threw out that entire C++ codebase. Instead, I rewrote the generator as a standalone command-line tool in pure C using libclang (the official C API for Clang). This approach solves the cross-platform and maintenance problems entirely. libclang provides a stable C ABI, meaning the functions do not change between compiler updates. It also links dynamically without requiring a full LLVM build. Most importantly, this architecture completely decouples the code generator from the main build process. The standalone tool acts only as a parser. Once it generates the necessary C metadata, I can compile the actual game engine using MSVC, GCC, or whatever compiler fits the target platform.


3. Parser Architecture: Keeping the Tooling in Pure C

Since the engine itself is written in pure C, I decided to keep the tooling infrastructure strictly within the same ecosystem.

At the core of the generator is the ParsingContext structure. This struct explicitly manages all memory buffers and data registries for the entire parsing pass:

typedef struct ParsingContext
{
    HTrie annotations_registry; // Prefix tree (Trie) for attribute lookups
    HTrie types_registry;       // Registry of all discovered types

    Span(TypeInfo) structs;     // Restricts metadata collection to structs (unions unsupported)
    Span(uint32) struct_flags;
    Span(FunctionInfo) functions;

    OffsetAllocator* allocator; // Custom linear memory allocator
} ParsingContext;

Efficient Attribute Lookups Using a Hash Trie and Bitmasks

When libclang encounters an annotation during the AST traversal, it returns the attribute name as a raw C string. Doing a linear strcmp against a predefined list of attributes for every single field across the entire codebase would be inefficient for a pre-build step. To solve this, I initialize an HTrie (Hash Trie) when the tool starts up. This data structure maps the raw string names directly to a set of bitwise flags:

typedef enum AttributeType
{
    AttributeTypeNone = 0,
    AttributeTypeSerializable  = 1 << 0,
    AttributeTypeNonSerialized = 1 << 1,
    AttributeTypeSerializeAs   = 1 << 2,
    AttributeTypeEcsComponent  = 1 << 3,
    AttributeTypeUISlider      = 1 << 4,
    // ...
} AttributeType;

// Registry initialization
hash_trie_insert(registry, slice_char_from_cstr("ecs_component"), true)->payload = (uint64)AttributeTypeEcsComponent;
hash_trie_insert(registry, slice_char_from_cstr("serializable"), true)->payload = (uint64)AttributeTypeSerializable;

During the parsing phase, when the AST visitor function encounters an annotation node (CXCursor_AnnotateAttr), I extract the string and look it up in the HTrie. If a valid match is found, the parser simply applies a bitwise OR |= operation to the configuration bitmask of the current struct field:

HTrieNode* node = hash_trie_find(storage, attribute_prefix);
if (node)
{
    *attribute_mask |= node->payload;
    // Parse arguments if the annotation takes parameters (starts with '@')
}

Because of this setup, the actual code generation phase never has to process or compare strings. When the generator needs to check if a specific field should be skipped during serialization, it evaluates a cheap and basic bitwise operation, such as contains_any_flag(attributes, AttributeTypeNonSerialized).

Memory Management: Handling libclang String Allocations

The libclang C API is strictly designed around manual memory management, particularly when dealing with text. Whenever you query the AST and request a string, such as a structure name or a macro annotation, the library returns a specialized CXString object. The catch is that every single one of these objects must be explicitly freed by the caller using the clang_disposeString function.

If you miss even a few of these calls while parsing hundreds of source files, your lightweight CLI tool will quickly start leaking RAM. To prevent this without cluttering my parsing logic with endless cleanup code, I established a strict rule. As soon as the parser extracts a string, I immediately clone its contents into the memory pool of my own custom linear allocator OffsetAllocator. Once the text is safely copied into my contiguous buffer, I immediately dispose of the original libclang object. This approach keeps the API’s memory footprint completely clean, while giving the code generator fast, cache-friendly access to the data:

static Slice(char)
clone_clang_string(CXString str, OffsetAllocator* allocator)
{
    const char* cstr = clang_getCString(str);
    size_t length = strlen(cstr);

    Slice(char) result = (Slice(char)) {
        .data = (char*)offset_allocator_alloc(allocator, length + 1),
        .capacity = length,
    };
    memcpy(result.data, cstr, length + 1);
    return result;
}

    // ... later in the type/function visitor:
    CXString type_name = clang_getTypeSpelling(type);
    Slice(char) type_name_slice = clone_clang_string(type_name, ctx->allocator);
    clang_disposeString(type_name);

4. Traversing the AST: Sizes, Offsets, and Safety Guarantees

The main goal of building a reflection system is to fully automate the engine’s data pipelines. Because libclang acts as a real compiler frontend, by the time it has built the AST it has already calculated the exact memory alignment and total byte footprint (sizeof) of every type. It does this with the specific target architecture in mind, meaning we get perfectly accurate memory offsets right out of the box. I extract these offsets during the parsing phase to build the layout map.

static enum CXChildVisitResult
struct_field_visitor(CXCursor cursor, CXCursor parent, CXClientData client_data)
{
    TypeParsingContext* ctx = (TypeParsingContext*)client_data;
    FieldInfo* field = offset_allocator_alloc_t(FieldInfo, ctx->allocator);

    CXType field_type = clang_getCursorType(cursor);
    field->size = (uint32)clang_Type_getSizeOf(field_type);

However, there are edge cases in C that can break this process. For example, a struct might contain a Variable Length Array (VLA) or an incomplete type that was declared but not fully defined in the current translation unit. If you try to query the memory offset of a field in these situations, libclang cannot calculate it and will return a negative integer as an error code. To prevent the generator from producing corrupted offset data, I strictly check for these negative values and halt the parsing:

    long long offset = clang_Cursor_getOffsetOfField(cursor);
    if (offset < 0) {
        const char* error_reason = "Unknown Error";
        switch (offset) {
            case CXTypeLayoutError_Invalid: error_reason = "Not a valid field"; break;
            case CXTypeLayoutError_Incomplete: error_reason = "Incomplete type"; break;
            case CXTypeLayoutError_NotConstantSize: error_reason = "VLA or flexible array"; break;
        }
        meta_report_error(ctx->current_struct->name.data, "::", field->name.data, 
                             " field has invalid offset. Reason: ", error_reason, endl);
        exit(-1); // Hard stop the build process
    }
    field->offset = (uint32)(offset / 8); // Convert offset from bits to bytes

Handling Raw Pointers and Strict Validation

One of the biggest issues when writing any custom serialization system in C is dealing with raw pointers. If you naively serialize a structure containing a field like PathNode* current_target as a raw sequence of bytes, the physical memory address will be written directly to the file. Upon loading the game session later, that specific memory address will likely contain garbage data, and your engine will crash instantly with a segmentation fault upon pointer dereferencing.

How is this problem solved in the RDD paradigm? Our code generator acts as a ruthless, unforgiving compiler. During the parsing phase, we strictly classify all fields based on their core traits:

field->features |= (field_type.kind == CXType_Pointer || 
                   field_type.kind == CXType_IncompleteArray) 
                   ? FieldFeaturesPointer : FieldFeaturesNone;

If the parser detects that a structure is marked as serializable, but contains a field flagged with FieldFeaturesPointer internally, the generator immediately halts the build process and throws a fatal error.

To fix the broken build, it is necessary to specify how the reflection tool should handle that specific pointer by choosing one of the following options:

  • Ignore it: Mark the pointer with non_serialized. This is the standard approach for temporary runtime data, like cached pointers.
  • Deep Copy: Use a custom embed attribute, instructing the serializer to recursively traverse the reference and serialize the actual underlying data.
  • Custom Containers: If the data represents a complex structure (like a BVH tree or a linked list), the wrapper struct is annotated with the serialize_container macro, specifying an extern function name. To link everything together for the reflection tool, I tag both the wrapper struct and the extern function declaration with the same attribute.

Furthermore, for the core data containers in my codebase, such as Slice(T), Span(T), and Vec(T), I built native support directly into the tool. Since pure C lacks templates, these generic containers are implemented with standard preprocessor token pasting (for example, #define Span(T) Span_##T). When libclang encounters one of these types, the reflection tool recognizes it, extracts the underlying element type T, and uses its memory traits to optimize the generated code. If T is a simple primitive (POD), it outputs an efficient memcpy call. If T is complex, it generates a for-loop to serialize each element individually.


5. Code Generation: Bringing Metadata to Life

After the reflection tool finishes parsing the AST, evaluating the attribute bitmasks, and collecting the structural data, it moves to the final phase: generating the actual code. The tool outputs standard header and source files.

Here is a simplified sample of the ECS component metadata generation:

const size_t struct_count = span_size(&ctx->structs);
for (size_t i = 0; i < struct_count; ++i) {
    const uint32 attributes = *span_at(&ctx->struct_flags, i);

    // Generate reflection code exclusively for ECS components
    if (contains_any_flag(attributes, AttributeTypeEcsComponent)) {
        TypeInfo* type = span_at(&ctx->structs, i);

        file_write_str(&file, fmt_tmp(
            "EcsComponentReflection ", type->name.data, "_reflection = {\n",
            "    .name = \"", type->name.data, "\",\n",
            "    .size = ", type->size, ",\n",
            "    .alignment = ", type->alignment, ",\n",
            "    .is_pod = ", type->unsafe ? "false" : "true", ",\n"
        ));

        // Filter fields: skip pointers and non_serialized fields
        foreach_node(it, list, list_iter_t(FieldInfo, &type->fields, list)) {
            FieldInfo* field = it.value;

            if (contains_any_flag(field->features, FieldFeaturesPointer) ||
                contains_any_flag(field->attribute_mask, AttributeTypeNonSerialized)) {
                continue; 
            }

            file_write_str(&file, fmt_tmp(
                "        (EcsFieldReflection) {\n",
                "            .name = \"", field->name.data, "\",\n",
                "            .type = \"", span_at(&ctx->structs, field->type_handle)->name.data, "\",\n",
                "            .offset = ", field->offset, ",\n",
                "            .size = ", field->size, ",\n",
                "        },\n"
            ));
        }
    }
}

During this generation step, the tool relies on a specific macro called fmt_tmp. Under the hood, this is a C11 _Generic macro, heavily inspired by the architectural ideas found in Jackson Allan’s “Convenient Containers” library.

The biggest advantage of the resulting architecture is that the final output is just a set of valid, human-readable C source files. If a data corruption bug occurs or the engine crashes, I can simply open the generated file, place a breakpoint inside the serialization loop, and step through the memory logic using a standard debugger, exactly like I would with any regular hand-written C code.

Build Pipeline Integration (CMake)

For the main build pipeline and CI/CD builds, the reflection tool is integrated directly into the build system using CMake’s add_custom_command. It is set up as a strict pre-build step before the target engine binary compiles. This architecture ensures that the parser always scans the most recent source tree, completely eliminating the risk of human error and ensuring that the generated runtime metadata never falls out of sync with the actual codebase.
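A minimal version of that hookup could look like this; the target names and paths here are placeholders, not the engine's actual build files:

```cmake
# Run the reflection tool as a pre-build step of the engine target.
add_custom_command(
    OUTPUT  ${CMAKE_BINARY_DIR}/generated/reflection.gen.c
    COMMAND reflection_tool
            --input  ${CMAKE_SOURCE_DIR}/src
            --output ${CMAKE_BINARY_DIR}/generated/reflection.gen.c
    DEPENDS ${ENGINE_HEADERS}          # re-run when tagged headers change
    COMMENT "Generating reflection metadata")

add_executable(engine ${ENGINE_SOURCES}
               ${CMAKE_BINARY_DIR}/generated/reflection.gen.c)
```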


Conclusion

Reflection Driven Development completely changes how you write architecture in C. Yes, building this toolchain requires a serious upfront investment.

However, the long-term architectural payoff is massive:

  • True DRY: You define a data structure exactly once. ECS registration, binary serialization, and UI inspectors are generated automatically.
  • Build-time Safety: It becomes impossible to forget a new field or accidentally serialize a raw pointer.
  • Transparent Debugging: The system compiles down to plain, readable C code. There is no reliance on complex SFINAE templates, linker hacks, or black-box macros.
  • Fast Iteration: Adding a new gameplay component takes only as long as defining the struct itself.

Many developers argue that pure C is too primitive for modern engines, heavily favoring C++ or Rust. But C provides a minimal foundation, strict memory control, and incredibly fast compilation times. If you miss high-level features like reflection, you have the power to just generate them.
