The 800 Billion Lines AI Can’t Touch
Last week, our platform had a P1.
Surveys going out under the wrong name — the system owner’s, not the project manager’s. A handful of customers affected. Small blast radius by some measures. Not small to the people who got the calls.
CSMs flagged it immediately. Slack lit up. One channel, then three. Engineers pulled in, support scrambling, escalation emails going out while we were still diagnosing. I watched the thread move in real time. Uncomfortable doesn’t quite cover it.
Four hours later, we found it.
An environment variable. Missing from a deployment. We’d shipped a big release the day before — clean, tested, signed off. Somewhere in that process, one variable didn’t make the trip. One engineer reproduced the issue in under an hour once we knew where to look. Fix deployed. Data corrected. Apology emails sent.
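The class of fix here is old but effective: fail fast on missing configuration at startup instead of letting a deploy misbehave silently. A minimal sketch in Python, with hypothetical variable names since the post doesn't name the actual one:

```python
import os
import sys

# Hypothetical names for illustration only; the incident's real variable isn't named here.
REQUIRED_ENV_VARS = [
    "SURVEY_SENDER_NAME_SOURCE",  # which person's name appears on outbound surveys
    "DATABASE_URL",
    "SMTP_HOST",
]

def validate_env(required=None):
    """Crash loudly at deploy time if any required configuration is missing."""
    required = REQUIRED_ENV_VARS if required is None else required
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        # A failed startup is visible in minutes; a silent misconfiguration
        # can run for hours before a customer notices.
        sys.exit(f"FATAL: missing required environment variables: {', '.join(missing)}")
    return True
```

Called first thing at service startup, a check like this turns a four-hour diagnosis into a deploy that refuses to come up.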
Done.
Why It Was Fixable
Here’s what I’ve kept thinking about since.
Halfway through the incident, our CSM pinged me: a champion at one of our larger customers was really upset. I know him. I was on a different call — killed my audio, turned off my video, picked up my phone and texted him directly. Apologized. Told him we were on it. Promised an update fast.
I know what that moment feels like from his side. He’s the one who brought us in. His credibility inside his own company is on the line every time something like this happens.
It hurts to see. But I’ve been through this too many times. It’s software. Find it with the right urgency and focus, and it gets fixed.
We found that bug in four hours because our codebase is modern enough to reason about. Deployment history is clean. The issue was reproducible. Trace it, isolate it, fix it. That’s enterprise software’s unspoken contract: something breaks, something caused it, and if you work hard enough the answer exists somewhere in the logs.
I’ve spent fifteen years building and running enterprise software. Security systems protecting hundreds of millions of users. Payment infrastructure running billions of transactions. Every production incident followed the same grammar. Bug has a cause. Cause has a location. Location can be found.
That contract is baked so deep into how engineers think that most never name it. It’s just the rules.
But the contract depends on something the AI productivity conversation almost never mentions: institutional knowledge. Engineers who understand what they’re looking at. Documentation that exists. A codebase young enough that a human can still reason through it from start to finish.
Most of the world’s critical infrastructure doesn’t have that anymore.
The Other Kind
We have legacy code nobody wants to touch.
The developers who wrote it aren’t here anymore. When something changes in those areas, it breaks three other things. Tech debt accumulated before I arrived, before some of my engineers were in the workforce. Areas I call danger zones — not because we can’t eventually fix them, but because fixing them means untangling decisions made by people who aren’t available to explain why.
I’ve seen danger zones at every software company I’ve worked for or built. Every single one.
Southwest Airlines is what happens when the danger zones are the whole system.
Christmas 2022. 15,000 flights cancelled in 10 days. Forty years of patchwork. Engineers who understood the original logic had retired. Some had died. Documentation covered the current layer. Not what was underneath.
$825 million.
When institutional knowledge walks out the door, the contract walks with it. You can’t trace what you can’t read. A bug stops being a fire drill. It becomes a catastrophe.
My P1 last week was fixable because our core system is modern. Our danger zones? Different rules entirely.
The Number Nobody Quotes
800 billion lines of COBOL and legacy code are running global infrastructure right now.
95% of ATM transactions. The IRS has been modernizing since 2000 — still on COBOL. UK Parliament’s Treasury Committee flagged legacy banking systems as a systemic risk in March 2025. Federal agencies on systems 40 to 60 years old, per GAO’s FY2024 report.
And it’s not just government. Manufacturing lines. Hospital records. Insurance underwriting. Airline scheduling.
AI writes new code faster than it’s ever been written. Real. Not disputing it.
But those 800 billion lines were written before most working engineers were born. They don’t get faster because AI got smarter. They get older. And the engineers who knew them keep retiring.
The Measurement Gap
METR published research in July 2025 on AI software engineering capability. Worth reading carefully if you advise clients on tech modernization.
On greenfield tasks — new features, new modules, new projects — AI’s contribution is real. Headline numbers hold up.
Performance on large existing codebases? Different story. Ask any senior engineer how long it took to get productive on a legacy codebase they didn’t build. The answer is usually months. Not days. Because understanding why a system works the way it does requires more than reading the code. It requires knowing the decisions that existed before the code did — the constraints, the shortcuts, the things that got patched because someone left the company in 2003 and nobody wrote it down.
Pattern matching on syntax doesn’t solve that. Never will.
Every AI productivity gain you’ve read about was measured on new code. Your clients’ billing systems, scheduling software, compliance infrastructure — not new. The bottleneck didn’t disappear. It moved.
The Wrong Problem
Clients are asking directly now. “We read that AI is making developers faster. Why does this engagement cost more than last year’s?”
For greenfield work — new products, new integrations — that question has a different answer now. AI does make that faster. Worth acknowledging.
But modernization work? Legacy migration? The 40-year-old system that needs replacing before it becomes a Southwest moment? That work doesn’t get easier. It gets harder. The institutional knowledge gap keeps widening. Systems keep aging. Engineers who can reason through code they didn’t write, in languages most programs stopped teaching — those engineers are more valuable now.
Not less. More.
Before your next modernization proposal, ask the client one question: who on their team can explain the existing system to a new engineer in a single day? If nobody raises their hand, you’ve just found the real project risk. And the real budget conversation.
The expensive part was never writing new code. It was always untangling what already exists.
The Long Game
My P1 is resolved. We lost some trust with a few customers. Earned some back with how we handled it. Postmortem written, lesson logged.
Our danger zones are still there. And now we’re building agentic AI on top of them.
AI is changing what engineers can build. It is not changing what they need to understand.
The firms that win the next decade of digital transformation won’t be the ones who mastered new tools. They’ll be the ones who understood what those tools can’t reach.
That’s still most of what’s running.