Risk adjustment software is a multi-billion-dollar market, and it’s still growing. That growth has attracted dozens of vendors, all claiming AI-powered accuracy and audit readiness. Most of those claims are noise.
What’s changed in 2026 isn’t technology. It’s the consequences. The Department of Justice collected $117.7 million from Aetna in March 2026 for submitting unsupported diagnosis codes and failing to remove them. Kaiser Permanente settled for roughly $1 billion on similar allegations. OIG audits published that same month found error rates between 81% and 91% across three Medicare Advantage organizations. The government isn’t warning anymore. It’s collecting.
If you’re evaluating risk adjustment software right now, the question isn’t “which tool finds the most codes?” It’s “which tool helps me prove every code I submit?”
This article breaks down what risk adjustment software does, what capabilities actually matter in the current enforcement climate, and where the category is headed.
What Risk Adjustment Software Does
Risk adjustment software analyzes clinical documentation, claims data, and patient records to identify, validate, and manage diagnosis codes used for reimbursement in value-based care programs. In Medicare Advantage alone, CMS pays private insurers over $530 billion annually, with payments adjusted up or down based on how sick each member is. The accuracy of those risk scores determines whether a health plan gets paid fairly, overpaid, or underpaid.
The core workflow looks like this: ingest medical records and claims data, use AI to identify diagnosis codes (specifically HCC codes, or Hierarchical Condition Categories), validate those codes against clinical evidence in the documentation, and present results to human coders for final review and submission.
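The workflow above can be sketched in a few lines of code. This is a deliberately toy illustration: the keyword mapping, function names, and review structure are all made up for clarity, not the real CMS-HCC model or any vendor’s API.

```python
# Toy sketch of the core workflow: ingest a record, identify candidate
# HCC codes, link each to its supporting documentation, and queue the
# results for a human coder's final call. Mapping and field names are
# illustrative assumptions only.

KEYWORD_TO_HCC = {
    "diabetes with chronic complications": "HCC 37",   # hypothetical mapping
    "heart failure": "HCC 226",                        # hypothetical mapping
}

def process_chart(record_text: str) -> list[dict]:
    text = record_text.lower()
    queue = []
    for phrase, hcc in KEYWORD_TO_HCC.items():
        if phrase in text:  # identify a candidate code in the documentation
            queue.append({
                "code": hcc,
                "evidence": phrase,                 # link code to clinical evidence
                "status": "pending_human_review",   # coder makes the final decision
            })
    return queue

queue = process_chart("Assessment: heart failure, stable on current regimen.")
print(queue)  # one pending item for HCC 226, linked to its evidence phrase
```

Real systems replace the keyword lookup with NLP or machine-learning models, but the shape of the pipeline (identify, link evidence, route to human review) is the same.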
Good software does this faster and more accurately than manual processes. Great software does it in a way that survives an audit.
The Compliance Shift Nobody Can Ignore
For years, risk adjustment was treated as a revenue function. Health plans hired vendors to mine charts, find missed diagnoses, and add codes that increased their Risk Adjustment Factor (RAF) scores. The more codes you added, the more CMS paid. Simple math, and it worked for a long time.
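The “simple math” is worth making concrete. Roughly speaking, a member’s RAF score is a demographic base plus a coefficient for each documented HCC, and the monthly payment scales with that score. The numbers below are invented for illustration, not actual CMS coefficients or benchmarks.

```python
# Illustrative RAF arithmetic (all values are made up, not CMS figures):
# each additional HCC raises the risk score, which raises the payment.

base_rate = 900.00          # hypothetical monthly benchmark, in dollars
demographic_factor = 0.40   # hypothetical age/sex component
hcc_coefficients = {"HCC 37": 0.30, "HCC 226": 0.35}  # hypothetical weights

raf = demographic_factor + sum(hcc_coefficients.values())
payment = base_rate * raf

print(f"RAF {raf:.2f} -> ${payment:,.2f}/month")
```

Adding one more HCC with a 0.30 coefficient would push the payment up by $270 a month for this member, which is exactly why add-only chart mining was so lucrative.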
That model is now a liability.
The DOJ’s case against Aetna is instructive. Aetna ran a chart review program in payment year 2015 where it hired coders to review medical records. Those reviews turned up additional codes (which Aetna submitted) and unsupported codes (which Aetna kept in place). The government’s argument: using chart reviews to add codes while ignoring results that showed overcoding is evidence of intent to inflate payments.
This is the pattern that should worry every health plan running a retrospective program: if your software only adds codes and never removes them, you’ve built exactly the kind of system regulators are targeting.
Three OIG audits released in March 2026 reinforce the point. One audit of a major southeastern Medicare Advantage organization (A-07-22-01207) found a 91% error rate on high-risk diagnosis codes, with 100% error rates for acute stroke and acute myocardial infarction. Most errors came from history-of conditions coded as active diagnoses. A second audit found 84% error rates. A third, 81%.
These aren’t outliers. They’re the baseline.
The Five Capabilities That Actually Matter
After working with health plans, ACOs, and provider organizations processing millions of patient records, we’ve found five capabilities that separate useful risk adjustment software from expensive shelfware.
1. Two-Way Coding (Adds and Deletes)
This is the single most important capability to evaluate, and one that’s hard to find in the market.
Two-way coding means the software identifies codes that should be added (legitimate diagnoses missed in claims) AND codes that should be removed (diagnoses in claims that lack clinical support). Every compliance-first program needs both directions.
The Aetna settlement makes the stakes clear: $106.2 million of the $117.7 million penalty traced back to a chart review program that only worked in one direction. The enforcement pattern leaves little room for interpretation: supplemental data submission is expected to work in both directions. Your software should too.
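In data terms, two-way review is a comparison between two code sets: what was claimed and what the documentation actually supports. A simplified sketch (real systems validate each code against clinical evidence, not just code lists; the codes here are placeholders):

```python
# Two-way coding as set logic. "claimed" comes from submitted claims;
# "supported" comes from chart review with evidence validation.
# Code values are placeholders for illustration.

claimed = {"HCC 37", "HCC 226", "HCC 2"}
supported = {"HCC 37", "HCC 280"}

adds = supported - claimed       # documented but never submitted
deletes = claimed - supported    # submitted but lacking clinical support
confirmed = claimed & supported  # submitted and supported

print(adds, deletes, confirmed)
```

An add-only program computes the first set and throws the second away. The second set is the one regulators now expect you to act on.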
At RAAPID, two-way coding isn’t an optional feature. It’s the default. Our Neuro-Symbolic AI identifies underclaimed codes (potential adds), overclaimed codes (potential deletes), and properly claimed codes in a single review cycle. This isn’t a philosophical choice; it’s a direct response to how enforcement actually works.
2. Explainable AI with Evidence Trails
“AI-powered” is now table stakes. Every vendor in the market claims some form of artificial intelligence. The question that matters: can the AI show its work?
When a diagnosis code gets flagged for review, the software should trace that recommendation back to specific clinical evidence in the medical record. This means linking each code to documentation that meets MEAT criteria (Management, Evaluation, Assessment, and Treatment), the standard CMS uses to validate HCC diagnoses.
If the AI can’t explain why it suggested a code, that code is indefensible under audit. Full stop.
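A minimal way to picture an evidence trail: every suggested code carries the excerpts it was derived from, and each excerpt is tagged with the MEAT component it satisfies. The data structure and field names below are assumptions for illustration, not any platform’s actual schema.

```python
# Sketch of an auditable code suggestion: the code, the source document,
# the supporting text, and the MEAT component(s) that text satisfies.
# All names and values are hypothetical.

MEAT = {"management", "evaluation", "assessment", "treatment"}

suggestion = {
    "code": "HCC 226",
    "evidence": [
        {"source": "progress_note_2026-01-14.txt",
         "text": "CHF stable; continue lisinopril and daily weights.",
         "meat_components": ["assessment", "treatment"]},
    ],
}

def is_defensible(s: dict) -> bool:
    """A suggestion is defensible only if at least one linked excerpt
    satisfies a recognized MEAT component."""
    return any(
        bool(set(e["meat_components"]) & MEAT)
        for e in s["evidence"]
    )

print(is_defensible(suggestion))
```

A suggestion with no linked evidence fails this check, which is the software equivalent of “indefensible under audit.”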
RAAPID’s Neuro-Symbolic AI combines neural networks with symbolic reasoning, including knowledge graphs and rule-based clinical logic. This architecture produces a transparent decision trail for every code suggestion: here’s the evidence in the chart, here’s how it maps to the diagnosis, and here’s why it meets or fails MEAT criteria. Unlike pure NLP or generative AI approaches, neuro-symbolic systems are far less prone to hallucination because outputs are constrained by knowledge graphs and clinical rules. They don’t require a human to reverse-engineer the AI’s reasoning after the fact.
3. RADV Audit Readiness
CMS launched payment year 2020 RADV audits in February 2026, with audits now running on a quarterly cadence. The agency restored its five-month medical record submission window and is using variable sample sizes (35 to 200 enrollee-years per audit). CMS has also announced plans to scale its coding workforce from roughly 40 to approximately 2,000 certified coders and is using AI as a support tool for its own audit process.
Translation: audits are faster, bigger, and more frequent than ever.
Software that isn’t purpose-built for RADV preparation leaves health plans scrambling when the audit notice arrives. Look for platforms that offer centralized audit management, real-time progress tracking, CMS-compliant report generation, and the ability to manage concurrent audits from a single interface.
RAAPID’s RADV Audit Solution functions as a command center for the entire audit lifecycle, from initial notification through evidence submission and response. Every chart reviewed through RAAPID’s retrospective workflow already carries the MEAT-validated evidence trail needed for audit defense. The audit module doesn’t create defensibility after the fact; it surfaces documentation that was validated during the coding process itself.
4. Prospective and Retrospective Coverage
Retrospective review (looking back at medical records after encounters) is necessary, but it’s no longer sufficient on its own. CMS has signaled a clear preference for encounter-driven documentation: diagnoses confirmed during actual patient visits rather than discovered months later through chart mining.
The safest diagnosis is one documented at the point of care by the treating clinician.
Prospective risk adjustment software supports providers before and during visits by surfacing conditions that need recapture, flagging care gaps, and providing clinical decision support in the EHR workflow. This isn’t about telling doctors what to code. It’s about making sure relevant clinical information is available when the provider is actually with the patient.
RAAPID offers both prospective and retrospective solutions under a single Clinical AI Platform. The prospective solution analyzes two years of patient data from charts, claims, and lab reports to create pre-visit summaries. The retrospective solution handles chart review, code validation, and audit preparation. Used together, they cover the full risk adjustment lifecycle: prospective captures diagnoses at the safest point (the encounter), and retrospective cleans up what was missed and removes what shouldn’t be there.
5. Security and Deployment Flexibility
Risk adjustment software processes protected health information (PHI) at massive scale. Security certifications aren’t optional.
Look for HITRUST certification and SOC 2 Type II compliance at minimum. Not all vendors in the market carry both. And ask about deployment options: can the platform run in your cloud environment, or does it require you to send PHI to a vendor-controlled infrastructure?
RAAPID holds both HITRUST and SOC 2 Type II certifications and deploys on Azure, AWS, GCP, or within a customer’s own cloud environment. That kind of flexibility matters for enterprise buyers who need to align with existing IT governance and security policies.
What to Watch Out For
A few red flags to keep in mind during vendor evaluations.
Accuracy claims on small sample sizes. Nearly every vendor will show you 98% accuracy on a pilot of 200 charts. That number is meaningless at production scale. Ask for performance data on 2,000 or more charts processed without human intervention. If the vendor can’t provide that, the “AI” is really human QA with a software wrapper.
Add-only retrospective programs. If the vendor’s retrospective solution identifies codes to add but never flags codes to remove, you’re building the exact risk profile the DOJ just penalized Aetna for. Ask specifically: “Does your platform identify overclaimed codes?” If the answer involves hedging, walk away.
Opaque AI. If the vendor can’t explain the technical architecture behind their AI, that’s a problem. “We use AI” tells you nothing. Ask whether they use NLP, machine learning, generative AI, or something else. Ask how the system explains its code recommendations. Ask for a sample evidence trail. If the demo only shows the recommendation without the reasoning, assume there isn’t any.
No RADV module. Some platforms treat RADV audit preparation as a consulting add-on rather than a built-in capability. Given the quarterly audit cadence, RADV readiness should be native to the platform, not bolted on after the fact.
Where the Category Is Headed
The OIG’s updated Medicare Advantage Industry Compliance Program Guidance, released in February 2026, signals where enforcement is going. The guidance flags chart reviews, in-home health risk assessments, and EHR prompts as practices that warrant close oversight. It warns that failing to remove unsupported codes is a compliance failure. And it explicitly calls out the need for MAOs to review AI and software tools used in the coding process.
Meanwhile, MedPAC’s March 2026 report found that Medicare spends 14% more on MA enrollees than it would under traditional fee-for-service, totaling roughly $76 billion in excess payments, with coding intensity driving $22 billion of that gap. Bipartisan legislation (the No UPCODE Act, S.1105) would exclude chart review and health risk assessment diagnoses from risk adjustment entirely, with CBO estimating $124 billion in savings over 10 years.
The direction is unmistakable. Risk adjustment is shifting from a revenue function to a compliance function. Software built to maximize code capture will become a liability. Software built to prove clinical accuracy and documentation integrity will become essential.
RAAPID was built for this shift. Not because we predicted it, but because defensible coding was the founding principle from day one. Two-way retrospective review, MEAT-validated evidence trails, prospective support at the point of care, and purpose-built RADV audit management aren’t features we added in response to enforcement pressure. They’re why the platform exists.
If you’re evaluating risk adjustment software in 2026, start with this question: will this tool help me prove every code I submit, or will it just help me find more codes to submit? The answer determines whether you’re building a program that survives the next five years, or one that becomes a case study in what not to do.
:::tip
This story was distributed as a release by Sanya Kapoor under HackerNoon’s Business Blogging Program.
:::