Real-World Evidence and Biomarker AI

Published: May 24, 2026

Real-world evidence and biomarker AI sit where clinical data, molecular data, and therapeutic development meet. The opportunity is to learn from data generated outside tightly controlled trials and to discover biological signals that improve development decisions. The risk is that messy data, confounding, missingness, and weak context-of-use definitions can make an association look like evidence.

Learning Objectives

This chapter gives you a framework for evaluating RWE and biomarker AI claims. You will learn to:

  • Distinguish real-world data from real-world evidence and operational analytics from inferential evidence
  • Read FDA RWE materials and biomarker-qualification materials as context-of-use frameworks
  • Separate prognostic, predictive, pharmacodynamic, safety, surrogate, and companion-diagnostic biomarkers
  • Evaluate multi-omic biomarker discovery against data provenance, confounding, missingness, population fit, and validation design
  • Treat Tempus, Flatiron Health, Komodo Health, and Datavant/Aetion as platform context rather than proof of decision impact
  • Recognise when patient-level clinical decision support is outside this handbook’s scope

RWE data sources:

Source | What it contributes | Main caution
EHR-derived data | Clinical events, labs, treatments, notes, outcomes | Missingness, coding drift, documentation bias
Claims | Longitudinal utilization and billing-coded events | Limited clinical granularity
Registries | Disease-focused structured follow-up | Selection and site participation bias
Genomic and molecular data | Biomarker and disease biology context | Consent, assay versioning, population fit
Wearables and devices | Dense longitudinal physiology | Adherence, device drift, endpoint relevance
Imaging and pathology | Tissue, disease, and treatment context | Acquisition and annotation variation

Biomarker categories:

Category | What it means | Development question
Prognostic | Associated with outcome regardless of treatment | Who has higher baseline risk?
Predictive | Associated with differential treatment effect | Who benefits more from a therapy?
Pharmacodynamic | Shows biological response to intervention | Did the drug engage the pathway?
Safety | Flags toxicity or risk | Who is at risk of harm?
Surrogate | Substitutes for clinical endpoint in a defined context | Is the substitute endpoint validated?
Companion diagnostic | Selects patients for a therapy | Is there an approved or qualified test pathway?

Regulatory anchors:

Topic | Source | Practical reading
FDA RWE program | FDA, 2026 | RWE acceptability depends on context, data, design, and analysis
FDA RWE framework | FDA, 2018 | Cures Act framework for evaluating RWE use in drug development
Biomarker qualification | FDA, 2026 | Qualified biomarkers are tied to a specific context of use

Three failures that look like success:

Failure mode | Looks like | Actually means
Confounding-as-signal | Strong association in EHR or claims | Treatment selection, disease severity, or access explains the result
Biomarker overfitting | Multi-omic model performs well internally | Site, assay, or cohort drift may break replication
Context-of-use blur | “RWE supports approval” | The exact decision, endpoint, population, and analysis must be specified

Introduction

Randomized trials remain the central evidence source for therapeutic efficacy, but they do not answer every development question. Real-world data can inform natural history, feasibility, external controls, safety surveillance, treatment patterns, comparative effectiveness hypotheses, and post-approval evidence. Biomarkers can help identify risk, mechanism, treatment response, safety, or a defined surrogate endpoint.

FDA’s real-world evidence program and framework are explicit that context matters (FDA, 2026; FDA, 2018). The same dataset may support descriptive epidemiology, fail as causal evidence, and still be useful for trial design. The first question is not which model to use. It is what decision the evidence is meant to support.

Demonstrated

Real-World Data Curation

RWE work begins with data curation. EHR-derived data, claims, registries, pharmacy records, imaging, pathology, genomics, and device data all require provenance, harmonisation, missingness checks, and audit trails. AI methods can support abstraction, coding, endpoint extraction, de-identification, and entity resolution, but the evidence value comes from the curated dataset and analysis design.
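One of the curation checks named above, a missingness check, can be sketched in a few lines. This is a minimal illustration rather than a curation pipeline: the function name `missingness_map`, the record layout, and the field names `site`, `ecog`, and `ldh` are all hypothetical, and a value is assumed to be missing when its key is absent or its value is `None`.

```python
from collections import defaultdict


def missingness_map(records, fields, group_key="site"):
    """Return {group: {field: fraction_missing}}.

    Assumption: a value is missing when the key is absent or the value is None.
    """
    missing = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        group = rec.get(group_key, "unknown")
        totals[group] += 1
        for field in fields:
            if rec.get(field) is None:
                missing[group][field] += 1
    return {
        group: {field: missing[group][field] / totals[group] for field in fields}
        for group in totals
    }


# Hypothetical toy records; real curation would read from governed extracts.
records = [
    {"site": "A", "ecog": 1, "ldh": 210},
    {"site": "A", "ecog": None, "ldh": 190},
    {"site": "B", "ecog": 0, "ldh": None},
    {"site": "B", "ecog": None},  # "ldh" key absent counts as missing
]
result = missingness_map(records, ["ecog", "ldh"])
for site, by_field in sorted(result.items()):
    print(site, by_field)
```

Grouping by site is a deliberate choice in this sketch: missingness that differs sharply between sites is often a documentation or workflow pattern rather than a clinical one, and a pooled rate would hide it.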

The FDA RWE page and framework define the regulatory boundary: RWE may support new indications for approved drugs and post-approval study requirements in selected contexts, but suitability depends on data relevance, reliability, study design, and analysis plan (FDA, 2026; FDA, 2018).

Biomarker Discovery and Qualification

Biomarker AI spans discovery, measurement, validation, and qualification. A model may identify molecular, imaging, digital, pathology, or clinical features associated with outcome. The professional distinction is category and context: prognostic, predictive, pharmacodynamic, safety, surrogate, or companion diagnostic.

FDA’s Biomarker Qualification Program ties biomarker use to a context of use (FDA, 2026). Qualification does not mean a biomarker is universally valid. It means the biomarker is accepted for the defined purpose and conditions.

Context of use is the evidence anchor. Without it, a biomarker claim is only an association.

Multi-Omic Biomarker Integration

Multi-omic biomarker discovery combines genomics, transcriptomics, proteomics, metabolomics, pathology, imaging, and clinical data. The value is strongest when each modality answers a different biological question and the validation endpoint is specified. The risk is that the model learns batch, site, assay version, or cohort structure.

Strong multi-omic biomarker work includes sample-level provenance, assay versioning, missingness maps, population stratification, train-test separation by site or time, and orthogonal validation. If a biomarker is intended to predict treatment effect, prognostic performance is not enough.
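Train-test separation by site can be illustrated with a small sketch. The helper `split_by_site` and the sample layout are hypothetical, and the split is deliberately simple: whole sites are assigned to either train or test so that site-specific batch or assay structure cannot leak across the boundary.

```python
import random


def split_by_site(samples, test_fraction=0.3, seed=0):
    """Assign whole sites to train or test so no site appears in both."""
    sites = sorted({s["site"] for s in samples})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(sites)
    n_test = max(1, round(len(sites) * test_fraction))
    held_out = set(sites[:n_test])
    train = [s for s in samples if s["site"] not in held_out]
    test = [s for s in samples if s["site"] in held_out]
    return train, test


# Hypothetical samples spread across five sites.
samples = [{"id": i, "site": f"site_{i % 5}"} for i in range(50)]
train, test = split_by_site(samples)
train_sites = {s["site"] for s in train}
test_sites = {s["site"] for s in test}
print(sorted(train_sites), sorted(test_sites))
```

A split by time would follow the same pattern with a collection-date cutoff in place of the site set. Note that with a single site this sketch holds out everything, which is itself a signal that site-level validation is not possible with the data at hand.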

Platform Category

Tempus, Flatiron Health, Komodo Health, and Datavant/Aetion illustrate the platform category for clinical, molecular, and real-world data in life sciences (Tempus, 2026; Flatiron Health, 2026; Komodo Health, 2026; Datavant, 2026). Tempus’s 2024 S-1 filing is useful industrial context for the clinical and molecular data platform thesis (Tempus AI, 2024).

Company sources establish category and positioning. They do not establish that a specific RWE or biomarker claim is decision-grade. For diligence, request data dictionaries, cohort definitions, endpoint definitions, missingness analysis, audit trails, and external validation.

Beyond Trial Operations

Clinical trial AI covers operational use cases such as recruitment, site selection, monitoring, and endpoint extraction. RWE and biomarker AI form a distinct evidence category because they ask whether non-trial data or model-derived biomarkers can support inference. The overlap is real, but the evidentiary burden differs.

The clean separation is this: trial operations improve execution; RWE and biomarker work may affect evidence. Evidence-affecting use carries a higher standard.

Theoretical

AI-Derived Predictive Biomarkers

AI-derived predictive biomarkers are plausible when the model identifies patients with differential treatment benefit. The challenge is causal: prognostic markers are easier to find than predictive markers, because a prognostic association can be observed within a single treatment group, whereas a predictive claim requires comparing treatment effects across biomarker strata. Demonstrating treatment-effect modification requires appropriate trial or quasi-experimental design, not only observational association.
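The difference between a prognostic and a predictive marker can be made concrete with a toy calculation. The sketch below assumes randomized treatment assignment and uses invented response counts; the helper names (`make_rows`, `response_rate`, `interaction_contrast`) are illustrative only. It computes the treatment effect within each biomarker stratum and then the difference between those effects, which is the quantity a predictive claim rests on.

```python
def make_rows(cells):
    """Expand {(marker, treated): (n_responders, n_total)} into patient rows."""
    rows = []
    for (marker, treated), (n_resp, n_total) in cells.items():
        for i in range(n_total):
            rows.append({"marker": marker, "treated": treated,
                         "response": int(i < n_resp)})
    return rows


def response_rate(rows, marker, treated):
    subset = [r for r in rows if r["marker"] == marker and r["treated"] == treated]
    return sum(r["response"] for r in subset) / len(subset)


def interaction_contrast(rows):
    """Return (effect in marker+, effect in marker-, difference of effects)."""
    effect_pos = response_rate(rows, 1, 1) - response_rate(rows, 1, 0)
    effect_neg = response_rate(rows, 0, 1) - response_rate(rows, 0, 0)
    return effect_pos, effect_neg, effect_pos - effect_neg


# Invented counts. Prognostic only: marker-positive patients respond more often
# in both arms, but the treatment effect is the same in both strata.
prognostic = make_rows({(1, 1): (70, 100), (1, 0): (50, 100),
                        (0, 1): (40, 100), (0, 0): (20, 100)})
# Predictive: treatment helps marker-positive patients only.
predictive = make_rows({(1, 1): (60, 100), (1, 0): (20, 100),
                        (0, 1): (20, 100), (0, 0): (20, 100)})

prognostic_contrast = interaction_contrast(prognostic)[2]  # close to 0
predictive_contrast = interaction_contrast(predictive)[2]  # close to 0.4
print(prognostic_contrast, predictive_contrast)
```

In the first dataset the marker separates outcomes well yet says nothing about who benefits from treatment. These are point estimates on fabricated counts; a real analysis would need a prespecified interaction test, uncertainty intervals, and a design that supports the comparison.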

External Controls and Synthetic Arms

RWE can support external-control or synthetic-control work in selected settings, especially rare disease or oncology contexts with strong natural-history data and ethical constraints on randomization. The theoretical value is high, but bias control is difficult: credibility depends on eligibility alignment, calendar time, outcome definition, missingness, treatment changes, and unmeasured confounding.
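One routine check when comparing a trial arm with an external control is covariate balance. A minimal sketch, assuming a continuous baseline covariate and using the standardized mean difference (difference in means divided by the pooled standard deviation); the function name and the age values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(trial_values, external_values):
    """(mean_trial - mean_external) / sqrt((var_trial + var_external) / 2)."""
    pooled_sd = sqrt((variance(trial_values) + variance(external_values)) / 2)
    return (mean(trial_values) - mean(external_values)) / pooled_sd


# Invented baseline ages for a trial arm and an external-control cohort.
trial_age = [60, 62, 64, 66, 68]
external_age = [70, 72, 74, 76, 78]
smd = standardized_mean_difference(trial_age, external_age)
print(round(smd, 2))
```

An absolute value around 0.1 is a commonly used rule of thumb for acceptable balance, but that threshold is a convention rather than a regulatory standard. Balance on measured covariates also says nothing about unmeasured confounding, calendar-time effects, or differences in outcome definition.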

Learning Health Data Loops

The long-term promise is a learning loop where real-world data informs biomarker discovery, trial design, post-market safety, and label refinement. The practical barrier is governance: consent, privacy, data contracts, quality standards, model monitoring, and institutional accountability.

Beyond Current Capabilities

Causal Treatment Effects from Observational Models Alone

No observational model establishes treatment effect without design assumptions. Confounding by indication, immortal-time bias, measurement bias, missingness, and treatment selection remain central threats.
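Confounding by indication can be shown with invented counts. In the sketch below, sicker patients are more likely to be treated and more likely to have the event, so the crude comparison makes a beneficial treatment look harmful. The data, the function names, and the assumption that disease severity is the only confounder and is fully measured are all illustrative.

```python
# Invented counts: (events, patients) per treatment group within each stratum.
strata = {
    "severe": {"treated": (400, 800), "untreated": (120, 200)},
    "mild": {"treated": (20, 200), "untreated": (160, 800)},
}


def crude_risk_difference(strata):
    """Risk in treated minus risk in untreated, ignoring severity."""
    t_events = sum(s["treated"][0] for s in strata.values())
    t_total = sum(s["treated"][1] for s in strata.values())
    u_events = sum(s["untreated"][0] for s in strata.values())
    u_total = sum(s["untreated"][1] for s in strata.values())
    return t_events / t_total - u_events / u_total


def standardized_risk_difference(strata):
    """Stratum-specific risk differences, weighted by stratum size."""
    total = sum(s["treated"][1] + s["untreated"][1] for s in strata.values())
    rd = 0.0
    for s in strata.values():
        weight = (s["treated"][1] + s["untreated"][1]) / total
        risk_t = s["treated"][0] / s["treated"][1]
        risk_u = s["untreated"][0] / s["untreated"][1]
        rd += weight * (risk_t - risk_u)
    return rd


crude = crude_risk_difference(strata)            # about +0.14
adjusted = standardized_risk_difference(strata)  # about -0.10
print(round(crude, 2), round(adjusted, 2))
```

The sign flips once severity is accounted for. The adjustment works here only because the confounder was measured and the strata were built into the example; in real data the same arithmetic cannot rescue an analysis from confounders that were never recorded.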

Universal Biomarker Transfer

A biomarker trained in one cohort, assay, ancestry mix, disease stage, or care setting should not be assumed to transfer. Transportability needs explicit evaluation.
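An explicit transportability check can start with something as simple as reporting discrimination separately for each cohort instead of pooling. The sketch below computes AUC as the fraction of positive-negative pairs ranked correctly (ties count one half); the helper names, cohort labels, scores, and outcomes are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random positive scores above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_cohort(rows, cohort_key="cohort"):
    groups = {}
    for row in rows:
        groups.setdefault(row[cohort_key], []).append(row)
    return {
        cohort: auc([r["score"] for r in g], [r["label"] for r in g])
        for cohort, g in groups.items()
    }


# Invented scores: the model separates outcomes in the development cohort
# but performs at chance level in the external cohort.
rows = [
    {"cohort": "development", "score": 0.9, "label": 1},
    {"cohort": "development", "score": 0.8, "label": 1},
    {"cohort": "development", "score": 0.3, "label": 0},
    {"cohort": "development", "score": 0.2, "label": 0},
    {"cohort": "external", "score": 0.9, "label": 1},
    {"cohort": "external", "score": 0.4, "label": 0},
    {"cohort": "external", "score": 0.6, "label": 0},
    {"cohort": "external", "score": 0.2, "label": 1},
]
per_cohort = auc_by_cohort(rows)
print(per_cohort)
```

The same grouping can be applied to assay version, ancestry group, disease stage, or care setting. Discrimination is only one facet; calibration and the clinical meaning of the endpoint in the new setting would need separate evaluation.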

Replacing Randomization by Default

RWE may support selected regulatory questions, but it does not replace randomization by default. The evidentiary standard depends on indication, intervention, endpoint, available data, and regulator engagement.

Practice Notes

Write the context of use before reviewing performance. Specify decision, population, endpoint, data source, time window, and acceptable uncertainty.
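A context-of-use statement can be captured as a structured record so that gaps are visible before anyone looks at model performance. This is a sketch of one possible layout, not a regulatory template: the class `ContextOfUse`, the helper `missing_elements`, and the example text are hypothetical, with field names taken from the elements listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ContextOfUse:
    decision: str
    population: str
    endpoint: str
    data_source: str
    time_window: str
    acceptable_uncertainty: str


def missing_elements(cou):
    """Return names of elements left blank, in declaration order."""
    return [f.name for f in fields(cou) if not getattr(cou, f.name).strip()]


# Hypothetical draft with two elements still unspecified.
draft = ContextOfUse(
    decision="Enrich a phase 2 trial for patients at high baseline risk",
    population="Adults with the target indication, previously untreated",
    endpoint="",
    data_source="EHR-derived cohort linked to claims",
    time_window="Index date to 24 months of follow-up",
    acceptable_uncertainty="",
)
gaps = missing_elements(draft)
print(gaps)
```

A check like this only confirms that each element has been written down. Whether the stated endpoint or uncertainty threshold is appropriate remains a scientific and regulatory judgment.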

Separate descriptive, predictive, and causal claims. Do not treat a risk model, association model, and treatment-effect claim as interchangeable.

Audit data provenance. Record source systems, extraction logic, coding systems, missingness, assay versions, linkage quality, and de-identification steps.
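A provenance record can be as light as a dictionary stored alongside each extract. The sketch below is one possible shape under stated assumptions: the function `provenance_record`, its field names, the source-system label, and the query text are hypothetical, and content hashing with SHA-256 stands in for whatever integrity control an organization already uses.

```python
import hashlib
import json


def provenance_record(source_system, extraction_query, coding_systems, rows):
    """Summarize where an extract came from and fingerprint its content."""
    # Assumption: rows are JSON-serializable dicts; keys are sorted so the
    # hash does not depend on key order.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return {
        "source_system": source_system,
        "extraction_query_sha256": hashlib.sha256(
            extraction_query.encode("utf-8")).hexdigest(),
        "coding_systems": sorted(coding_systems),
        "row_count": len(rows),
        "content_sha256": hashlib.sha256(payload).hexdigest(),
    }


# Hypothetical extract and query text.
rows = [{"patient": "p1", "dx": "C34.1"}, {"patient": "p2", "dx": "C50.9"}]
query = "SELECT patient, dx FROM encounters WHERE dx LIKE 'C%'"
record = provenance_record("ehr_warehouse", query, ["ICD-10-CM"], rows)
changed = provenance_record("ehr_warehouse", query, ["ICD-10-CM"], rows[:1])
print(record["row_count"], record["content_sha256"] != changed["content_sha256"])
```

Hashing the extraction logic as well as the content makes it possible to tell whether a changed result came from new data or from a changed query. Missingness summaries, assay versions, linkage quality, and de-identification steps would be added as further fields.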

Validate biomarkers by category. Prognostic, predictive, pharmacodynamic, safety, surrogate, and companion-diagnostic claims need different evidence.

Treat platform claims as diligence leads. Ask for cohort construction, data dictionaries, endpoint adjudication, validation reports, and governance controls.