The Life Sciences AI Handbook: AI for Biomedical Discovery, Biotechnology, and Translational Research

Name: The Life Sciences AI Handbook
Author: Bryan Tegomoh

Tegomoh, Bryan; [Bryan Tegomoh, MD, MPH](https://bryantegomoh.com/)

Small Molecule Generation and ADMET

Published

May 24, 2026

Small molecule AI sits between chemical imagination and experimental attrition. Generating structures is easy compared with generating useful, selective, soluble, safe, and synthesizable compounds.

Learning Objectives

Read molecular generation claims through medicinal chemistry constraints.
Use ADMET and physical plausibility checks early.
Distinguish benchmark gains from lead optimization value.

TL;DR

The useful output is not a molecule that looks novel. The useful output is a prioritized set of compounds with rationale, feasibility, assay plan, and acceptable risk across potency, selectivity, ADMET, and chemistry.

Introduction

MoleculeNet remains a reference point for molecular machine learning benchmarks (Wu et al., 2018). ChEMBL and PubChem provide major public chemical and bioactivity resources (Zdrazil et al., 2024; Kim et al., 2023). PoseBusters shows why docking success judged by geometry (RMSD) or score alone is not enough: predicted poses must also pass physical plausibility checks (Buttenschoen et al., 2024).

Demonstrated

Demonstrated capability includes property prediction, virtual screening support, molecular representation learning, and generative chemistry under constraints. MoleculeNet demonstrated standardized benchmark tasks for molecular machine learning (Wu et al., 2018). PoseBusters demonstrated that AI docking methods need validity checks beyond RMSD (Buttenschoen et al., 2024).
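The gap between RMSD success and physical validity can be made concrete with a toy check. The sketch below is not the PoseBusters implementation; the function names, the bond tolerance, and the clash distance are illustrative assumptions. It only shows that a pose can pass an RMSD cutoff while failing basic geometry tests.

```python
import math

# Illustrative thresholds (assumptions, not PoseBusters defaults).
MIN_NONBONDED_DIST = 2.0  # angstroms; closer ligand-protein pairs count as clashes
BOND_TOLERANCE = 0.25     # allowed fractional deviation from the expected bond length


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(pose) != len(reference):
        raise ValueError("pose and reference must have the same number of atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def bond_length_violations(pose, bonds):
    """Return bonds whose observed length deviates too far from the expected one.

    bonds: list of (atom_i, atom_j, expected_length_in_angstroms).
    """
    bad = []
    for i, j, expected in bonds:
        observed = math.dist(pose[i], pose[j])
        if abs(observed - expected) > BOND_TOLERANCE * expected:
            bad.append((i, j, round(observed, 3)))
    return bad


def clashes(pose, protein_atoms):
    """Return (ligand_index, protein_index) pairs closer than MIN_NONBONDED_DIST."""
    return [
        (li, pi)
        for li, lig in enumerate(pose)
        for pi, prot in enumerate(protein_atoms)
        if math.dist(lig, prot) < MIN_NONBONDED_DIST
    ]


def assess_pose(pose, reference, bonds, protein_atoms, rmsd_cutoff=2.0):
    """Report RMSD success and physical plausibility as separate verdicts."""
    value = rmsd(pose, reference)
    bond_issues = bond_length_violations(pose, bonds)
    clash_issues = clashes(pose, protein_atoms)
    return {
        "rmsd": value,
        "rmsd_ok": value <= rmsd_cutoff,
        "physically_plausible": not bond_issues and not clash_issues,
        "bond_issues": bond_issues,
        "clash_issues": clash_issues,
    }


# Toy three-atom ligand (made-up coordinates): close to the reference by RMSD,
# but one bond is stretched and one atom overlaps a protein atom.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
bonds = [(0, 1, 1.5), (1, 2, 1.5)]
protein_atoms = [(4.5, 0.0, 0.0), (10.0, 10.0, 10.0)]

report = assess_pose(pose, reference, bonds, protein_atoms)
print(report["rmsd_ok"], report["physically_plausible"])
```

Keeping the two verdicts separate is the design point: a single pass/fail number would hide exactly the failure mode that validity checks are meant to expose.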

| Evidence Anchor | What It Supports | Practical Constraint |
| --- | --- | --- |
| MoleculeNet | Benchmarking molecular property models | Benchmark datasets are not medicinal chemistry programs |
| ChEMBL and PubChem | Chemical and bioactivity data sources | Assay context and duplicates require curation |
| PoseBusters | Docking plausibility checks | Physical validity matters beside RMSD |
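The curation constraint on public bioactivity data can be sketched as a small merge step. This is a minimal illustration under assumptions: the record fields (`compound_id`, `target`, `assay_type`, `value`, `unit`, `source`), the example identifiers, and the one-log-unit disagreement threshold are hypothetical and do not reflect the ChEMBL or PubChem schemas.

```python
import math
from collections import defaultdict
from statistics import median

# Conversion factors to molar. Units outside this table are rejected, not guessed.
TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_pic50(value, unit):
    """Convert an IC50 with an explicit unit to pIC50 = -log10(IC50 in molar)."""
    if unit not in TO_MOLAR:
        raise ValueError(f"unrecognized unit: {unit!r}")
    if value <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(value * TO_MOLAR[unit])


def merge_activities(records, max_spread=1.0):
    """Group by (compound, target, assay type) and aggregate duplicate measurements.

    Groups whose pIC50 values span more than max_spread log units are flagged
    for manual review instead of being silently averaged.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["compound_id"], rec["target"], rec["assay_type"])
        groups[key].append((to_pic50(rec["value"], rec["unit"]), rec["source"]))

    merged = []
    for key, entries in groups.items():
        values = [v for v, _ in entries]
        merged.append({
            "key": key,
            "pic50": round(median(values), 2),
            "n": len(values),
            "sources": sorted({s for _, s in entries}),  # keep provenance
            "needs_review": max(values) - min(values) > max_spread,
        })
    return merged


# Made-up records: CPD-1 is the same measurement in two units,
# CPD-2 disagrees by three log units between sources.
records = [
    {"compound_id": "CPD-1", "target": "KINASE_X", "assay_type": "biochemical",
     "value": 100.0, "unit": "nM", "source": "source_a"},
    {"compound_id": "CPD-1", "target": "KINASE_X", "assay_type": "biochemical",
     "value": 0.1, "unit": "uM", "source": "source_b"},
    {"compound_id": "CPD-2", "target": "KINASE_X", "assay_type": "biochemical",
     "value": 10.0, "unit": "nM", "source": "source_a"},
    {"compound_id": "CPD-2", "target": "KINASE_X", "assay_type": "biochemical",
     "value": 10.0, "unit": "uM", "source": "source_b"},
]

merged = {m["key"][0]: m for m in merge_activities(records)}
for compound_id, row in merged.items():
    print(compound_id, row["pic50"], row["needs_review"])
```

Including the assay type in the grouping key keeps biochemical and cellular readouts from being pooled, which is one simple way to respect assay context.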

Theoretical

Theoretical capability includes multi-objective compound design that jointly optimizes potency, selectivity, solubility, permeability, metabolism, toxicity, and synthetic route. Current workflows approximate this with staged filters and expert review.
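The staged-filter approximation can be sketched as hard filters followed by a Pareto (non-dominated) shortlist that goes to expert review. Everything specific here is an assumption for illustration: the property names, the thresholds, the choice of objectives, and the compound values are made up and do not come from any real program.

```python
# Hypothetical predicted properties per compound; all values are made up.
candidates = [
    {"id": "A", "pic50": 8.2, "selectivity_log": 2.1, "logS": -4.0, "tox_risk": 0.10, "synth_steps": 6},
    {"id": "B", "pic50": 9.0, "selectivity_log": 1.0, "logS": -6.5, "tox_risk": 0.15, "synth_steps": 5},
    {"id": "C", "pic50": 7.5, "selectivity_log": 2.5, "logS": -3.5, "tox_risk": 0.05, "synth_steps": 4},
    {"id": "D", "pic50": 7.0, "selectivity_log": 1.5, "logS": -4.2, "tox_risk": 0.20, "synth_steps": 7},
    {"id": "E", "pic50": 8.8, "selectivity_log": 2.0, "logS": -4.5, "tox_risk": 0.45, "synth_steps": 12},
]

# Stage 1: hard filters. Thresholds are illustrative assumptions.
HARD_FILTERS = [
    ("solubility", lambda c: c["logS"] >= -5.0),
    ("toxicity", lambda c: c["tox_risk"] <= 0.30),
    ("synthesis", lambda c: c["synth_steps"] <= 8),
]

# Stage 2: objectives for the Pareto shortlist (+1 maximize, -1 minimize).
OBJECTIVES = [("pic50", +1), ("selectivity_log", +1), ("tox_risk", -1)]


def apply_filters(compounds):
    """Split compounds into survivors and a dict of rejection reasons."""
    kept, rejected = [], {}
    for c in compounds:
        failed = [name for name, passes in HARD_FILTERS if not passes(c)]
        if failed:
            rejected[c["id"]] = failed
        else:
            kept.append(c)
    return kept, rejected


def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    at_least = all(sign * a[k] >= sign * b[k] for k, sign in OBJECTIVES)
    strictly = any(sign * a[k] > sign * b[k] for k, sign in OBJECTIVES)
    return at_least and strictly


def pareto_front(compounds):
    """Keep compounds that no other compound dominates."""
    return [
        c for c in compounds
        if not any(dominates(other, c) for other in compounds if other is not c)
    ]


kept, rejected = apply_filters(candidates)
shortlist = pareto_front(kept)
print("rejected:", rejected)
print("for expert review:", [c["id"] for c in shortlist])
```

Recording rejection reasons instead of discarding compounds silently lets reviewers challenge a filter threshold. The Pareto step avoids collapsing potency, selectivity, and toxicity into one weighted score before a chemist has looked at the trade-offs.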

Beyond Current Capabilities

One-shot generation of clinical candidates from a target name alone remains beyond current capabilities. Biology, chemistry, formulation, toxicology, and clinical pharmacology remain program-level work.

Practice Notes

Use scaffold splits and time splits for virtual screening evaluation; random splits put close analogs on both sides and overstate performance on new chemotypes.
Review synthetic feasibility before celebrating novelty.
Track assay provenance and units when merging activity data.
Keep medicinal chemistry review inside the loop for design decisions.
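The first note can be sketched in a few lines. This is a minimal illustration, assuming each record already carries a precomputed scaffold key (for example a Bemis-Murcko scaffold string from a cheminformatics toolkit) and a measurement date. The field names `scaffold` and `measured_on` and the toy records are hypothetical.

```python
from collections import defaultdict
from datetime import date


def scaffold_split(records, test_fraction=0.2):
    """Assign whole scaffold groups to train or test so no scaffold spans both.

    Largest groups fill the training set first; smaller, rarer scaffolds
    end up in the test set.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["scaffold"]].append(rec)

    train_cap = len(records) - round(test_fraction * len(records))
    train, test = [], []
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


def time_split(records, cutoff):
    """Train on measurements before the cutoff date, test on the rest."""
    train = [r for r in records if r["measured_on"] < cutoff]
    test = [r for r in records if r["measured_on"] >= cutoff]
    return train, test


# Toy records with made-up scaffold keys and dates.
records = [
    {"id": f"cpd{i}", "scaffold": scaffold, "measured_on": measured_on}
    for i, (scaffold, measured_on) in enumerate([
        ("scaf_A", date(2021, 3, 1)), ("scaf_A", date(2021, 6, 1)),
        ("scaf_A", date(2022, 1, 15)), ("scaf_A", date(2023, 2, 1)),
        ("scaf_A", date(2023, 9, 1)),
        ("scaf_B", date(2021, 8, 1)), ("scaf_B", date(2022, 5, 1)),
        ("scaf_B", date(2023, 4, 1)),
        ("scaf_C", date(2022, 11, 1)), ("scaf_D", date(2023, 7, 1)),
    ])
]

s_train, s_test = scaffold_split(records, test_fraction=0.2)
t_train, t_test = time_split(records, cutoff=date(2023, 1, 1))

print(len(s_train), len(s_test), len(t_train), len(t_test))
```

The two splits answer different questions: the scaffold split asks whether a model transfers to unseen chemotypes, while the time split asks whether a model trained on past data would have helped with compounds made later.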