ML Advances for Data-Scarce Drug Discovery Stages

Drug discovery pipelines are notorious for being costly, slow, and failure-prone, which has made AI and machine learning increasingly commonplace as tools to accelerate progress and improve outcomes.

Currently, machine learning in drug discovery centers on data-rich stages, which provide plentiful data for algorithm training. However, parts of the pipeline that generate less data could also benefit from machine learning.

Ahead of the Society for Laboratory Automation and Screening (SLAS) Conference 2026, Technology Networks spoke to Dr. Daniel Reker, an assistant professor of biomedical engineering at Duke University, about his work on pairwise molecular learning, which enables better computational decision-making in data-scarce scenarios.

In this interview, Reker discusses how pairwise molecular learning opens up new avenues in drug discovery, including for first-in-class drug candidates, and explores what happens when machine learning is integrated into automated laboratories.

Katie Brighton (KB): How would you describe the role machine learning plays in modern drug discovery today, and where does it still fall short?

Dr. Daniel Reker (DR): Machine learning is actively reshaping drug discovery across multiple stages of the pipeline, and we see widespread adoption from pharma and biotech as well as interest from tech companies and numerous startups. The majority of these efforts currently focus on target identification, lead generation, and clinical trials. While it is still too early for definitive assessments, early readouts suggest computational approaches have accelerated timelines and modestly improved success rates, which could be significant given how costly, slow, and failure-prone drug discovery is.

However, the current impact of machine learning concentrates heavily on data-rich stages that leverage high-throughput screening, genomics, and large-scale clinical datasets to enable training and fine-tuning of complex algorithms.

Substantial progress can still be made in addressing data-scarce drug discovery challenges like lead optimization, safety, and formulation development. These stages rely on low-throughput experiments, such as complex synthesis, material characterization, and in vivo animal studies, but they represent critical decision points that determine the fate of drug candidates.

Innovations in novel experimental platforms and robust computational algorithms are poised to improve these decisions, with potentially even stronger reductions in cost and failure rates than we have seen so far, ultimately positioning the community to bring more and better therapies to patients.

KB: Could you explain a little more about what pairwise molecular learning is?

DR: Pairwise molecular learning transforms the standard machine learning task into a contrastive problem where the algorithm directly compares two molecules rather than evaluating each one independently.

Essentially, instead of asking the computer, “What is the potency of molecule A?” we transform the question to “Which of these two molecules is more potent?” This enables combinatorial data augmentation, creating millions of molecular comparisons from just hundreds to thousands of original datapoints. In simple terms, we give deep neural networks different perspectives on the same underlying data to improve training efficiency.

This allows us to train state-of-the-art deep learning architectures on datasets of as few as 100–1000 compounds, which is where a lot of the real-world pharmaceutical decision-making around critical properties like drug safety, metabolism, and pharmacokinetics happens. These properties are expensive to measure experimentally but essential for advancing the best candidates. We believe pairwise learning will enable the community to unleash the predictive power of deep neural networks for these data-scarce but high-value decision points.
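
The combinatorial augmentation described above can be illustrated with a minimal Python sketch. It is not the interviewee's implementation: the function names `make_pairs` and `n_ordered_pairs`, the compound names, and the potency values are all hypothetical, and a real pipeline would pair learned molecular representations rather than names.

```python
from itertools import permutations


def make_pairs(compounds):
    """Build every ordered pair (A, B) and label it with potency(B) - potency(A).

    `compounds` is a list of (name, potency) tuples; the names and values
    used below are made up for illustration.
    """
    pairs = []
    for (name_a, y_a), (name_b, y_b) in permutations(compounds, 2):
        pairs.append((name_a, name_b, y_b - y_a))
    return pairs


def n_ordered_pairs(n):
    """Number of ordered pairs of distinct compounds in a dataset of size n."""
    return n * (n - 1)


# Hypothetical toy dataset: three compounds with pIC50-like potency values.
compounds = [("mol_A", 6.1), ("mol_B", 7.4), ("mol_C", 5.0)]
pairs = make_pairs(compounds)

for name_a, name_b, delta in pairs:
    winner = name_b if delta > 0 else name_a
    print(f"{name_a} vs {name_b}: {winner} is more potent")

print(n_ordered_pairs(1000))  # 999000 comparisons from 1,000 compounds
```

Under this ordered-pair assumption, 1,000 measured compounds yield close to a million training comparisons and 2,000 compounds yield roughly four million, which is consistent with the scale mentioned in the interview.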

KB: What kind of avenues in drug discovery does pairwise molecular learning open up?

DR: Pairwise molecular learning opens several exciting avenues in drug discovery. First, it enables more accurate computational molecular optimization by directly predicting which chemical modifications will improve critical drug properties like safety, metabolism, and potency. This helps medicinal chemists prioritize which compounds to synthesize next, saving time and resources.

Second, this pairwise augmentation approach enables better computational decision-making in data-scarce scenarios. This is particularly valuable for properties like drug safety, metabolism, and formulations, which are critical decision points where experimental data is limited and expensive to generate.

It may also improve predictive performance on novel and challenging drug targets where little data has been collected so far, thereby providing an opportunity for machine learning to better support the identification of first-in-class therapies. This capability is further strengthened algorithmically by pairwise learning’s ability to incorporate bounded or incompletely characterized datapoints that are usually discarded from modeling efforts. While insufficiently characterized for direct inclusion in traditional models, these datapoints still provide important perspectives and contrast to stronger candidates.
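
One way to picture how a bounded datapoint can still contribute to a comparison is sketched below. This is an illustrative assumption rather than the method described in the interview: each measurement is treated as an interval on a hypothetical pIC50-like scale (higher means more potent), and the function `compare` and all values are invented for the example.

```python
import math


def compare(a, b):
    """Compare two measurements given as (low, high) intervals.

    Returns +1 if b is certainly more potent than a, -1 if b is certainly
    less potent, and None if the intervals overlap and the pair is ambiguous.
    """
    a_low, a_high = a
    b_low, b_high = b
    if b_low > a_high:
        return 1
    if b_high < a_low:
        return -1
    return None


# Hypothetical measurements.
exact = (7.2, 7.2)            # fully characterized compound
weak = (-math.inf, 5.0)       # bounded readout, e.g. "potency below 5.0"
also_weak = (-math.inf, 6.0)  # a second bounded readout

print(compare(weak, exact))      # 1: the exact compound is certainly more potent
print(compare(exact, weak))      # -1: same pair, opposite direction
print(compare(weak, also_weak))  # None: two bounded values cannot be ranked
```

In this sketch, a bounded readout has no usable regression target on its own, yet it still yields a definite label whenever it is paired with a compound that lies clearly outside its bound; only the ambiguous pairs are dropped.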

Third, our data suggests the algorithm excels at identifying genuinely novel molecules. By learning the impact of molecular modifications rather than simply identifying analogues of known compounds, it avoids the memorization problem common in complex algorithms and pushes the algorithm to focus its learning on relationships and patterns. In our proof-of-concept data, this enables more drastic structural changes during optimization, with strong potential to further improve the safety and efficacy of drug candidates.

KB: What are the biggest gains you’ve seen from combining machine learning with automated labs, and where are the remaining bottlenecks?

DR: The biggest gains I’ve seen from combining machine learning with automated labs stem from creating truly adaptive experimental design loops. In the machine learning community, we call these “active learning workflows” to indicate that the predictive algorithm is directly involved in the data acquisition and can request the most informative and valuable datapoints. Our work and that of others has shown that such “active learning” setups can potentially reduce the data required for decision-making by up to 90% and enable better predictive models by directly addressing biases in the data. These setups have helped us identify new drug candidates using fewer datapoints and identify, with greater accuracy, new nanoparticle formulations that enhance the efficacy and safety of medicines.
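
The adaptive loop described above can be sketched as a pool-based active learning cycle. The example below is a deliberately simplified illustration and makes several assumptions: candidates are single numbers, the hypothetical `oracle` function stands in for running an experiment, and "most informative" is approximated by distance to the nearest already-measured candidate rather than by a trained model's uncertainty.

```python
def nearest_distance(x, measured):
    """Distance from candidate x to its closest already-measured candidate."""
    return min(abs(x - m) for m in measured)


def active_learning_loop(pool, oracle, n_initial=2, n_queries=2):
    """Pick experiments one at a time instead of measuring the whole pool.

    Starts from the first `n_initial` candidates, then repeatedly requests
    the unmeasured candidate that is farthest from everything measured so
    far (a simple stand-in for model uncertainty).
    """
    measured = {x: oracle(x) for x in pool[:n_initial]}
    unmeasured = list(pool[n_initial:])
    for _ in range(n_queries):
        if not unmeasured:
            break
        query = max(unmeasured, key=lambda x: nearest_distance(x, measured))
        measured[query] = oracle(query)  # "run the experiment" on request
        unmeasured.remove(query)
    return measured


# Hypothetical candidate pool and a made-up response function.
pool = [0, 1, 2, 5, 9, 10]
results = active_learning_loop(pool, oracle=lambda x: x * x)
print(list(results))  # [0, 1, 10, 5]
```

Here the loop measures four of the six candidates and chooses the two extra experiments where existing data is sparsest. A real workflow would replace the distance heuristic with a predictive model's own uncertainty or expected improvement and would retrain that model after each new result.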

A major remaining bottleneck in the deployment of such feedback loops centers around automation infrastructure and algorithmic robustness. Most high-throughput screening platforms are optimized for scale at the cost of flexibility. For example, they rely on rapidly screening pre-defined compound libraries rather than enabling adaptive cherry-picking of individual experiments suggested by algorithms. Additionally, several of the critical experiments, such as material characterization or even in vivo studies, are difficult to integrate into these automated workflows.

We believe these feedback cycles are most impactful in truly low-data scenarios, like early-stage projects with under 100 datapoints. But building predictive models and enabling them to decide which datapoint to acquire next remains challenging even for the most data-efficient computational approaches. We’re addressing this through pairwise learning methods as well as other new active learning developments, including yoked learning, where algorithms are paired to work together. There’s substantial room for further innovation in automation architecture and experimental design strategies to maximize the impact of integrated laboratories on drug discovery.

KB: Is there anything you can tease about your talk at SLAS 2026?

DR: I’m really excited for what promises to be a stimulating SLAS 2026. There will be a lot of great presentations and discussions around the intersection of automation and AI in drug discovery.

For my talk specifically, I’ll be introducing some of these pairwise and active learning concepts we have been developing, including some new and unpublished developments that I think the community will find intriguing. One highlight is a novel class of algorithms that actually “forget data” strategically to enhance their learning. It seems counterintuitive, but we’re seeing some remarkable improvements in how quickly these models converge to better solutions.

I’ll be building in concrete examples from our work in drug discovery and nanoparticle design to showcase the practical potential of these algorithms. The goal is to demonstrate how adaptive machine learning can bring better decision-making to every stage of drug development, from early hit identification through formulation optimization.

I’m looking forward to connecting with potential partners and collaborators who are interested in deploying these approaches in their own pipelines. The real breakthroughs will come from getting these tools into the hands of more research teams across academia and industry.

By News Center