Drug discovery can be likened to playing a game of molecular Tetris, where chemists fit together atoms, experimenting with different configurations until they create a promising new medicine. Traditionally, this process is both time-consuming and costly.
In a groundbreaking study, researchers harnessed machine learning to build a system that predicts the outcomes of chemical reactions more efficiently. The approach aims to speed up drug development while significantly reducing expenses.
“We often turn to advanced, physics-based computational chemistry tools to understand novel reactions. However, these tools can be prohibitively expensive when predicting outcomes for numerous potential new molecules,” explained Simone Gallarati, the study’s co-lead author and a joint postdoctoral researcher at the University of Utah and the University of California, Los Angeles. “Our goal was to create statistical models that were sophisticated enough to provide accurate predictions on untested reactions, yet cost-effective.”

Many molecules exist in two mirror-image forms, a property known as chirality, or “handedness.” The distinction between the left- and right-handed variants is critical, as one form may be therapeutic while the other could be harmful. Chemists must find the right combination of components (catalysts, ligands, and substrates) to construct the desired version.
This new system serves as a sophisticated filter capable of analyzing tens of thousands of chemical structures. It predicts which combinations of components will favor one “hand” of the molecule over the other. The workflow also provides an economical way to convert the components of a reaction into numerical data for machine learning analysis.
Remarkably, the model reliably forecast how components would behave from only a small amount of training data, significantly reducing the time, energy, and costs typically associated with laboratory testing.
“Most AI tools require extensive datasets for model training, which poses a challenge in chemistry due to the high costs and time involved in obtaining quality data from experiments,” stated Matthew Sigman, chemist at the University of Utah and coauthor of the study. “The most impressive aspect of this tool is its ability to allow researchers to gather smaller data sets, create reasonably effective models, and generate accurate predictions for known reactions, while also applying these predictions to new reactions it hasn’t encountered previously.”
The findings were published in an accelerated preview in the journal Nature on February 11, 2026.
High-tech filter

The researchers focused their workflow on asymmetric cross-coupling reactions, a powerful tool for drug development. These reactions join two carbon-based molecular fragments using a metal catalyst to create more complex compounds. They are termed asymmetric because they are designed to favor one handed form of the molecule over its mirror image. Without such guidance, a reaction typically produces both forms in a 50/50 mixture. Asymmetric reactions, however, can yield as much as 95% of the desired form with only 5% of the unwanted mirror image.
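Chemists usually summarize a split like 95/5 with a single number, the enantiomeric excess. The article does not name this measure, so the sketch below is only an illustration of the arithmetic. The second function uses the textbook relation between a product ratio and the free-energy difference that produces it, ΔΔG‡ = RT ln(ratio). The function names and the room-temperature default of 298.15 K are assumptions made for this example, not details from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def enantiomeric_excess(major_pct, minor_pct):
    """Percent excess of the desired mirror-image form over the unwanted one."""
    return 100.0 * (major_pct - minor_pct) / (major_pct + minor_pct)


def selectivity_energy_kcal(major_pct, minor_pct, temperature_k=298.15):
    """Free-energy difference (kcal/mol) implied by a product ratio.

    Assumes the ratio is set purely by the difference between the two
    competing pathways at the given temperature (an idealization).
    """
    return R_KCAL * temperature_k * math.log(major_pct / minor_pct)


# The 95/5 and 50/50 cases mentioned in the text.
ee_selective = enantiomeric_excess(95, 5)
ee_unguided = enantiomeric_excess(50, 50)
ddg_selective = selectivity_energy_kcal(95, 5)

print(f"95/5 split: {ee_selective:.0f}% excess, {ddg_selective:.2f} kcal/mol")
print(f"50/50 split: {ee_unguided:.0f}% excess")
```

Under these assumptions, a 95/5 split corresponds to a 90% excess and an energy difference of roughly 1.7 kcal/mol, while an unguided 50/50 mixture has no excess at all. The small size of that energy gap is one reason selectivity is hard to predict accurately.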
Asymmetric cross-coupling reactions commonly require at least three components: a metal, a ligand, and substrates. The metal catalyst facilitates the joining of the carbon-based fragments. The ligand binds to the metal and controls which side of the molecule reacts, shaping the product's three-dimensional structure and, with it, its handedness.
To train their model, Gallarati and the research team identified four academic papers on asymmetric reactions, including past work by coauthors Abigail Doyle and Sigman, that employed nickel-based catalysts with varying ligands. These findings provided the sole training data for their workflow. Subsequently, they queried the system to predict the outcomes of hypothetical components not included in the training set. By introducing progressively more difficult tasks, they compelled the algorithm to make predictions on materials that diverged increasingly from the original training data. The predictions were then tested experimentally in the Doyle lab, in work led by Erin Bucci, a co-lead author and doctoral student at UCLA.
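Asking a model about components it has never seen can be mimicked when validating any such model by holding out every reaction that uses a given component. The sketch below shows one simple version of that idea, a leave-one-ligand-out split. It is not the authors' protocol; the function name, the record layout, and the toy records are all hypothetical and exist only to make the example runnable.

```python
from collections import defaultdict


def leave_one_ligand_out(records):
    """Yield (held_out_ligand, train, test) for each ligand in turn.

    Every reaction that uses the held-out ligand goes into the test set,
    so the training set contains no examples of that ligand.
    """
    by_ligand = defaultdict(list)
    for rec in records:
        by_ligand[rec["ligand"]].append(rec)
    for ligand in sorted(by_ligand):
        train = [r for r in records if r["ligand"] != ligand]
        yield ligand, train, by_ligand[ligand]


# Made-up placeholder records for illustration only (not data from the study).
records = [
    {"ligand": "L1", "substrate": "S1"},
    {"ligand": "L1", "substrate": "S2"},
    {"ligand": "L2", "substrate": "S1"},
    {"ligand": "L3", "substrate": "S2"},
]

splits = list(leave_one_ligand_out(records))
for ligand, train, test in splits:
    print(f"held out {ligand}: {len(train)} training, {len(test)} test reactions")
```

Holding out whole ligands, rather than random reactions, makes the test harder in the way the paragraph above describes: the model is scored only on components absent from its training data.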

“As a chemist working in the lab, this tool is invaluable for minimizing the time spent on experiments,” Bucci remarked. “For instance, instead of conducting 50 to 60 reactions, we now only need to run 5 to 10, potentially saving weeks or even months. Every component tested in the lab requires either purchasing or synthesizing from scratch, so this tool dramatically reduces my typical material costs.”
While the authors evaluated the tool within the context of new nickel-based reactions, the workflow has broader applications and can enhance our understanding of chemistry itself.
“One of the benefits of this workflow is that it isn’t a black box,” noted Abigail Doyle, chemist at UCLA and coauthor of the study. “We can derive insights about the chemistry from the predictions, even if they’re not entirely accurate. Our chemistry expertise helps us uncover information that we wouldn’t have accessed without this tool.”
The pharmaceutical industry stands to gain immensely from such a tool, Sigman added. Consider a company that needs to produce a significant quantity of a compound for a clinical trial and intends to apply a reaction already documented in the literature, but not for its specific target compound.
“This is where our tool could have a huge impact,” he explained. “Optimizing both reactions and the associated costs is critical when developing drugs. This streamlined process could drastically influence the transition of a molecule from phase one to phase two.”