During a chemical reaction, molecules gain energy until they reach what is known as a transition state—a point of no return from which the reaction must proceed. This state is so fleeting that it is almost impossible to observe it experimentally.
The structures of these transition states can be calculated using techniques based on quantum chemistry, but this process is extremely time-consuming. A team of MIT researchers has now developed an alternative approach, based on machine learning, that can calculate these structures much faster — within seconds.
Their new model could be used to help chemists design new reactions and catalysts to create useful products such as fuels or medicines, or to model natural chemical reactions such as those that could have contributed to the evolution of life on Earth.
“Knowing what the structure of the transition state is is really important as a starting point for thinking about designing catalysts or understanding how natural systems carry out certain transformations,” says Heather Kulik, associate professor of chemistry and chemical engineering at MIT and the senior author of the study.
Chenru Duan PhD ’22 is the lead author of a paper describing the project, which appears today in Nature Computational Science. Cornell University graduate student Yuanqi Du and MIT graduate student Haojun Jia are also authors of the paper.
Fleeting transitions
For any chemical reaction to occur, it must pass through a transition state, which occurs when the energy threshold required for the reaction to proceed is reached. The probability that any chemical reaction will occur is determined in part by how likely the transition state is to form.
“The transition state helps determine the likelihood that a chemical transformation will occur. If we have a lot of stuff we don’t want, like carbon dioxide, and we’d like to turn it into a useful fuel like methanol, the transition state and how favorable that is determines how likely we are to get from reactant to product,” says Kulik.
Chemists can calculate transition states using a quantum chemistry method known as density functional theory. However, this method requires a huge amount of computing power, and it can take many hours or even days to calculate just one transition state.
Recently, some researchers have tried to use machine learning models to discover transition state structures. However, the models developed so far require considering two reactants as a single entity in which the reactants maintain the same orientation with respect to each other. Any other possible orientations must be modeled as separate reactions, which adds to computational time.
“If the molecules of the reactants are rotated, then in principle, before and after this rotation they can undergo the same chemical reaction. But in the traditional machine learning approach, the model will see them as two different reactions. This makes machine learning training much more difficult, as well as less accurate,” says Duan.
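The point about rotation can be illustrated with a small sketch. It uses hypothetical coordinates for a three-atom fragment (not taken from the study) and is not the researchers' method: it only shows that rotating a molecule changes its raw coordinates while leaving every interatomic distance unchanged, which is why a model that reads raw coordinates can mistake one reaction for two.

```python
import math

def rotate_z(points, angle_rad):
    """Rotate a list of (x, y, z) points about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def pairwise_distances(points):
    """Return all interatomic distances in a fixed pair order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]

# Hypothetical coordinates in angstroms (illustrative only, not from the study).
molecule = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
rotated = rotate_z(molecule, math.radians(50))

# Largest change in any raw coordinate after the rotation.
raw_change = max(
    abs(a - b) for p, q in zip(molecule, rotated) for a, b in zip(p, q)
)

# Largest change in any interatomic distance after the rotation.
dist_change = max(
    abs(a - b)
    for a, b in zip(pairwise_distances(molecule), pairwise_distances(rotated))
)

print(f"largest coordinate change: {raw_change:.3f}")
print(f"largest distance change:   {dist_change:.2e}")
```

The coordinates shift noticeably, while the distances agree to within floating-point rounding. The rotated copy is chemically the same molecule, so a useful model should treat both as one input.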
The MIT team developed a new computational approach that allowed them to represent two reactants in any arbitrary orientation with respect to each other. It uses a type of model known as a diffusion model, which can learn which types of processes are most likely to produce a particular outcome. As training data for their model, the researchers used reactant, product and transition state structures calculated using quantum chemistry methods for 9,000 different chemical reactions.
“Once the model learns the underlying distribution of how these three structures coexist, we can give it new reactants and products, and it will try to create a transition state structure that pairs with those reactants and products,” says Duan.
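For readers unfamiliar with diffusion models, the sketch below shows the standard forward "noising" step used in the widely known denoising-diffusion formulation, applied to hypothetical atomic coordinates. The article does not describe the team's exact formulation, so the function name `forward_noise`, the parameter `alpha_bar` and the coordinates are all assumptions made for illustration. A trained diffusion model learns to reverse this corruption, turning noise back into plausible structures.

```python
import math
import random

def forward_noise(coords, alpha_bar, rng):
    """Blend clean coordinates with Gaussian noise (generic forward diffusion step).

    alpha_bar = 1.0 keeps the structure untouched; alpha_bar = 0.0 replaces it
    with pure noise. Values in between give partially corrupted structures.
    """
    keep = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in atom)
        for atom in coords
    ]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical three-atom structure in angstroms (illustrative only).
clean = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.0, 1.1, 0.0)]

untouched = forward_noise(clean, alpha_bar=1.0, rng=rng)       # no noise added
partly_noisy = forward_noise(clean, alpha_bar=0.5, rng=rng)    # half signal, half noise
```

Training would consist of showing a network many such corrupted structures and asking it to recover the clean ones. That part is omitted here.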
The researchers tested their model on about 1,000 reactions it had never seen before, asking it to generate 40 possible solutions for each transition state. They then used a “confidence model” to predict which of these states were most likely to occur. These solutions were accurate to within 0.08 angstroms (an angstrom is one hundred-millionth of a centimeter) compared to transition state structures generated using quantum techniques. The entire computational process takes only a few seconds for each reaction.
“You can imagine that it really scales to thinking about generating thousands of transition states in the time it would normally take you to generate just a handful using the conventional method,” says Kulik.
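The generate-then-rank procedure can be sketched in a few lines. Everything below is hypothetical: the coordinates, the confidence scores and the use of root-mean-square deviation (RMSD) as the error measure are assumptions, since the article does not say how the 0.08 angstrom figure was computed. The structures are also assumed to be already aligned, with atoms listed in the same order.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned structures
    that list the same atoms in the same order."""
    if len(a) != len(b):
        raise ValueError("structures must contain the same number of atoms")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))

# Hypothetical reference transition state in angstroms (stand-in for a quantum result).
reference = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.0, 1.1, 0.0)]

# Hypothetical candidates with made-up confidence scores. In the study, a learned
# confidence model would supply the scores, and 40 candidates would be generated.
candidates = [
    {"coords": [(0.0, 0.0, 0.0), (1.26, 0.0, 0.0), (0.0, 1.1, 0.0)], "confidence": 0.91},
    {"coords": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.4, 0.0)], "confidence": 0.40},
    {"coords": [(0.1, 0.0, 0.0), (1.2, 0.3, 0.0), (0.0, 1.1, 0.2)], "confidence": 0.65},
]

# Keep the candidate the confidence scores rank highest, then measure its error.
best = max(candidates, key=lambda c: c["confidence"])
error = rmsd(best["coords"], reference)

print(f"selected candidate error: {error:.3f} angstroms")
```

The ranking step never looks at the reference structure. The reference is used only afterward, to check how close the selected candidate came.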
Modeling reactions
Although the researchers trained their model primarily on reactions involving compounds with a relatively small number of atoms—up to 23 atoms for the entire system—they found that it could also make accurate predictions for reactions involving larger molecules.
“Even if you look at larger systems or enzyme-catalyzed systems, you get pretty good coverage of the different types of ways that atoms are most likely to rearrange,” says Kulik.
The researchers now plan to expand their model to incorporate other components, such as catalysts, which could help them investigate how much a particular catalyst would speed up a reaction. This could be useful for developing new processes for the production of pharmaceuticals, fuels or other useful compounds, especially when the synthesis involves several chemical steps.
“Traditionally all these calculations are done with quantum chemistry, and now we are able to replace the quantum chemistry part with this fast generative model,” says Duan.
Another possible application of this kind of model is to investigate the interactions that may occur between gases found on other planets, or to model the simple reactions that may have occurred during the early evolution of life on Earth, the researchers say.
The new method represents “an important step forward in predicting chemical reactivity,” says Jan Halborg Jensen, a professor of chemistry at the University of Copenhagen, who was not involved in the research.
“Finding the transition state of a reaction and the associated barrier is the key step in predicting chemical reactivity, but also one of the most difficult tasks to automate,” he says. “This problem holds back many important fields, such as computational catalysis and reaction discovery, and this is the first paper I’ve seen that could remove this bottleneck.”
The research was funded by the US Office of Naval Research and the National Science Foundation.