Successful control of nuclear fusion plasma in a tokamak with deep reinforcement learning
To solve the global energy crisis, researchers have long sought a source of clean, limitless energy. Nuclear fusion, the reaction that powers the stars of the universe, is one candidate. By smashing and fusing hydrogen, a common element of seawater, this powerful process releases vast amounts of energy. Here on Earth, one way scientists have recreated these extreme conditions is with a tokamak, a doughnut-shaped vacuum vessel surrounded by magnetic coils, which is used to contain a hydrogen plasma that is hotter than the core of the Sun.
However, the plasmas in these machines are inherently unstable, which makes sustaining the process required for nuclear fusion a complex challenge. For example, a control system must coordinate the tokamak's many magnetic coils and adjust the voltage on them thousands of times per second to ensure that the plasma never touches the walls of the vessel, which would result in heat loss and possibly damage.
To help solve this problem, and as part of DeepMind's mission to advance science, we have partnered with the Swiss Plasma Center at EPFL to develop the first deep reinforcement learning (RL) system that autonomously discovers how to control these coils and successfully contain the plasma in a tokamak, opening new avenues for nuclear fusion research.
In a paper published today in Nature, we describe how we can successfully control nuclear fusion plasma by building and running controllers at the Variable Configuration Tokamak (TCV) in Lausanne, Switzerland. Using a learning architecture that combines deep RL and a simulated environment, we created controllers that can hold the plasma steady and be used to precisely sculpt it into different shapes. This “plasma sculpture” shows that the RL system has successfully controlled superheated matter and – importantly – allows scientists to investigate how plasma reacts under different conditions, improving our understanding of fusion reactors.
Over the past two years, DeepMind has demonstrated the potential of artificial intelligence to accelerate scientific progress and unlock entirely new avenues of research in biology, chemistry, mathematics, and now physics.
Demis Hassabis, Co-Founder and CEO of DeepMind
This work is another powerful example of how the machine learning and expert communities can come together to tackle grand challenges and accelerate scientific discovery. Our team is hard at work applying this approach to fields as diverse as quantum chemistry, pure mathematics, materials design, weather forecasting, and more, to solve fundamental problems and ensure that AI benefits humanity.
Photos of the Variable Configuration Tokamak (TCV) at EPFL from outside (left, credit: SPC/EPFL) and inside (right, credit: Alain Herzog / EPFL), and a 3D model of the TCV with its vessel and control coils (center, credit: DeepMind and SPC/EPFL)
Learning when data is hard to come by
Nuclear fusion research is currently limited by the ability of researchers to conduct experiments. While there are dozens of active tokamaks around the world, they are expensive machines and in high demand. For example, the TCV can only sustain the plasma in a single experiment for up to three seconds, after which it needs 15 minutes to cool down and reset before the next attempt. Not only that, several research groups often share the use of the tokamak, further limiting the time available for experiments.
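To make the scarcity of experimental data concrete, the short calculation below works through the figures quoted above: up to three seconds of plasma per experiment and 15 minutes of reset time. The eight-hour session length is an assumption chosen purely for illustration, and the function name is hypothetical.

```python
PULSE_SECONDS = 3.0        # from the article: up to three seconds of plasma per experiment
RESET_SECONDS = 15 * 60.0  # from the article: 15 minutes to cool down and reset


def plasma_seconds(session_hours):
    """Return (number of shots, total seconds of plasma) for one session.

    Assumes every shot runs for the full three seconds and is always
    followed by a full reset, which is a simplification.
    """
    cycle = PULSE_SECONDS + RESET_SECONDS
    shots = int(session_hours * 3600 // cycle)
    return shots, shots * PULSE_SECONDS


# Hypothetical eight-hour session with exclusive access to the machine.
shots, seconds = plasma_seconds(8.0)
print(shots, seconds)  # 31 93.0
```

Under these assumptions, a full working day of exclusive access yields only about a minute and a half of plasma, which is why learning directly on the machine is impractical.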
Given the current barriers to tokamak access, researchers have turned to simulators to help advance research. For example, our collaborators at EPFL have built a powerful set of simulation tools that model the dynamics of tokamaks. We used these to let our RL system learn to control the TCV in simulation, and then validated the results on the real TCV, showing that we could successfully sculpt the plasma into the desired shapes. While this is a cheaper and more convenient way of training our controllers, we still had many hurdles to overcome. For example, plasma simulators are slow and require many hours of computer time to simulate one second of real time. Furthermore, the condition of the TCV can change from day to day, requiring us to develop algorithmic improvements, both physical and simulated, and to adapt to the realities of the hardware.
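The workflow described here, tuning a controller in simulation and then checking it on a system that differs slightly from the simulator, can be illustrated with a deliberately tiny toy. This is not the method used in the work: it replaces deep RL with a plain grid search over a single feedback gain, and the one-dimensional "plasma position" model, the growth rates, and all names are hypothetical.

```python
def rollout(gain, growth, steps=200, dt=0.01, x0=0.1):
    """Simulate an unstable 1-D position x under feedback u = -gain * x.

    Returns the mean absolute displacement over the rollout (lower is better).
    """
    x, total = x0, 0.0
    for _ in range(steps):
        u = -gain * x
        x += dt * (growth * x + u)  # explicit Euler step of dx/dt = growth*x + u
        total += abs(x)
    return total / steps


SIM_GROWTH = 5.0   # hypothetical instability rate assumed by the simulator
REAL_GROWTH = 6.0  # hypothetical "real machine" that differs from the simulator

# "Training": pick the gain that performs best in simulation only.
candidate_gains = [0.0, 2.0, 4.0, 8.0, 16.0, 32.0]
best_gain = min(candidate_gains, key=lambda g: rollout(g, SIM_GROWTH))

# "Validation": run the chosen controller once on the mismatched system.
sim_error = rollout(best_gain, SIM_GROWTH)
real_error = rollout(best_gain, REAL_GROWTH)
print(best_gain, real_error < rollout(0.0, REAL_GROWTH))
```

The point of the toy is only the structure: all of the search happens against the cheap model, and the scarce real system is used for a single confirming run.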
Success by prioritizing simplicity and flexibility
Existing plasma control systems are complex and require separate controllers for each of the TCV’s 19 magnetic coils. Each controller uses algorithms to estimate the properties of the plasma in real time and adjust the voltage of the magnets accordingly. Instead, our architecture uses a single neural network to control all coils simultaneously, automatically learning which voltages are best to achieve a plasma configuration directly from sensors.
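As a rough sketch of what "a single neural network controlling all coils" means structurally, the example below maps one vector of sensor readings to 19 outputs, one per coil. Only the coil count comes from the text. The sensor count, layer size, random untrained weights, and normalised output range are all assumptions, and the real controller's architecture is not described in this article.

```python
import math
import random

N_COILS = 19    # from the article: the TCV has 19 magnetic coils
N_SENSORS = 32  # hypothetical; the article does not state a sensor count
HIDDEN = 16     # hypothetical hidden-layer width


def init_layer(rng, n_in, n_out):
    """Create an n_out x n_in weight matrix with small random values."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense_tanh(weights, inputs):
    """Apply one fully connected layer followed by tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def coil_commands(layers, sensor_readings):
    """Map raw sensor readings to one normalised command per coil."""
    activations = sensor_readings
    for weights in layers:
        activations = dense_tanh(weights, activations)
    return activations


rng = random.Random(0)
layers = [init_layer(rng, N_SENSORS, HIDDEN), init_layer(rng, HIDDEN, N_COILS)]
readings = [rng.uniform(-1.0, 1.0) for _ in range(N_SENSORS)]  # stand-in sensor data
commands = coil_commands(layers, readings)
print(len(commands))  # 19
```

The design contrast with the conventional approach is that every coil command is computed from the same shared representation of the sensors, rather than by a separate controller per coil.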
As a demonstration, we first showed that we could handle many aspects of the plasma with a single controller.
The controller trained with deep reinforcement learning directs the plasma in multiple phases of an experiment. On the left, there is an inside view of the tokamak during the experiment. On the right, you can see the reconstructed plasma shape and the target points we wanted to hit. (credit: DeepMind & SPC/EPFL)
In the video above, we see the plasma at the top of the TCV at the moment our system takes control. Our controller first shapes the plasma into the requested shape, then shifts the plasma downward and detaches it from the walls, suspending it in the middle of the vessel on two legs. The plasma is held stationary, as would be needed to measure plasma properties. Finally, the plasma is steered back to the top of the vessel and safely terminated.
We then created a range of plasma shapes that plasma physicists are studying for their usefulness in generating energy. For example, we made a “snowflake” shape with several “legs” that could help reduce the cost of cooling by spreading the exhaust energy across different contact points on the vessel walls. We also demonstrated a shape close to the one proposed for ITER, the next-generation tokamak under construction, as EPFL was conducting experiments to predict the behavior of plasma in ITER. We even did something that had never been done before in the TCV by stabilizing a “droplet”, a configuration in which two plasmas are inside the vessel at the same time. Our single system was able to find controllers for all of these different conditions. We simply changed the target we requested, and our algorithm autonomously found a suitable controller.
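One way to picture "changing the target we requested" is as a reward that scores how closely the plasma boundary matches a list of target points, so that a new shape only requires a new list. The function below is a minimal sketch under that assumption. The exponential scoring, the point coordinates, and the function name are hypothetical and are not taken from the published system.

```python
import math


def shape_reward(boundary_points, target_points):
    """Score how well the plasma boundary matches the requested targets.

    Both arguments are equal-length lists of (r, z) coordinates.
    Returns a value in (0, 1], where 1 means every target is hit exactly.
    """
    if not target_points or len(boundary_points) != len(target_points):
        raise ValueError("need matching, non-empty point lists")
    errors = [math.dist(b, t) for b, t in zip(boundary_points, target_points)]
    mean_error = sum(errors) / len(errors)
    return math.exp(-mean_error)


# Hypothetical targets for one shape; a different shape is just a different list.
targets = [(0.8, 0.2), (0.9, 0.0), (0.8, -0.2)]
on_target = shape_reward(targets, targets)
shifted = shape_reward([(r + 0.1, z) for r, z in targets], targets)
print(on_target > shifted)  # True
```

With a reward of this kind, the learning algorithm itself stays unchanged between shapes, and only the target list passed to it differs.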
We have successfully constructed a series of shapes whose properties are being studied by plasma physicists. (credit: DeepMind & SPC/EPFL)
The future of fusion and beyond
As with the progress we have seen in applying artificial intelligence to other scientific fields, our successful demonstration of tokamak control shows the power of AI to accelerate and assist fusion science, and we expect its use to grow more sophisticated in the future. This ability to autonomously generate controllers could be used to design new kinds of tokamaks while simultaneously designing their controllers. Our work also points to a bright future for reinforcement learning in the control of complex machines. It is especially exciting to consider fields where AI could augment human expertise, serving as a tool to discover new and creative approaches to hard real-world problems. We predict that reinforcement learning will be a transformative technology for industrial and scientific control applications in the coming years, with applications ranging from energy efficiency to personalized medicine.