With the proliferation of computationally intensive machine learning applications, such as chatbots that perform language translation in real-time, device manufacturers often incorporate specialized hardware components to quickly move and process the vast amounts of data these systems require.
Choosing the best design for these components, known as deep neural network accelerators, is challenging because they can have a huge range of design options. This difficult problem becomes even more thorny when a designer tries to add cryptographic functions to keep the data safe from attackers.
Now, MIT researchers have developed a search engine that can efficiently identify optimal designs for deep neural network accelerators that preserve data security while boosting performance.
Their search tool, known as SecureLoop, is designed to examine how adding encryption and data authentication measures will affect the performance and power usage of the accelerator chip. An engineer could use this tool to obtain the optimal design of an accelerator tailored to their neural network and machine learning task.
Compared with conventional scheduling techniques that do not account for security, SecureLoop can improve the performance of accelerator designs while keeping data protected.
Using SecureLoop could help a user improve the speed and performance of demanding AI applications, such as autonomous driving or medical image classification, while ensuring that sensitive user data remains safe from certain types of attacks.
“If you’re interested in doing a calculation where you’re going to maintain data security, the rules we used before to find the optimal design are now broken. So all of this optimization must be adjusted for this new, more complex set of constraints. And that is what [lead author] Kyungmi has done in this paper,” says Joel Emer, MIT professor of practice in computer science and electrical engineering and co-author of a paper on SecureLoop.
Emer is joined on the paper by lead author Kyungmi Lee, a graduate student in electrical engineering and computer science; Mengjia Yan, the Homer A. Burnell Career Development Assistant Professor of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. The research will be presented at the IEEE/ACM International Symposium on Microarchitecture.
“The community has passively accepted that adding cryptographic functions to an accelerator introduces overhead. They thought it would cause only a small deviation in the design trade-off space. But this is a misconception. In fact, cryptographic functions can significantly distort the design space of energy-efficient accelerators. Kyungmi did a fantastic job identifying this issue,” adds Yan.
Safe acceleration
A deep neural network consists of many layers of interconnected nodes that process data. Typically, the output of one layer becomes the input of the next layer. Data is grouped into units called tiles for processing and transfer between off-chip memory and the accelerator. Each layer of the neural network can have its own tiling configuration.
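The tiling described above can be sketched in a few lines of Python. This is an illustration only; the tile size and layer size below are made-up example values, not figures from the paper:

```python
# Illustrative sketch (not SecureLoop's code): splitting one layer's data
# into fixed-size tiles for transfer between off-chip memory and the
# accelerator. All sizes are invented example values.

def tile_layer(num_elements, tile_size):
    """Return (start, end) index ranges covering the layer's data."""
    tiles = []
    for start in range(0, num_elements, tile_size):
        tiles.append((start, min(start + tile_size, num_elements)))
    return tiles

# A hypothetical layer with 10 elements, moved in tiles of 4 elements:
print(tile_layer(10, 4))  # → [(0, 4), (4, 8), (8, 10)]
```

Each tile is fetched, processed, and written back independently, which is why the tiling choice for each layer shapes the accelerator's memory traffic.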
A deep neural network accelerator is a processor with an array of computing units that parallelizes operations, such as multiplication, at each layer of the network. The accelerator's schedule describes how data is moved and processed.
Since space on an accelerator chip is at a premium, most data is stored in off-chip memory and fetched by the accelerator when needed. But because the data is stored off-chip, it is vulnerable to an attacker who could steal information or change some values, causing the neural network to malfunction.
“As a chip manufacturer, you can’t guarantee the security of external devices or the overall operating system,” Lee explains.
Manufacturers can protect data by adding authenticated encryption to the accelerator. Encryption scrambles the data using a secret key. The authentication scheme then slices the data into uniform chunks and assigns a cryptographic hash to each chunk, which is stored alongside it in off-chip memory.
When the accelerator retrieves an encrypted chunk of data, known as an authentication block, it uses the secret key to decrypt the data and verify that it has not been altered before processing it.
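To make the block-and-hash idea concrete, here is a toy Python sketch of per-block authentication using an HMAC from the standard library. It shows only the authentication half of authenticated encryption, a real accelerator would use a hardware cryptographic engine, and every constant here is an invented example rather than anything from SecureLoop:

```python
import hashlib
import hmac

BLOCK_SIZE = 64          # illustrative authentication-block size in bytes
KEY = b"secret-key"      # illustrative key; real designs hold keys in hardware

def tag(block):
    """Cryptographic hash (MAC) stored alongside each block off-chip."""
    return hmac.new(KEY, block, hashlib.sha256).digest()

def store(data):
    """Slice data into uniform blocks and attach a tag to each."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(b, tag(b)) for b in blocks]

def fetch(stored, index):
    """Verify a block's tag before the accelerator processes it."""
    block, t = stored[index]
    if not hmac.compare_digest(t, tag(block)):
        raise ValueError("authentication failed: data was tampered with")
    return block

memory = store(b"x" * 200)          # 200 bytes -> blocks of 64, 64, 64, 8
assert fetch(memory, 0) == b"x" * 64
```

If an attacker modifies a block in off-chip memory, its recomputed tag no longer matches the stored one, and the fetch fails instead of feeding corrupted values to the neural network.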
But the sizes of authentication blocks and data tiles don’t match, so there could be multiple tiles in one block, or one tile could be split into two blocks. The accelerator cannot arbitrarily grab a fraction of an authentication block, so it may end up grabbing extra data, which uses additional energy and slows down the computation.
In addition, the accelerator must perform the cryptographic operation on each authentication block, adding even more computational cost.
An efficient search engine
With SecureLoop, MIT researchers sought a method that could determine the fastest and most energy-efficient accelerator schedule—one that minimizes the number of times the device needs to access off-chip memory to get additional blocks of data due to encryption and authentication.
They started by enhancing an existing search engine that Emer and his colleagues had developed, called Timeloop. First, they added a model that could represent the additional computation required for encryption and authentication.
They then reformulated the search problem into a simple mathematical expression, which allows SecureLoop to find the ideal authentication block size much more efficiently than searching through all possible options.
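As an illustration of what such a search optimizes, the toy Python sketch below brute-forces candidate block sizes under an invented cost model that charges for both redundant bytes and per-block cryptographic work. SecureLoop's actual formulation replaces this kind of enumeration with a closed-form expression:

```python
def cost(tile_size, block_size, num_tiles, crypto_cost_per_block=8):
    """Invented cost model: redundant bytes from rounding each tile up to
    whole authentication blocks, plus a per-block cryptographic cost.
    Real models also account for alignment and the memory hierarchy."""
    blocks_per_tile = -(-tile_size // block_size)  # ceiling division
    extra_bytes = blocks_per_tile * block_size - tile_size
    crypto = blocks_per_tile * crypto_cost_per_block
    return num_tiles * (extra_bytes + crypto)

# Brute-force search over candidate block sizes for a hypothetical
# workload of 1,000 tiles of 100 bytes each:
candidates = [16, 32, 64, 128]
best = min(candidates, key=lambda b: cost(100, b, 1000))
print(best)  # → 128: fewer blocks means less cryptographic work here
```

The example shows the tension Lee describes: smaller blocks waste fewer redundant bytes but require more cryptographic operations, so the best mapping depends on both effects together.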
“Depending on how this block is assigned, the amount of redundant traffic can increase or decrease. If you map the cryptographic block intelligently, then you can just get a small amount of additional data,” says Lee.
Finally, they incorporated a heuristic technique that ensures that SecureLoop determines a schedule that maximizes the performance of the entire deep neural network, not just a single layer.
In the end, the search engine outputs an accelerator schedule, which includes the data tiling strategy and the authentication block size, that provides the best possible speed and energy efficiency for a particular neural network.
“The design space for these accelerators is huge. What Kyungmi did was come up with some very realistic ways to make this search possible so that she could find good solutions without having to exhaustively search the space,” says Emer.
When tested on a simulator, SecureLoop identified schedules that were up to 33.2 percent faster and had up to 50.2 percent better energy-delay product (a metric related to energy efficiency) than other methods that did not take security into account.
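Energy-delay product is simply energy multiplied by execution time, so a lower value is better and a design is rewarded for being both fast and efficient. A minimal sketch with invented numbers:

```python
def energy_delay_product(energy_joules, delay_seconds):
    """Energy-delay product: lower is better, penalizing designs
    that save energy only by running slowly (or vice versa)."""
    return energy_joules * delay_seconds

# Illustrative comparison of two hypothetical accelerator schedules:
baseline = energy_delay_product(2.0, 0.010)    # 0.020 J*s
improved = energy_delay_product(1.2, 0.0083)   # ~0.010 J*s
improvement = (baseline - improved) / baseline * 100
print(round(improvement, 1))  # → 50.2 (percent improvement)
```

The numbers here are chosen only to show the arithmetic, not taken from the paper's experiments.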
The researchers also used SecureLoop to explore how the design space for accelerators changes when security is considered. They learned that allocating a little more chip area for the cryptographic engine and sacrificing on-chip memory space can lead to better performance, Lee says.
In the future, the researchers want to use SecureLoop to find accelerator designs that are resistant to side-channel attacks, which occur when an attacker has access to physical hardware. For example, an attacker could monitor a device’s power consumption pattern to obtain secret information, even if the data is encrypted. They also plan to extend SecureLoop so that it can be applied to other kinds of computations.
This work is funded, in part, by Samsung Electronics and the Korea Foundation for Advanced Study.