The paper "Deep physical neural networks trained with backpropagation" introduces a novel approach to train physical neural networks using a hybrid in situ–in silico algorithm, termed physics-aware training. This method applies backpropagation, the de facto training method for large-scale neural networks, to controllable physical systems. The approach enables the training of deep physical neural networks made from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism to conventional artificial neural network layers.
The study demonstrates the universality of this approach by training diverse physical neural networks based on optics, mechanics, and electronics to experimentally perform audio and image classification tasks. Physics-aware training combines the scalability of backpropagation with the automatic mitigation of imperfections and noise achievable with in situ algorithms. The authors argue that physical neural networks have the potential to perform machine learning faster and more energy-efficiently than conventional electronic processors, and more broadly, can endow physical systems with automatically designed physical functionalities, such as those needed in robotics, materials, and smart sensors.
Physics-Aware Training (PAT) is a gradient-descent algorithm that enables backpropagation through any physical system for which a differentiable digital model can be trained. The process begins by sending inputs and parameters into the physical system, which propagate through it to produce outputs. The measured outputs are compared with the intended outputs, and the gradient of the error with respect to the parameters is computed by backpropagating through the differentiable digital model. With this gradient, the parameters are updated. By executing the forward pass on the physical hardware and the backward pass on the digital model, PAT makes it practical to optimize the control parameters of physical neural networks.
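To make the split between the physical forward pass and the digital backward pass concrete, here is a minimal sketch in PyTorch. The two callables are hypothetical stand-ins, not from the paper: `physical_system` drives the hardware and measures its output, and `digital_model` is a trained differentiable simulation of the same transformation.

```python
import torch

class PATFunction(torch.autograd.Function):
    """Sketch of physics-aware training: physical forward, digital backward."""

    @staticmethod
    def forward(ctx, x, params, physical_system, digital_model):
        ctx.save_for_backward(x, params)
        ctx.digital_model = digital_model
        # Forward pass: run the *real* physical system, not a simulation.
        y = physical_system(x, params)
        return y

    @staticmethod
    def backward(ctx, grad_output):
        x, params = ctx.saved_tensors
        # Backward pass: the hardware exposes no gradients, so
        # differentiate through the digital model instead.
        x = x.detach().requires_grad_(True)
        params = params.detach().requires_grad_(True)
        with torch.enable_grad():
            y_model = ctx.digital_model(x, params)
            grad_x, grad_params = torch.autograd.grad(
                y_model, (x, params), grad_outputs=grad_output)
        # One gradient per forward argument; the two callables get None.
        return grad_x, grad_params, None, None

# Usage: y = PATFunction.apply(x, params, physical_system, digital_model)
```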
PAT allows us to execute the backpropagation algorithm efficiently and accurately on any sequence of physical input–output transformations. We demonstrate the universality of this approach by experimentally performing image classification using three distinct systems: the multimode mechanical oscillations of a driven metal plate, the analogue dynamics of a nonlinear electronic oscillator and ultrafast optical second-harmonic generation (SHG).
In the paper "Deep physical neural networks trained with backpropagation," one of the examples provided is an optical Physical Neural Network (PNN) based on the propagation of broadband optical pulses in a quadratic nonlinear optical medium, specifically ultrafast second-harmonic generation (SHG).
To explain the process, let's consider a task where the PNN is designed to classify spoken vowels based on their formant frequencies. In this task, the machine learning device must learn to predict spoken vowels from 12-dimensional input data vectors of formant frequencies extracted from audio recordings.
The inputs to the system are encoded into the spectrum of a laser pulse using a pulse shaper. A portion of the pulse’s spectrum is assigned for trainable parameters. The result of the physical computation is obtained from the spectrum of the frequency-doubled pulse.
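As a rough illustration of this encoding, the sketch below builds a single spectral-modulation pattern from the 12 input features plus a vector of trainable parameters occupying the remaining spectral channels. The channel count and the `build_spectral_modulation` helper are illustrative assumptions, not the experimental values.

```python
import numpy as np

N_CHANNELS = 64   # spectral channels addressable by the pulse shaper (illustrative)
N_INPUTS = 12     # formant-frequency features per vowel example

def build_spectral_modulation(x, theta):
    """Concatenate input features and trainable parameters into one
    spectral amplitude pattern for the pulse shaper."""
    assert x.shape == (N_INPUTS,)
    assert theta.shape == (N_CHANNELS - N_INPUTS,)
    return np.concatenate([x, theta])

x = np.random.rand(N_INPUTS)                    # one vowel's formant vector
theta = np.random.rand(N_CHANNELS - N_INPUTS)   # trainable spectral parameters
modulation = build_spectral_modulation(x, theta)
```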
To create a deep PNN, the output of one SHG process is used as the input to a second SHG process, which has its own independent trainable parameters. This cascading is repeated several times, resulting in a multilayer PNN with multiple trainable physical layers.
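Structurally, the cascade can be thought of as function composition, as in the sketch below, where `shg_layer` is a hypothetical stand-in for one physical pass through the nonlinear medium and each layer carries its own parameter vector.

```python
def deep_pnn_forward(x, layer_params, shg_layer):
    """Cascade of SHG layers: each layer's measured output spectrum is
    re-encoded as the input to the next physical layer."""
    h = x
    for theta in layer_params:    # one trainable parameter vector per layer
        h = shg_layer(h, theta)   # one physical input-output transformation
    return h                      # final frequency-doubled spectrum
```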
The PNN's parameters are trained using the physics-aware training (PAT) algorithm, which performs the backpropagation algorithm for stochastic gradient descent directly on any sequence of physical input-output transformations, such as the sequence of nonlinear optical transformations in this example.
The advantage of PAT comes from the fact that the forward pass is executed by the actual physical hardware rather than by a simulation. This is more energy-efficient, because it avoids the computational overhead of simulating the system in full, and it means the gradients are computed with respect to what the hardware actually does. As a result, the system better handles real-world imperfections and noise, because training is performed on the same hardware that will later be used for inference.
When trained using PAT, this SHG-PNN was able to classify vowels with 93% accuracy.
In the optical example from the paper, the input to the Physical Neural Network (PNN) is a 12-dimensional vector of formant frequencies extracted from audio recordings of spoken vowels. These formant frequencies are encoded into the spectrum of a laser pulse, which is then sent through a sequence of second harmonic generation (SHG) processes. Each SHG process is a layer in the PNN and has its own set of trainable parameters, which are used to control the transformation implemented by that layer.
The output of the PNN is the spectrum of the frequency-doubled pulse after it has passed through all the SHG processes. This spectrum is divided into bins, and the predicted vowel is the one corresponding to the largest of 7 spectral bins in the measured ultraviolet spectrum.
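A minimal sketch of this readout, assuming a digitized spectrum split into 7 contiguous bins (the vowel labels and bin boundaries here are illustrative, not those of the experiment):

```python
import numpy as np

VOWELS = ["a", "e", "i", "o", "u", "ae", "er"]  # 7 illustrative class labels

def predict_vowel(spectrum):
    """Sum the measured spectrum into 7 bins and predict the vowel
    whose bin carries the most energy."""
    bins = np.array_split(spectrum, len(VOWELS))
    energies = np.array([b.sum() for b in bins])
    return VOWELS[int(np.argmax(energies))]

spectrum = np.random.rand(256)   # stand-in for one measured UV spectrum
print(predict_vowel(spectrum))
```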
Physics-aware training (PAT) is used to train the parameters of the PNN using this input and output data. PAT is a gradient descent algorithm that can be used to perform backpropagation on any sequence of physical input-output transformations.
Here's how it works:

1. Inputs and the current trainable parameters are encoded and sent into the physical system.
2. The inputs propagate through the physical system, which performs the computation physically.
3. The system's outputs are measured.
4. The measured outputs are compared with the intended outputs to compute the error, and the gradient of the error with respect to the parameters is estimated by backpropagating through a differentiable digital model of the system.
5. The parameters are updated using this gradient, and the cycle repeats over the training data.
The key advantage of PAT is that it performs the forward pass (steps 1-3) on the actual physical hardware, rather than a digital simulation. This can make the training process more efficient and robust to real-world noise and imperfections.
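Putting the steps together, one PAT training epoch might look like the following sketch. Here `pat_forward` is a hypothetical wrapper around the physical-forward/digital-backward logic (for example, the `PATFunction` sketched earlier), and the data loader, bin count, and optimizer setup are illustrative assumptions.

```python
import torch

def train_epoch(data_loader, params, pat_forward, optimizer):
    loss_fn = torch.nn.CrossEntropyLoss()
    for x, label in data_loader:                      # label: LongTensor of shape (1,)
        optimizer.zero_grad()
        spectrum = pat_forward(x, params)             # steps 1-3: physical forward pass
        logits = spectrum.reshape(7, -1).sum(dim=1)   # 7 bin energies as class scores
        loss = loss_fn(logits.unsqueeze(0), label)    # step 4: compare with the target
        loss.backward()                               # step 4: gradient via digital model
        optimizer.step()                              # step 5: parameter update

# Usage (illustrative):
#   params = torch.zeros(52, requires_grad=True)
#   optimizer = torch.optim.SGD([params], lr=0.01)
```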
To wrap this up with a concrete example, let's say that we have a spoken vowel, and the formant frequencies extracted from it are encoded into a laser pulse and sent through the network. The network produces an output spectrum, which is used to generate a prediction for the vowel. If the correct vowel was 'a', but the network predicted 'e', this would produce an error. The backpropagation algorithm would then calculate how the parameters should be adjusted to reduce this error, and the parameters would be updated accordingly. This process is repeated for all the examples in the training data until the network can accurately classify the vowels.
Note that the intended output is not directly a particular form of spectrum, but a classification label for the vowel (in the above example the label would be 'a'). The error is not calculated based on the raw spectrum itself, but based on the predictions derived from it.
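A tiny numeric illustration of this point (the values are made up): the loss compares the class scores derived from the spectrum against the label, not the spectrum itself.

```python
import torch

energies = torch.tensor([[0.2, 0.9, 0.1, 0.3, 0.1, 0.2, 0.1]])  # 7 bin energies
target = torch.tensor([0])                      # correct class: 'a' (index 0)
loss = torch.nn.functional.cross_entropy(energies, target)
print(energies.argmax(dim=1))   # tensor([1]): the network predicts class 1 ('e')
print(loss)                     # nonzero loss drives the next parameter update
```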
Wright, L. G., Onodera, T., Stein, M. M., Wang, T., Schachter, D. T., Hu, Z., & McMahon, P. L. (2022). Deep physical neural networks trained with backpropagation. Nature, 601(7894), 549–555. https://doi.org/10.1038/s41586-021-04223-6