Fourier transforms show how well the deep neural network learns complex physics

One of the oldest tools in computational physics, a 200-year-old mathematical technique known as Fourier analysis, can provide crucial insight into how a form of artificial intelligence called a deep neural network learns to perform tasks involving complex physics, such as climate and turbulence modeling, according to a new study.

The discovery by mechanical engineering researchers at Rice University is detailed in an open-access study published in PNAS Nexus, a sister journal of the Proceedings of the National Academy of Sciences.

“This is the first rigorous framework to explain and guide the use of deep neural networks for complex dynamical systems such as climate,” said the study’s corresponding author, Pedram Hassanzadeh. “This could dramatically accelerate the use of scientific deep learning in climate science and lead to much more reliable predictions of climate change.”

In the article, Hassanzadeh, former students Adam Subel and Ashesh Chattopadhyay, and postdoctoral researcher Yifei Guan describe their use of Fourier analysis to study a deep learning neural network trained to recognize complex flows of air in the atmosphere or water in the ocean and to forecast how those flows would change over time. Their analysis “not only revealed what the neural network had learned, but also allowed us to directly connect what the network learned to the physics of the complex system it modeled,” Hassanzadeh said.

“Unfortunately, deep neural networks are difficult to understand and are often viewed as ‘black boxes,’” he said. “That is one of the main concerns about using deep neural networks in scientific applications. The other is generalizability: these networks cannot work for a system other than the one they were trained on.”

Hassanzadeh said the analytical framework his team presents in the paper “opens the black box; it lets us look inside to understand what the networks have learned and why, and also lets us connect that to the physics of the system they learned.”

Subel, the study’s lead author, began the research as a Rice undergraduate and is now a graduate student at New York University. He said the framework could be used in combination with transfer learning techniques to “enable generalization and ultimately increase the reliability of scientific deep learning.”

While many previous studies have attempted to show how deep learning networks learn to make predictions, Hassanzadeh said he, Subel, Guan, and Chattopadhyay decided to approach the problem from a different angle.

“Common machine learning tools for understanding neural networks have not shown much success for natural systems and engineering applications, at least in ways where the results can be connected to the physics,” Hassanzadeh said. “Our thought was, ‘Let’s do something different. Let’s take a common tool for studying physics and apply it to the study of a neural network that has learned to do physics.’”

He said Fourier analysis, first proposed in the 1820s, is a favorite technique used by physicists and mathematicians to identify frequency patterns in space and time.

“People who study physics almost always look at data in Fourier space,” he said. “It makes physics and math easier.”

For example, if someone had minute-by-minute records of outdoor temperature over a full year, the data would be a string of 525,600 numbers, the kind of record physicists call a time series. To analyze the time series in Fourier space, a researcher would use trigonometric functions to transform it, producing another set of 525,600 numbers that contains the same information as the original set but looks quite different.

“Rather than seeing the temperature at every minute, you would see just a few spikes,” Subel said. “One would be the cosine with a 24-hour period, which would be the day-night cycle of highs and lows. That signal was present throughout the time series, but Fourier analysis allows you to easily spot these types of signals in both time and space.”
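To make the idea concrete, here is a minimal sketch (not from the study) using NumPy: a synthetic year of minute-by-minute temperatures with a built-in seasonal trend and day-night cycle, whose Fourier spectrum collapses into the few spikes Subel describes. The amplitudes and noise level are invented for illustration.

```python
import numpy as np

# Synthetic minute-by-minute temperatures for one year (525,600 samples):
# a seasonal trend plus a 24-hour day-night cycle plus noise (made-up values).
minutes_per_day = 24 * 60
n = 365 * minutes_per_day
t = np.arange(n)

temps = (
    10.0 * np.sin(2 * np.pi * t / n)                 # seasonal cycle (1 per year)
    + 5.0 * np.cos(2 * np.pi * t / minutes_per_day)  # day-night cycle (365 per year)
    + np.random.normal(scale=1.0, size=n)            # measurement noise
)

# Transform to Fourier space; the spectrum is dominated by a few spikes.
spectrum = np.abs(np.fft.rfft(temps))
freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per minute

# The two largest peaks sit at the seasonal and daily frequencies.
peaks = np.argsort(spectrum[1:])[-2:] + 1  # skip the zero-frequency mean
print("dominant periods (days):", 1.0 / freqs[peaks] / minutes_per_day)
```

Running this prints periods of roughly 365 days and 1 day: two spikes standing in for half a million raw measurements.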

Building on this method, scientists have developed other tools for time-frequency analysis. For example, low-pass filters strip out rapid fluctuations such as measurement noise, while high-pass filters do the opposite, removing slowly varying background trends so that the fast variations stand out.
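These filters are easy to express in Fourier space. Below is a hedged sketch, again with invented numbers, of ideal low-pass and high-pass filtering done by zeroing coefficients on either side of a cutoff frequency.

```python
import numpy as np

# A signal with a slow trend plus noise (values are assumptions for illustration).
n = 10_000
t = np.arange(n)
signal = np.sin(2 * np.pi * t / 2_000) + 0.3 * np.random.randn(n)

coeffs = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n)  # cycles per sample
cutoff = 1.0 / 500          # keep components with periods of 500+ samples

# Zero out coefficients above (low-pass) or below (high-pass) the cutoff.
low_pass = np.fft.irfft(np.where(freqs <= cutoff, coeffs, 0), n=n)   # slow trend
high_pass = np.fft.irfft(np.where(freqs > cutoff, coeffs, 0), n=n)   # fast residual
```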

Hassanzadeh’s team first performed the Fourier transform on the equation of its fully trained deep learning model. Each of the model’s approximately 1 million parameters acts as a multiplier, giving more or less weight to particular operations in the equation during the model’s calculations. In an untrained model, the parameters have random values. These are adjusted and refined during training as the algorithm gradually learns to arrive at predictions that are closer and closer to the known outcomes in the training cases. Structurally, the model’s parameters are grouped into approximately 40,000 five-by-five matrices, or kernels.
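As a rough illustration of this step, the sketch below takes the 2D Fourier transform of a stack of five-by-five kernels, using random stand-in weights rather than the study’s trained parameters, and measures how much of each kernel’s energy sits at low frequencies.

```python
import numpy as np

# Hypothetical stand-ins for trained 5x5 convolution kernels (the study's
# model has roughly 40,000 of them); random values here, for illustration only.
kernels = np.random.randn(100, 5, 5)

# 2D Fourier transform of each kernel, zero-padded so the frequency response
# is sampled on a finer grid, then shifted so zero frequency sits at the center.
spectra = np.abs(np.fft.fftshift(np.fft.fft2(kernels, s=(64, 64)), axes=(-2, -1)))

# Energy concentrated near the center marks a low-pass kernel;
# energy pushed out toward the edges marks a high-pass kernel.
center = spectra[:, 24:40, 24:40].sum(axis=(-2, -1))
print("low-frequency energy fraction:", (center / spectra.sum(axis=(-2, -1))).round(2))
```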

“When we took the Fourier transform of the equation, it told us to look at the Fourier transform of these matrices,” Hassanzadeh said. “We didn’t know that. Nobody has ever done this part before, looking at the Fourier transforms of these matrices and trying to connect them to physics.

“And when we did that, it turns out that what the neural network is learning is a combination of low-pass filters, high-pass filters and Gabor filters,” he said.
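Gabor filters, a plane wave under a Gaussian envelope, are a standard construction in signal processing. The sketch below builds one at the study’s five-by-five kernel size (the wavelength, orientation, and width parameters are assumptions for illustration) and shows its band-pass character in Fourier space.

```python
import numpy as np

# A 2D Gabor filter: a cosine plane wave modulated by a Gaussian envelope.
# Parameter values here are illustrative assumptions, not the study's.
def gabor_kernel(size=5, wavelength=3.0, theta=0.0, sigma=1.5):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)       # rotate to orientation theta
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))  # Gaussian window
    carrier = np.cos(2 * np.pi * x_rot / wavelength)    # plane wave
    return envelope * carrier

# In Fourier space the filter is band-pass: its energy peaks in a lobe at the
# carrier frequency, away from both the lowest and highest frequency bands.
spec = np.abs(np.fft.fftshift(np.fft.fft2(gabor_kernel(), s=(64, 64))))
peak = np.unravel_index(np.argmax(spec), spec.shape)
print("peak frequency bin (row, col):", peak)  # off-center -> band-pass behavior
```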

“The beauty of this is that the neural network isn’t doing magic,” Hassanzadeh said. “It isn’t doing anything crazy. A physicist or mathematician could have tried that. Of course, without the power of neural networks, we did not know how to correctly combine these filters. But when we talk to physicists about this work, they love it. Because they say, ‘Oh! I know what these things are. This is what the neural network has learned. I see.’”

Subel said the findings have important implications for scientific deep learning, and even suggest that some things scientists have learned from studying machine learning in other contexts, such as classifying static images, may not hold true for scientific machine learning.

“We have found that some of the insights and conclusions in the machine learning literature, gained for example from work on commercial and medical applications, do not apply to many critical applications in science and engineering, such as climate change modeling,” Subel said. “That in itself is an important implication.”

Chattopadhyay received his Ph.D. in 2022 and is now a researcher at the Palo Alto Research Center.

The research was supported by the Office of Naval Research (N00014-20-1-2722), the National Science Foundation (2005123, 1748958), and the Schmidt Futures program. Computational resources were provided by the National Science Foundation (170020) and the National Center for Atmospheric Research (URIC0004).