Hyper-parameters of PINN and ANN.
Abstract
This chapter delves into the fascinating characteristics of physics-informed neural networks (PINNs) by outlining their fundamental principles, including their mathematical foundations and structures. PINNs are designed by incorporating governing physical equations into the loss function as constraints, which helps to ensure precise output predictions even in areas with limited or no data. This chapter presents various strategies to apply PINNs to complex systems, thereby addressing the shortcomings of conventional PINNs. Additionally, multiphysics-informed neural networks (MPINNs) are introduced, with a special emphasis on complex mechatronic systems. The effectiveness of the MPINN framework is illustrated through examples such as an electric motor and a lithium-ion battery, demonstrating accurate and efficient multidimensional predictions for mechatronic systems despite limited data availability. These applications underscore the potential of MPINNs to mitigate data scarcity challenges in various industries.
Keywords
- deep neural network
- fundamental principles
- governing equation
- mechatronic system
- physics-informed neural network
- prognosis and health management
- supervision of multiphysics
1. Introduction
Deep neural networks (DNNs) are widely used in various domains, including natural language processing [1, 2, 3] and image recognition [4, 5, 6, 7, 8]. For natural language processing, DNNs are employed in tasks like machine translation, sentiment analysis, and speech recognition, significantly enhancing the understanding and generation of human language. In the field of image recognition, DNNs are instrumental in object classification, feature detection, and image segmentation, finding applications in facial recognition systems and medical imaging diagnostics. The application of DNNs has surged in recent years across various scientific and engineering fields [9, 10, 11, 12, 13], including mechatronics, fluid dynamics, materials science, and biomedical engineering. The universal approximation theorem highlights the capability of DNNs to model any continuous function, enabling the effective estimation of complex and nonlinear behaviors in diverse systems [14]. Specifically, DNNs capture complex and nonlinear relationships between input and output through a training process that optimizes a loss function quantifying the discrepancy between predicted and actual output. Therefore, the performance of DNNs significantly depends on the volume of data available to account for system characteristics, suggesting that collecting sufficient high-quality data is crucial for ensuring robustness and generality in real-world applications. This extensive data requirement can be costly and resource-intensive, constraining the accurate prediction of system behaviors when relying on purely data-driven approaches [15]. Furthermore, DNNs are often considered “black boxes” because the decision-making processes are not easily interpretable. This lack of transparency in data-only training can be problematic in critical applications, where understanding the model’s inferences is essential for trust and validation.
To overcome these limitations, research has shifted toward hybrid approaches that combine the robustness of physics-based models with the adaptability of data-driven methods [16, 17, 18]. One notable strategy in this area involves incorporating knowledge-based features derived through feature engineering. This approach uses domain-specific knowledge to enhance model accuracy by integrating the strengths of both fundamental physics and data-driven predictions. However, these methods still struggle to handle noisy data and outliers effectively [19]. Data noise can obscure the underlying relationships between input and output, complicating the learning process for models. Outliers, on the other hand, can distort model predictions, leading to unreliable results. Meanwhile, physics-informed neural networks (PINNs) have gained attention as a powerful hybrid approach because they can enhance the fidelity of model predictions by incorporating physics-based regularization directly into the learning process. Specifically, PINNs minimize a loss function that combines a data-derived term with terms derived from physical laws, which keeps predictions robust even when data are scarce or noisy. These networks are versatile in handling both forward and inverse problems across various time and space scales [20]. For example, Raissi et al. [21] demonstrated the application of PINNs in solving forward and inverse problems for nonlinear partial differential equations (PDEs), showcasing their potential in accurately modeling physical systems with limited data. Another study by Tartakovsky et al. [22] applied PINNs to subsurface flow problems, achieving high accuracy in predicting hydraulic conductivity fields.
Despite these advantages, PINNs face challenges such as addressing stiff partial differential equations (PDEs) and managing multiphysics in complex systems. Two further issues have been identified: spectral bias, where neural networks struggle to accurately capture high-frequency components of solutions [23], and difficulties in applying PINNs to parametric PDEs, which involve varying parameters that change the behavior of the system [24]. These challenges can impact the accuracy and robustness of PINNs, necessitating further research and development to enhance their performance in diverse applications. Therefore, various methods have been actively suggested to overcome these challenges, including adaptively weighting each loss term [25, 26, 27, 28], domain decomposition [29, 30], improved architectures to manage interpolation and extrapolation [31, 32], and embedding input coordinate features [32, 33, 34]. These efforts extend the application of PINNs beyond solid mechanics [35], thermal dynamics [36, 37], fluid mechanics [38, 39], and electromagnetics [40], enabling practical use in multiphysics systems. Furthermore, the growing understanding of how these hybrid approaches overcome such challenges is driving active research into new applications.
This chapter provides an overview of the main concepts and principles of PINNs and introduces various methods for their applications with practical examples in mechatronic systems. Specifically, the remainder of this chapter is organized as follows. Section 2 explains the fundamental principles and implementation of PINNs. Section 3 introduces the challenges of PINNs and strategies for overcoming them. Section 4 focuses on practical applications of PINNs in complex mechatronic systems. Finally, the chapter concludes with some remarks in Section 5.
2. Principles of physics-informed neural network
Physics-informed neural networks (PINNs) are designed to solve physical problems characterized by limited data availability, such as noisy measurements and time-consuming simulations [41]. These networks constrain the predicted solution to meet the laws of physics and fundamental principles to enhance predictive capabilities by overcoming a data scarcity problem. The physical laws are typically expressed through the governing equations, including parametrized and nonlinear partial differential equations (PDEs) in a generalized mathematical form [21] as
where
The PINN architecture closely resembles a conventional MLP structure because the architecture comprises an input layer, several hidden layers, and an output layer (Figure 1). However, PINNs differ from conventional MLPs due to the placement of the differentiation layer (the red circles with differential symbols in Figure 1), right next to the output layer. Each of these layers plays a distinct and crucial role in the training process of PINNs. The spatiotemporal information, coordinates
The loss function
where
where
The second term
where
To demonstrate the performance of PINN, the equation of motion of a 1-DOF mass-spring-damper system is used as a simple example. This equation is a linear ordinary differential equation (ODE) dependent on time
where
where
The PINN architecture for solving Eq. (6) consists of an input layer, multiple hidden layers, and an output layer, as shown in Figure 1. The input layer has a single node because it receives one-dimensional temporal information. The hidden part consists of three layers, each containing 35 nodes with a hyperbolic tangent activation function. The output layer also comprises a single node, which predicts the displacement of a single DOF under free vibration conditions. To demonstrate the superiority of PINN, an artificial neural network (ANN) with an identical architecture was constructed for comparison.
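The layer sizes described above can be written down directly. The following is a minimal sketch in plain Python, not the authors' implementation: the random initialization, the fan-in scaling, and the helper names (`init_mlp`, `forward`, `count_parameters`) are assumptions made for illustration. It builds the 1-35-35-35-1 network with tanh hidden activations and counts its trainable parameters.

```python
import math
import random

def init_mlp(layer_sizes, seed=0):
    """Create a list of (weights, biases) pairs with small random weights."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = math.sqrt(1.0 / n_in)  # simple fan-in scaling (an assumption)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, t):
    """Map a scalar time t to a scalar displacement prediction."""
    activations = [t]
    for index, (weights, biases) in enumerate(params):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        is_output_layer = index == len(params) - 1
        # tanh on hidden layers, identity on the output layer
        activations = pre if is_output_layer else [math.tanh(z) for z in pre]
    return activations[0]

def count_parameters(params):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)

LAYER_SIZES = [1, 35, 35, 35, 1]  # input, three hidden layers, output
network = init_mlp(LAYER_SIZES)
n_params = count_parameters(network)
prediction = forward(network, 0.5)
```

With these sizes the network has 2,626 trainable parameters, so it is small enough that the comparison between PINN and ANN is driven by the loss function, not by model capacity.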
The training data points for both PINN and ANN were sampled from solutions of Eq. (6), with the damping ratio
PINN and ANN were trained in the Google Colab environment with an Intel Xeon CPU and a Tesla T4 GPU. Both models were trained for 18,000 iterations with identical hyper-parameters, except for the learning rate, to ensure a consistent comparison (Table 1).
Hyper-parameter | PINN | ANN |
---|---|---|
Number of input nodes | 1 | 1 |
Number of hidden layers | 3 | 3 |
Number of hidden nodes | 35 | 35 |
Activation function | Tanh | Tanh |
Number of output nodes | 1 | 1 |
Learning rate | 1e−4 | 1e−3 |
Epoch | 18,000 | 18,000 |
Figure 2 shows the solutions predicted by the trained PINN and ANN models in comparison to the exact solution (the blue line in Figure 2(a) and (b)). Note that the ANN (the red dashed line in Figure 2(a)) accurately predicted only the data points on which it had been trained (the green circular markers in Figure 2(a)) and failed to generalize to any other data. This result demonstrates the common issue of poor generalization in ANNs, where sparse training data limits the model’s ability to learn the full spectrum of inherent physical properties, biasing it toward the observed data [43]. Therefore, the use of ANNs in engineering and scientific fields, where obtaining large amounts of training data is challenging [43], has significant limitations, suggesting the need for PINNs in scientific machine learning. Figure 2(b) confirms that the PINN accurately predicts the solution outside the training data region. It outperforms the ANN with the same training data because the physical constraints embedded in the PINN guide the model predictions to conform to the underlying physical laws. This result suggests that PINNs may exhibit superior generalization performance compared to ANNs in real-world applications where acquiring data is challenging.
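To make the role of the physical constraint concrete, the sketch below evaluates a physics loss for a damped oscillator at collocation points where no labeled data exist. It is an illustration under stated assumptions, not the chapter's code: the normalized form x'' + 2ζωₙx' + ωₙ²x = 0, the values of `ZETA` and `OMEGA_N`, and the initial conditions are all assumed, and central finite differences stand in for the automatic differentiation a PINN would use.

```python
import math

ZETA = 0.1                 # assumed damping ratio (not taken from the chapter)
OMEGA_N = 2.0 * math.pi    # assumed natural frequency in rad/s

def exact_displacement(t):
    """Underdamped free response for x(0) = 1, x'(0) = 0."""
    omega_d = OMEGA_N * math.sqrt(1.0 - ZETA ** 2)
    decay = math.exp(-ZETA * OMEGA_N * t)
    return decay * (math.cos(omega_d * t)
                    + (ZETA * OMEGA_N / omega_d) * math.sin(omega_d * t))

def undamped_guess(t):
    """A candidate that ignores damping and therefore violates the ODE."""
    return math.cos(OMEGA_N * t)

def ode_residual(x, t, h=1e-3):
    """x'' + 2*zeta*omega_n*x' + omega_n**2 * x via central differences."""
    x_t = (x(t + h) - x(t - h)) / (2.0 * h)
    x_tt = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return x_tt + 2.0 * ZETA * OMEGA_N * x_t + OMEGA_N ** 2 * x(t)

def physics_loss(x, collocation_times):
    """Mean squared ODE residual over the collocation points."""
    residuals = [ode_residual(x, t) for t in collocation_times]
    return sum(r * r for r in residuals) / len(residuals)

collocation = [0.02 * i for i in range(1, 100)]   # points in (0, 2) s
loss_exact = physics_loss(exact_displacement, collocation)
loss_wrong = physics_loss(undamped_guess, collocation)
```

The exact response gives a loss near zero (only discretization error remains), whereas the undamped candidate is heavily penalized even though no measurement is involved. This is the mechanism that lets a PINN reject solutions that fit sparse data but break the governing equation.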
3. Challenges and overcoming strategies in physics-informed neural networks
PINNs represent a significant advancement in the integration of scientific machine learning with physics and are designed to model complex systems where traditional data-driven approaches fall short. By embedding the governing equations of physical systems into neural networks, PINNs provide predictions that adhere to known physics and fundamental principles. Hence, this architecture serves as an alternative to traditional data-driven methods for solving PDEs. However, active research on PINNs has revealed a variety of challenges alongside a series of promising results. This section examines these challenges and describes strategies to address them, thereby enhancing the robustness and versatility of PINNs across various scientific and engineering domains.
3.1 Mitigating complexity with domain decomposition
PINNs have shown remarkable abilities in addressing high-dimensional PDEs due to their superior approximation and generalization capabilities [47]. Despite this, PINNs can encounter difficulties when dealing with discontinuities in PDE solutions, often failing to accurately capture and predict sharp changes or discontinuities, which can lead to errors and reduce the accuracy of the solution [48]. To overcome these challenges, an innovative strategy known as extended PINN (XPINN) has been developed (Figure 3), which involves decomposing the computational domain into multiple subdomains [46]. This method employs localized, robust networks specifically in regions with high-gradient solutions, leveraging known prior knowledge about the solution characteristics. In particular, the computational domain can be partitioned into
Then, the final solution is obtained as follows
where the indicator function
where
The XPINN framework leverages the deployment of multiple neural networks across smaller subdomains [49]. A key advantage of XPINNs is their enhanced representation and parallelization capacity, which is essential for accurately predicting complex PDE solutions. By enabling both spatial and temporal parallelization, XPINNs can significantly reduce training costs. In each subdomain, the networks can be optimally designed with specific hyperparameters such as network depth and width of hidden layers, the number and placement of residual points, nonlinear activation functions, and optimization strategies. For subdomains with complex solution characteristics, deeper networks may be employed, whereas shallower networks can be utilized in regions with simpler, smoother solutions.
The XPINN is designed to minimize a comprehensive loss function that considers both individual subdomain losses and interface losses, effectively balancing data accuracy and adherence to physical laws across the entire system. The XPINN loss function is defined at the subdomain level, while also incorporating interface conditions to ensure continuity and coherence between subdomains. Training an XPINN involves minimizing this complete loss function, which is defined as:
where
where terms
The XPINN architecture simplifies complex PDE solutions by decomposing them into multiple simpler components, thereby reducing the complexity required to learn each part and enhancing generalization. XPINNs also provide flexibility through time-domain decomposition for all differential equations and the ability to manage arbitrarily shaped complex sub-regions. This methodology ensures robust and accurate estimation of responses and other complex phenomena in systems modeled by PINNs.
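The two ingredients discussed in this subsection, stitching subdomain predictions together and penalizing disagreement at an interface, can be sketched as follows. This is a simplified one-dimensional illustration with two subdomains: the linear functions `subnet_left` and `subnet_right` are hypothetical stand-ins for trained subnetworks, and the interface term shown is only the average-solution continuity penalty (residual continuity is omitted).

```python
def subnet_left(x):
    """Stand-in for the network trained on the left subdomain x < 0."""
    return 1.0 + 0.5 * x

def subnet_right(x):
    """Stand-in for the network trained on the right subdomain x > 0."""
    return 1.02 + 2.0 * x

INTERFACE_X = 0.0

def stitched_solution(x):
    """Pick the subnetwork whose subdomain contains x; average on the interface."""
    if x < INTERFACE_X:
        return subnet_left(x)
    if x > INTERFACE_X:
        return subnet_right(x)
    return 0.5 * (subnet_left(x) + subnet_right(x))

def interface_loss(net_a, net_b, interface_points):
    """Mean squared deviation of net_a from the two-network average."""
    total = 0.0
    for x in interface_points:
        average = 0.5 * (net_a(x) + net_b(x))
        total += (net_a(x) - average) ** 2
    return total / len(interface_points)

mismatch = interface_loss(subnet_left, subnet_right, [INTERFACE_X])
matched = interface_loss(subnet_left, subnet_left, [INTERFACE_X])
```

The penalty vanishes only when both subnetworks agree on the interface, so adding it to each subdomain loss pulls neighboring networks toward a continuous overall solution while leaving their interiors free to use different depths and widths.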
3.2 Loss balancing with adaptive weighting
Most PINNs encounter a common convergence issue caused by loss imbalance, which arises when the individual loss terms are optimized at different rates. Specifically, each term contributes to the overall training objective but may have a vastly different scale and sensitivity. For example, one loss term might represent the governing equation in thermodynamics, while another could enforce an initial condition. The differing difficulty of training these loss terms can lead to discrepancies in the magnitudes of their gradients during training. As a result, one loss term often dominates the optimization and overshadows the others, thereby stalling the overall convergence of the network.
To overcome this challenge, adaptive weighting methods are being actively researched, including self-adaptive weighting [26] and learning rate annealing [28]. These methods promote convergence by compensating for the differences in gradient magnitude among the loss terms. In particular, adaptive weighting using the neural tangent kernel (NTK) [50], a widely used method, addresses the optimization imbalance caused by multiple loss terms. This strategy uses the NTK to adjust the adaptive weights dynamically during training so that all loss terms converge at comparable rates, thereby preserving overall balance. The standard loss function of PINNs with adaptive weights is as follows:
where
where
where
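One commonly used form of NTK-based weighting sets the weight of each loss term to the total kernel trace divided by that term's own trace, so that slowly converging terms receive larger weights. The sketch below illustrates this rule with hand-written toy Jacobians. It is an assumption-laden illustration, not the chapter's exact formulation: the term names, the Jacobian values, and the helper functions are hypothetical, and in practice the Jacobians would come from automatic differentiation.

```python
def ntk_trace(jacobian):
    """Trace of J J^T, i.e. the sum of squared Jacobian entries.

    jacobian[i][j] is the derivative of the i-th output (or residual)
    with respect to the j-th network parameter.
    """
    return sum(entry * entry for row in jacobian for entry in row)

def ntk_weights(jacobians):
    """Return one weight per loss term: total trace / that term's trace."""
    traces = {name: ntk_trace(jac) for name, jac in jacobians.items()}
    total = sum(traces.values())
    return {name: total / trace for name, trace in traces.items()}

# Toy Jacobians: two points per term, three parameters (illustrative values).
jacobians = {
    "data": [[0.1, 0.0, 0.2], [0.0, 0.1, 0.2]],
    "physics": [[1.0, 2.0, 0.0], [2.0, 0.0, 1.0]],
}
weights = ntk_weights(jacobians)
```

Here the data term has a much smaller kernel trace than the physics term, so it receives a weight of about 101 against about 1.01 for the physics term, which counteracts the tendency of the physics term to dominate training.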
3.3 Extending capabilities through deep neural operator
In real-world applications, engineers often face the challenge of predicting outcomes under operating conditions that differ from those under which the neural network was trained. However, conventional PINNs that use DNNs are typically designed to train and predict for specific initial, boundary, or geometric conditions. These inherent characteristics necessitate additional training, such as retraining or transfer learning [51], to adapt to new operating conditions. This additional training is both time-consuming and computationally expensive, limiting the use of conventional PINNs as surrogate models. Therefore, significant research efforts have been directed toward developing more capable neural networks to address these challenges.
Deep Operator Network (DON) has become a state-of-the-art surrogate model due to its ability to approximate nonlinear operators through function-to-function mapping [31]. Hence, once DON is trained, it can predict results for new operational conditions, effectively addressing both interpolation and extrapolation tasks without additional retraining. Specifically, DON integrates two distinct neural networks (Figure 4): the branch and trunk networks. The branch network handles discretized inputs from the input functions, such as initial conditions, boundary conditions, or external forcing terms, represented as
where
where
The capability of DON to capture operators can be leveraged as a powerful surrogate model for multiphysics systems governed by a range of ODEs and PDEs. Therefore, active research has been directed toward predicting responses under a variety of conditions within the fields of elastic-plastic stress [52], heat conduction [53], and fracture mechanics [54]. Furthermore, numerous studies are dedicated to improving the capabilities of DON through architectural advancements, integrating neural networks such as encoders [55], LSTM [56], or ResUNet [52].
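The branch-trunk structure can be illustrated with a deliberately small forward pass. In the sketch below, each network is reduced to a single tanh layer with random, untrained weights, and the operator output is the inner product of the branch and trunk features. The sizes `N_SENSORS` and `N_LATENT`, the omission of a bias term, and all helper names are assumptions for illustration.

```python
import math
import random

def dense_tanh(weights, biases, inputs):
    """One fully connected layer with tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def random_layer(n_in, n_out, rng):
    """Untrained layer with uniform random weights and biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases

N_SENSORS = 5   # number of points at which the input function is sampled
N_LATENT = 4    # shared output width of branch and trunk (assumed)

rng = random.Random(0)
branch_layer = random_layer(N_SENSORS, N_LATENT, rng)
trunk_layer = random_layer(1, N_LATENT, rng)

def operator_output(input_samples, query):
    """Evaluate G(u)(y) as the inner product of branch and trunk features."""
    branch_features = dense_tanh(*branch_layer, input_samples)
    trunk_features = dense_tanh(*trunk_layer, [query])
    return sum(b * t for b, t in zip(branch_features, trunk_features))

# Two different input functions, sampled at the same sensor locations.
sensors = [i / (N_SENSORS - 1) for i in range(N_SENSORS)]
ramp = list(sensors)
bump = [math.sin(math.pi * s) for s in sensors]
ramp_response = operator_output(ramp, query=0.3)
bump_response = operator_output(bump, query=0.3)
```

Because the input function enters only through the branch features, a new operating condition is handled by a new forward pass with different `input_samples`; no retraining is required, which is the property that makes the architecture attractive as a surrogate model.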
3.4 Improving architecture through Fourier feature embedding
PINNs often face challenges when addressing multi-scale problems, in part because they operate with a relatively limited number of data points within the computational domain. One significant challenge is spectral bias, a phenomenon in which multi-layer perceptrons tend to learn solutions and their PDE residuals starting with the lowest frequency components. More generally, spectral bias is a characteristic of neural network training whereby networks learn simpler, lower-frequency components of the target function before capturing more complex, higher-frequency components [57]. This phenomenon can significantly influence the training dynamics and generalization abilities of neural networks.
To address the multi-scale problem challenges in PINNs, particularly spectral bias, an innovative network architecture incorporating Fourier feature embeddings has been developed [33]. This approach involves enhancing the input space using Fourier features, which helps the network represent and learn higher frequency components more effectively.
As illustrated in Figure 5(a), the methodology involves applying Fourier feature embeddings. The inputs are embedded with the Fourier feature and then fed into a conventional multi-layer perceptron. The final outputs are concatenated using a linear function. The forward pass of Fourier feature embedding is detailed as follows:
where
Time-dependent problems exhibit multi-scale behavior not only across spatial dimensions but also over temporal scales. To address these multi-scale challenges in spatial-temporal domains, an innovative multi-scale Fourier feature architecture has been proposed [59]. Specifically, the neural network’s feed-forward pass is defined as follows
where
This structured approach helps the PINN counteract the natural low-frequency bias of neural networks and learn and predict high-frequency phenomena within PDE solutions. By embedding Fourier features directly into the input layer and merging them before producing the final output, the network utilizes the rich frequency information to achieve more precise and robust solutions for PDEs. This architecture significantly improves the ability of PINNs to solve complex, multi-scale problems that are typically challenging for conventional machine learning models and some numerical methods. Note that the proposed models maintain computational efficiency: they introduce no additional trainable parameters and require few additional operations in the forward and backward passes compared to conventional PINN models.
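A Fourier feature embedding of the kind described in this subsection can be sketched in a few lines. The version below maps an input vector x to [cos(Bx), sin(Bx)] with a fixed random matrix B. The feature count, the scale `sigma`, the omission of a 2π factor, and the helper names are assumptions for illustration.

```python
import math
import random

def make_frequency_matrix(n_features, input_dim, sigma, seed=0):
    """Sample a fixed (non-trainable) matrix B with entries from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
            for _ in range(n_features)]

def fourier_features(x, frequency_matrix):
    """Return [cos(Bx), sin(Bx)] for an input vector x."""
    projections = [sum(b * xi for b, xi in zip(row, x))
                   for row in frequency_matrix]
    return ([math.cos(p) for p in projections]
            + [math.sin(p) for p in projections])

# Embed a 2-D space-time coordinate with 8 random frequencies.
B = make_frequency_matrix(n_features=8, input_dim=2, sigma=10.0)
embedded = fourier_features([0.25, 0.5], B)
```

The matrix `B` is sampled once and then frozen, which is why the embedding adds no trainable parameters. A larger `sigma` exposes the downstream perceptron to higher frequencies, and a multi-scale variant would apply several matrices with different `sigma` values in parallel.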
4. Case studies in engineering applications
The potential of PINNs is actively being explored for real-world engineering applications. To address the inherent multiphysics nature of most real-world systems, PINNs should be extended to account for multiphysics in the system of interest [35]. However, incorporating multiphysics into a neural network increases the complexity, leading to convergence issues due to the more challenging training process. Additionally, having a comprehensive understanding of the target system and employing suitable strategies are crucial to ensuring optimal performance. This section presents two examples, where PINNs are applied to complex mechatronics systems, specifically involving a motor and a battery, by utilizing appropriate training strategies.
4.1 Application on an electric motor for virtual sensing and PHM
This subsection introduces the novel MPINN framework for virtual sensing and prognostics and health management (PHM) of electric motors [40]. The proposed framework is designed to accurately estimate electromagnetic responses in electric motors under various operational conditions. This MPINN utilizes novel strategies to manage complexities associated with rotational dynamics, improving both the accuracy and efficiency of network’s predictions. The MPINN specifically estimates various electromagnetic fields, including magnetic vector potentials (
The proposed MPINN architecture integrates three crucial features to overcome the challenges of conventional networks. First, the MPINN utilizes various PDEs to describe the electromagnetic properties of motors [60], supporting precise and reliable predictions even with limited data availability. These PDEs guide the neural network, providing supervision for accuracy. A notable innovation is the conversion of the motor’s rotor coordinates from rotational (based on angular speed) to Cartesian coordinates. This transformation overcomes the limitations of conventional PINNs, which have difficulty with rotational machinery due to their fixed coordinate systems. By aligning the rotor and stator coordinates into a unified Cartesian framework, the MPINN effectively resolves component representation mismatches and accurately depicts the electromagnetic fields influenced by rotor movement. Second, the architecture incorporates distinct networks for each domain and integrates their estimation results by considering the interface loss between the two separate domains [49]. This approach allows for more targeted training on specific components, significantly enhancing the accuracy of electromagnetic response estimations. Additionally, the interface loss function merges the outputs from these separate networks, reducing discrepancies and errors at the interface between the two domains. Third, the MPINN promotes convergence of the neural networks during training by implementing a learning rate annealing method. This method adjusts adaptive weights to balance the gradient stiffness across different loss types, thereby minimizing the overall loss [25]. This enhancement not only improves the accuracy and reliability of predictions but also makes the neural network more robust in managing the complexities associated with different loss functions.
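The idea of bringing rotor and stator coordinates into one Cartesian frame can be pictured as a time-dependent rotation. The sketch below is one plausible reading of that step, not the transformation used in [40]: it assumes a constant angular speed, a rotor axis at the origin, and hypothetical function names.

```python
import math

def rpm_to_rad_per_s(speed_rpm):
    """Convert a rotational speed from RPM to rad/s."""
    return speed_rpm * 2.0 * math.pi / 60.0

def rotor_to_stator(x_rotor, y_rotor, speed_rpm, t, initial_angle=0.0):
    """Rotate a rotor-frame point into the fixed Cartesian stator frame."""
    theta = initial_angle + rpm_to_rad_per_s(speed_rpm) * t
    x = x_rotor * math.cos(theta) - y_rotor * math.sin(theta)
    y = x_rotor * math.sin(theta) + y_rotor * math.cos(theta)
    return x, y

# At 60 RPM the rotor turns a quarter revolution in 0.25 s.
x_fixed, y_fixed = rotor_to_stator(1.0, 0.0, speed_rpm=60.0, t=0.25)
```

Feeding both rotor and stator networks with coordinates expressed in the same fixed frame means that interface points refer to the same physical location at every time step, which is what an interface loss between the two domains requires.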
The proposed MPINN was validated under five conditions, as detailed in Table 2. These conditions include four typical operating scenarios of a permanent magnet synchronous motor (PMSM) and one fault condition involving rotor axis eccentricity, which is the most common mechanical defect [61]. The MPINN was trained using data generated from a Finite Element Model (FEM) that covers the entire domain of the PMSM. The performance of the MPINN was verified by comparing its predicted torques with those measured on an experimental PMSM testbed. Additionally, comparisons with the FEM confirmed the accuracy of electromagnetic characteristics across the entire domain [62, 63].
Case | Description | Condition | Load | Speed [RPM] |
---|---|---|---|---|
1 | Starting, no load | Normal | - | 0 |
2 | Variable speed, no load | Normal | - | 60 |
3 | Constant speed, linearly increasing load | Normal | 0 to 0.4 Nm (linear increase) | 60 |
4 | Braking | Normal | Brake | 60 |
5 | Starting, no load, eccentricity | Eccentricity | - | 0 |
The accuracy of the MPINN for the electric motor was validated by comparing the experimental measurements with the values estimated by the FEM and MPINN. Only torque is compared because measuring electromagnetic fields directly in experiments is difficult, whereas torque can be measured readily. As illustrated in Figure 6, the torques predicted by the FEM (red line) and MPINN (blue line) are compared with those obtained from the experiments (black line) across the five operational scenarios. Both models accurately predicted the PMSM’s dynamic torque when the motor driver’s three-phase currents and the rotor rotational speed were controlled. The average RMSEs for the FEM and MPINN were 0.0105 Nm and 0.0166 Nm, respectively, demonstrating the accurate estimation of the PMSM’s electromagnetic responses by both models. Furthermore, these results highlight the potential of MPINN in effectively managing complex and nonlinear behaviors in data, showcasing its applicability in various scenarios.
The effectiveness of the proposed MPINN in predicting electromagnetic fields across the entire domain was demonstrated by comparing its results with those obtained from FEM. The prediction of electromagnetic fields from the FEM and MPINN is shown in Figure 7, along with the error of the proposed MPINN compared to the FEM under eccentricity fault conditions. Specifically, the RMSEs are 8.78e−5 for the magnetic vector potential, 4.86e−2 for the electric field, 5.64e−2 and 1.29e−1 for the magnetic flux density in the x and y directions, respectively, and 3.56e4 and 3.95e4 for the magnetic field in the x and y directions, respectively. These results show that the MPINN accurately estimates electromagnetic fields for the entire PMSM domain. Notably, errors in the electric field were observed across the entire domain due to a higher number of spatial training points compared to temporal ones, while errors in the magnetic flux density were confined to specific areas. This difference can be explained by the neural networks’ tendency to provide more accurate estimates where training data points are more abundant.
This study highlights the superior performance of MPINN, which integrates physical constraints, compared to a conventional ANN lacking such constraints. Figure 8 compares the RMSEs of the electromagnetic responses predicted by the MPINN and the conventional ANN against the values derived from FEM. These results indicate that the RMSEs for the MPINN are consistently lower than those for the ANN in the tested scenarios. Compared to the ANN, the MPINN achieves notable reductions in average RMSEs for the estimated values: 21.64% for the magnetic vector potential, 80.09% for the electric field, 70.91% and 39.44% for the magnetic flux densities in the x and y directions, respectively, and 58.39% and 53.83% for the magnetic fields in the x and y directions, respectively. These comparisons highlight the advantages of incorporating physical regularizations into the neural networks, enhancing the accuracy and robustness of estimation results compared to purely data-driven methods.
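The percentage reductions quoted above follow from two simple quantities, the RMSE of each model against the reference and the relative difference between the two RMSEs. The sketch below shows the arithmetic on made-up numbers. The arrays are illustrative only and are not data from the study.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

def relative_reduction(baseline_error, improved_error):
    """Percentage by which improved_error is lower than baseline_error."""
    return 100.0 * (baseline_error - improved_error) / baseline_error

# Illustrative values only (not results from the chapter).
reference = [0.0, 1.0, 2.0, 3.0]
constrained = [0.1, 0.9, 2.1, 2.9]     # stands in for a physics-informed model
unconstrained = [0.5, 0.5, 2.5, 2.5]   # stands in for a purely data-driven model

rmse_constrained = rmse(constrained, reference)
rmse_unconstrained = rmse(unconstrained, reference)
reduction = relative_reduction(rmse_unconstrained, rmse_constrained)
```

In this toy case the constrained model has an RMSE of 0.1 against 0.5 for the unconstrained one, which corresponds to an 80% reduction.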
Notably, as shown in Figure 8(b), the MPINN could accurately estimate electromagnetic responses even when operational conditions changed significantly over time. In contrast, the purely data-driven approach had difficulty capturing these variations. This highlights the effectiveness of incorporating physical principles, such as the temporal differentiation of the electric field and the magnetic vector potential via the PDE. The ANN, by comparison, frequently violates physical laws because the available data are limited relative to the network’s capacity. Additionally, the distorted magnetic fields from asymmetric rotation pose significant challenges for the ANN when estimating electromagnetic responses under eccentric conditions. Consequently, the ANN is constrained in its ability to account for all intricate phenomena occurring in the PMSM under these operational conditions. Conversely, the MPINN, guided by the governing equation that defines the relationship between inputs and outputs, produces robust estimates. These analyses indicate that incorporating prior knowledge is crucial for PINN estimation. With the same dataset size, MPINNs achieve higher accuracy and reliability than conventional data-driven methods.
This research highlights that the proposed MPINN is a powerful and reliable solution for analyzing the electromagnetic responses of PMSMs. While conventional FEM analyses are accurate, they require substantial computational resources, and physical experiments typically yield data only for partial domains. In contrast, MPINNs offer a cost-effective alternative with rapid and accurate electromagnetic characterization capabilities. The MPINN not only addresses the limitations of conventional numerical methods by offering a mesh-free solver approach but also proves more efficient and cost-effective for characterizing electromagnetic properties in both normal and fault conditions. This makes MPINNs particularly suitable for PHM applications in electric vehicles and cyber-physical systems in production facilities, where control-oriented objectives are crucial.
4.2 Application for predicting thermal runaway of a lithium-ion battery
The integration of PINN and DON, called multiphysics-informed deep operator network (MPI-DON or MPI-DeepONet) incorporating encoders, is proposed for predicting thermal runaway (TR) under a range of heating conditions, providing a rapid and accurate surrogate model for TR prediction. This neural network uses the heating curve, representing thermal abuse conditions, as input to output the chemical concentrations of the positive electrode, negative electrode, solid electrolyte interphase (SEI), and electrolyte along with the internal temperature response of the lithium-ion battery (LIB).
MPI-DON incorporating encoders has three main features that play a crucial role in attaining accurate final-stage TR predictions, organized into three phases (Figure 9(a)). First, phase 1 generates virtual data by employing FEA [64] at various heating conditions (Figure 9(a), red box). Specifically, virtual data is generated to reflect the mechanism of chemical decomposition of LIB components due to temperature-dependent chain reactions involving the positive electrode, negative electrode, SEI, and electrolyte. This mechanism is governed by the energy balance equation [65], the Arrhenius equation [66], and heat transfer equations. MPI-DON incorporating encoders is trained with virtual data because it is challenging and expensive to conduct experiments for all operating conditions. This strategy also helps MPI-DON incorporating encoders capture hidden physics, such as the four main chemical concentrations and the internal temperature response of the LIB, during the training phase (Figure 9(a), purple box). Therefore, the proposed neural network can function as a virtual sensor in practical use by acquiring this key information through FEA. Note that the virtual data generated under various heating conditions are separated into a training set and a test set, with the test set validating the capability of interpolation and extrapolation. Second, DON serves as a surrogate model for predicting TR in untrained conditions, utilizing the branch network and the trunk network for operator regression (Figure 9(b), blue dashed box). As explained in Subsection 3.3, this strategy uses the branch and trunk networks to map one function (i.e., the heating condition) to another function (i.e., the responses of the cell) to approximate the solution of a nonlinear system of equations, which outperforms traditional neural networks that have to be retrained for every unseen condition.
Furthermore, encoders [55] are added to the architecture of DON to resolve the issue of restricted information exchange, where the branch network and the trunk network interact only in the last layer. These encoders improve the ability of the DON to capture the complex response of the TR phenomenon by smoothing the flow of input information to each hidden layer. Third, a novel training strategy that supervises the multiple equations governing the TR of LIBs (Figure 9(b), green dashed box), combined with adaptive weight allocation based on the NTK method (Figure 9(b), pink dashed box), enhances the robustness and accuracy of the DON with encoders during the training phase. Thus, MPI-DON incorporating encoders can converge toward the global minimum because the governing equations regulate the training process. Note that the input and output values are normalized to dimensionless values, which addresses the mismatch between the neural network values and the physical quantities represented by the governing equations. After phase 2 (training) is complete, phase 3 predicts the hidden states of a LIB, encompassing the chemical concentrations and the internal temperature distribution at diverse heating conditions (Figure 9(a), green box).
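The adaptive weight allocation mentioned above can be illustrated with a small sketch. It assumes the trace-based rule from the NTK literature [50], in which each loss term is weighted by the total NTK trace divided by the trace of its own NTK block; whether the MPI-DON implementation uses exactly this rule is an assumption. The Jacobians and loss values below are hypothetical toy numbers, and the names `ntk_trace`, `ntk_weights`, and `total_loss` are introduced only for illustration.

```python
import numpy as np

def ntk_trace(jacobian):
    """Trace of K = J J^T, which equals the squared Frobenius norm of J.

    J holds the derivatives of one loss term's residuals (rows)
    with respect to the network parameters (columns).
    """
    return float(np.sum(np.asarray(jacobian, dtype=float) ** 2))

def ntk_weights(jacobians):
    """Weight each loss term by (sum of all traces) / (its own trace)."""
    traces = {name: ntk_trace(J) for name, J in jacobians.items()}
    total = sum(traces.values())
    return {name: total / tr for name, tr in traces.items()}

def total_loss(losses, weights):
    """Weighted sum of the individual loss terms."""
    return sum(weights[name] * losses[name] for name in losses)

# Hypothetical Jacobians for a data-fitting term and a physics term.
jacobians = {
    "data": np.array([[1.0, 0.0], [0.0, 1.0]]),     # trace 2
    "physics": np.array([[2.0, 0.0], [0.0, 2.0]]),  # trace 8
}
weights = ntk_weights(jacobians)
losses = {"data": 0.04, "physics": 0.10}  # hypothetical current loss values

print(weights)
print(total_loss(losses, weights))
```

In this toy case the data term has the smaller trace and therefore receives the larger weight, which reflects the intent of the rule: terms that would otherwise converge slowly are emphasized so that all terms are learned at a comparable rate.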
The effectiveness of physics constraints is demonstrated by comparing MPI-DON incorporating encoders, which is supervised by physics constraints, with DON, which relies solely on data during training. Figure 10 shows an example of an interpolation task from the test set, confirming that physics constraints improve accuracy. Specifically, MPI-DON, achieving an RMSE of 7.6°C in temperature predictions (Figure 10(a), red line), accurately predicts the temperature evolution over the entire time domain, even in regions without labeled data points. In contrast, DON, achieving an RMSE of 30.5°C in temperature predictions (Figure 10(a), blue line), shows large errors in the parts of the time domain without labeled data points, where TR occurs. Remarkably, DON shows little error at the trained data points labeled at 0, 100, 200, and 300 min (Figure 10(a), yellow dashed line), but significant error at the untrained points. MPI-DON predicts responses accurately across the entire time domain owing to the physics constraints. This result suggests that purely data-driven methods may be incapable of predicting the multiphysics response of LIBs because insufficient training data lead them to converge to a local minimum. Figure 10(b) compares the internal temperature distribution along the center cross-section line of the LIB at the peak point, illustrating results from multiphysics FEM (black dotted line with triangles), MPI-DON (red line), and DON (blue line). This result reveals the role of physics constraints in ensuring that predictions match the multiphysics characteristics of LIBs. Specifically, multiphysics FEM and MPI-DON both produce internal temperature distributions that are symmetric about the midpoint of the LIB, as expected from isotropic conduction, whereas DON produces an asymmetric temperature distribution, confirming the effectiveness of physics constraints.
In Figure 10(c) and (d), the chemical concentrations are compared among multiphysics FEM (black dotted lines with triangles and squares), MPI-DON (red and orange lines), and DON (blue and light blue lines). The inset figure shows that MPI-DON accurately predicts the sharp variations in chemical concentration during TR, in contrast to DON, which shows significant inaccuracies. This result confirms that MPI-DON accurately predicts the occurrence of TR from the heating condition, including the key chemical concentrations and the temperature.
The contribution of supervision from multiphysics is highlighted by comparing predictions from MPI-DON and DON for varying numbers of data collocation points. Data collocation intervals of 15, 20, 30, 50, 60, and 100 minutes correspond to 14,364, 10,944, 7,524, 4,788, 4,104, and 2,736 points, respectively, for training the neural networks. All other training conditions, including the physics, initial condition, and boundary condition collocation points, remain constant. The mean and 95% confidence interval of the RMSE for temperature prediction are determined on the test set over five random parameter initializations (Figure 11). The results show that MPI-DON consistently predicts TR phenomena more accurately than DON, regardless of the density of data collocation points, owing to supervision by multiphysics characteristics. Conversely, the RMSE of DON increases significantly as the data become sparse, indicating a high dependence on data. The confidence interval for MPI-DON is narrower, especially at the 100-minute interval, indicating better stability even with sparse data. This observation suggests that MPI-DON is robust in industrial applications, where data are often limited. While DON is cost-effective with ample data (intervals of less than 30 minutes), physics constraints are generally advantageous for real-world applications.
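The reported point counts follow a simple pattern that can be checked numerically. The sketch below rests on two assumptions that the text does not state explicitly: a 300-minute time span (suggested by the labeled points at 0 to 300 min in Figure 10(a)) and 684 labeled samples per time step (the common factor of the six reported counts). It also shows one conventional way to form a 95% confidence interval from five runs using the Student t critical value for four degrees of freedom; the chapter does not say which interval formula was used, and the function names are hypothetical.

```python
import math
import statistics

TIME_SPAN_MIN = 300          # assumption: inferred from the 0-300 min labels
SAMPLES_PER_TIME_STEP = 684  # assumption: common factor of the reported counts

# Interval (min) -> number of data collocation points reported in the text.
reported = {15: 14364, 20: 10944, 30: 7524, 50: 4788, 60: 4104, 100: 2736}

def n_data_points(interval_min):
    """Points = samples per time step * number of time steps (both ends included)."""
    n_time_steps = TIME_SPAN_MIN // interval_min + 1
    return SAMPLES_PER_TIME_STEP * n_time_steps

def mean_ci95(values, t_crit=2.776):
    """Mean and half-width of a 95% CI; t_crit = 2.776 is valid for 5 samples."""
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width

for interval, count in reported.items():
    print(interval, n_data_points(interval), n_data_points(interval) == count)
```

Under these assumptions all six counts are reproduced, so the 100-minute case corresponds to only four labeled time steps, which helps explain why the purely data-driven DON degrades there.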
MPI-DON incorporating encoders offers a significantly faster inference time than multiphysics FEM. Specifically, MPI-DON takes just 0.5 ms for a TR prediction, while multiphysics FEM takes about 7 seconds, making MPI-DON roughly 14,000 times faster (about four orders of magnitude). Once the 720-minute training phase is complete, MPI-DON provides near-instant predictions for every new heating case, whereas the prediction time of multiphysics FEM increases linearly with the number of heating cases. This efficiency makes MPI-DON more versatile and effective for testing LIB reliability under various conditions through Monte Carlo analysis, a task that would be time-consuming and costly with FEM. The speed of MPI-DON also suits real-time applications, such as TR management in thermal systems.
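The trade-off between the one-time training cost and the per-case inference cost can be made concrete with a short calculation. The sketch below takes the reported figures at face value (0.5 ms per MPI-DON prediction, about 7 s per FEM run, 720 min of training) and ignores the FEA time spent generating the virtual training data, which the text does not quantify; the variable and function names are hypothetical.

```python
import math

T_MPIDON_S = 0.5e-3    # reported inference time per heating case (s)
T_FEM_S = 7.0          # reported multiphysics FEM time per heating case (s)
T_TRAIN_S = 720 * 60   # reported one-time training cost (s)

# Per-case speedup of the surrogate over FEM (about 14,000).
speedup = T_FEM_S / T_MPIDON_S

def total_time_fem(n_cases):
    """FEM cost grows linearly with the number of heating cases."""
    return n_cases * T_FEM_S

def total_time_mpidon(n_cases):
    """Surrogate cost is the training time plus a tiny per-case cost."""
    return T_TRAIN_S + n_cases * T_MPIDON_S

# Smallest number of cases for which the surrogate is cheaper overall.
break_even_cases = math.ceil(T_TRAIN_S / (T_FEM_S - T_MPIDON_S))

print(round(speedup), break_even_cases)
```

With these numbers the surrogate pays off after roughly six thousand heating cases, a scale that is typical of Monte Carlo reliability studies and is consistent with the use case named in the text.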
In conclusion, this approach overcomes the limitations of traditional numerical methods and purely data-driven techniques, offering a solution that is both computationally efficient and highly accurate. The proposed neural network provides essential information for sophisticated battery thermal management systems, thereby improving the reliability and safety of LIBs. This advancement supports the development of digital twins for electric transport systems. Additionally, beyond the TR phenomenon, this approach can be applied to other systems, extending the impact of artificial intelligence transformation to other domains.
5. Conclusion
This chapter explored the potential of artificial intelligence, focusing on PINNs, which are adept at handling complex and nonlinear systems in real-world engineering. PINNs address challenges such as data scarcity and imbalance by integrating physics and fundamental principles during the training phase. By embedding governing equations as constraints within the loss function, PINNs ensure that predictions adhere to physical laws even when data are sparse. However, incorporating physical constraints into neural networks increases the training complexity, necessitating appropriate strategies for optimal performance. Such strategies become especially crucial when applying PINNs to multiphysics systems, in which multiple physical phenomena are coupled. Therefore, this chapter also introduced various strategies for applying PINNs to mechatronic systems such as an electric motor and a lithium-ion battery. The principles of PINNs summarized here, together with the framework introduced for applying PINNs to multiphysics systems, are expected to contribute to advancing technological innovation, ensuring system reliability, and shaping the future of engineering and applied sciences.
Acknowledgments
This work was supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4094, by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) [grant number 2020R1C1C1003829], and by the Korea Institute of Machinery and Materials [grant number NK250B].
Nomenclature
DNN | deep neural network
DON | deep operator network
FEM | finite element model
LIB | lithium-ion battery
MLP | multi-layer perceptron
MPI-DON | multiphysics-informed deep operator network
MSE | mean squared error
NTK | neural tangent kernel
PDE | partial differential equation
PHM | prognostics and health management
PINN | physics-informed neural network
RMSE | root mean squared error
SEI | solid electrolyte interphase
TR | thermal runaway
Symbols
magnetic flux density | |
bias of the | |
value of electrolyte | |
value of negative electrode | |
value of positive electrode | |
value of SEI | |
final output of the neural network | |
Fourier feature mapping matrix for the i-th embedding | |
operator output | |
output of the | |
NTK matrix | |
indicator function | |
total loss function | |
number of collocation points | |
final output | |
Fourier feature mapping of input | |
residual of governing equation | |
time span | |
loss term | |
weight for each loss function | |
weight of the |
Subscripts
initial value | |
boundary condition | |
data fitting | |
initial condition | |
physics | |
subdomain | |
branch network | |
trunk network |
Superscripts
predicted solution of neural network |
References
1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in Neural Information Processing Systems. 2017;30. DOI: 10.48550/arXiv.1706.03762
2. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018. DOI: 10.48550/arXiv.1810.04805
3. Lei S, Yi W, Ying C, Ruibin W. Review of attention mechanism in natural language processing. Data Analysis and Knowledge Discovery. 2020;4:1-14. DOI: 10.11925/infotech.2096-3467.2019.1317
4. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Communications of the ACM. 2012;60:84-90. DOI: 10.1145/3065386
5. Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:779-788
6. Yin Y, Li H, Fu W. Faster-YOLO: An accurate and faster object detection method. Digital Signal Processing. 2020;102:102756. DOI: 10.1016/j.dsp.2020.102756
7. Kim D, Kim S, Jeong S, Ham JW, Son S, Moon J, et al. Rotational multipyramid network with bounding-box transformation for object detection. International Journal of Intelligent Systems. 2021;36:5307-5338. DOI: 10.1002/int.22513
8. Moon J, Jeon M, Jeong S, Oh K-Y. RoMP-transformer: Rotational bounding box with multi-level feature pyramid transformer for object detection. Pattern Recognition. 2024;147:110067. DOI: 10.1016/j.patcog.2023.110067
9. Zhang S, Zhang S, Wang B, Habetler TG. Deep learning algorithms for bearing fault diagnostics—A comprehensive review. IEEE Access. 2020;8:29857-29881. DOI: 10.1109/ACCESS.2020.2972859
10. Çınar ZM, Abdussalam Nuhu A, Zeeshan Q, Korhan O, Asmael M, Safaei B. Machine learning in predictive maintenance towards sustainable smart manufacturing in industry 4.0. Sustainability. 2020;12:8211. DOI: 10.3390/su12198211
11. Jeong Y. Digitalization in production logistics: How AI, digital twins, and simulation are driving the shift from model-based to data-driven approaches. International Journal of Precision Engineering and Manufacturing-Smart Technology. 2023;1:187-200. DOI: 10.57062/ijpem-st.2023.0052
12. Kim M, Son S, Oh K-Y. Margin-maximized hyperspace for fault detection and prediction: A case study with an elevator door. IEEE Access. 2023;11:128580-128595. DOI: 10.1109/ACCESS.2023.3330137
13. Son S, Oh K-Y. Integrated framework for estimating remaining useful lifetime through a deep neural network. Applied Soft Computing. 2022;122:108879. DOI: 10.1016/j.asoc.2022.108879
14. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Networks. 1989;2:359-366. DOI: 10.1016/0893-6080(89)90020-8
15. Huang Z, Shen Y, Li J, Fey M, Brecher C. A survey on AI-driven digital twins in industry 4.0: Smart manufacturing and advanced robotics. Sensors. 2021;21:6340. DOI: 10.3390/s21196340
16. Wang J, Li Y, Gao RX, Zhang F. Hybrid physics-based and data-driven models for smart manufacturing: Modelling, simulation, and explainability. Journal of Manufacturing Systems. 2022;63:381-391. DOI: 10.1016/j.jmsy.2022.04.004
17. Rai R, Sahu CK. Driven by data or derived through physics? A review of hybrid physics guided machine learning techniques with cyber-physical system (cps) focus. IEEE Access. 2020;8:71050-71073. DOI: 10.1109/ACCESS.2020.2987324
18. Chen H, Lou S, Lv C. Hybrid physics-data-driven online modelling: Framework, methodology and application to electric vehicles. Mechanical Systems and Signal Processing. 2023;185:109791. DOI: 10.1016/j.ymssp.2022.109791
19. Son S, Jeong S, Kwak E, Kim J-H, Oh K-Y. Integrated framework for SOH estimation of lithium-ion batteries using multiphysics features. Energy. 2022;238:121712. DOI: 10.1016/j.energy.2021.121712
20. Kumar K, Choi Y. Accelerating particle and fluid simulations with differentiable graph networks for solving forward and inverse problems. Proceedings of the SC'23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis. 2023. pp. 60-65
21. Raissi M, Perdikaris P, Karniadakis GE. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics. 2019;378:686-707. DOI: 10.1016/j.jcp.2018.10.045
22. Tartakovsky AM, Marrero CO, Perdikaris P, Tartakovsky GD, Barajas-Solano D. Physics-informed deep neural networks for learning parameters and constitutive relationships in subsurface flow problems. Water Resources Research. 2020;56:e2019WR026731. DOI: 10.1029/2019WR026731
23. Cao Y, Fang Z, Wu Y, Zhou D-X, Gu Q. Towards Understanding the Spectral Bias of Deep Learning. arXiv preprint arXiv:1912.01198. 2019. DOI: 10.48550/arXiv.1912.01198
24. Wang S, Sankaran S, Perdikaris P. Respecting Causality is All You Need for Training Physics-Informed Neural Networks. arXiv preprint arXiv:2203.07404. 2022. DOI: 10.48550/arXiv.2203.07404
25. Bischof R, Kraus M. Multi-Objective Loss Balancing for Physics-Informed Deep Learning. arXiv preprint arXiv:2110.09813. 2021. DOI: 10.48550/arXiv.2110.09813
26. McClenny LD, Braga-Neto UM. Self-adaptive physics-informed neural networks. Journal of Computational Physics. 2023;474:111722. DOI: 10.1016/j.jcp.2022.111722
27. Krishnapriyan A, Gholami A, Zhe S, Kirby R, Mahoney MW. Characterizing possible failure modes in physics-informed neural networks. Advances in Neural Information Processing Systems. 2021;34:26548-26560. DOI: 10.48550/arXiv.2109.01050
28. Wang S, Teng Y, Perdikaris P. Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing. 2021;43:A3055-A3081. DOI: 10.1137/20M1318043
29. Jagtap AD, Kharazmi E, Karniadakis GE. Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Computer Methods in Applied Mechanics and Engineering. 2020;365:113028. DOI: 10.1016/j.cma.2020.113028
30. Kharazmi E, Zhang Z, Karniadakis GE. hp-VPINNs: Variational physics-informed neural networks with domain decomposition. Computer Methods in Applied Mechanics and Engineering. 2021;374:113547. DOI: 10.1016/j.cma.2020.113547
31. Lu L, Jin P, Pang G, Zhang Z, Karniadakis GE. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence. 2021;3:218-229. DOI: 10.1038/s42256-021-00302-5
32. Li Z, Kovachki N, Azizzadenesheli K, Liu B, Bhattacharya K, Stuart A, et al. Fourier Neural Operator for Parametric Partial Differential Equations. arXiv preprint arXiv:2010.08895. 2020. DOI: 10.48550/arXiv.2010.08895
33. Tancik M, Srinivasan P, Mildenhall B, Fridovich-Keil S, Raghavan N, Singhal U, et al. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in Neural Information Processing Systems. 2020;33:7537-7547. DOI: 10.48550/arXiv.2006.10739
34. Basri R, Galun M, Geifman A, Jacobs D, Kasten Y, Kritchman S. Frequency bias in neural networks for input of non-uniform density. International Conference on Machine Learning. PMLR; 2020. pp. 685-694
35. Niaki SA, Haghighat E, Campbell T, Poursartip A, Vaziri R. Physics-informed neural network for modelling the thermochemical curing process of composite-tool systems during manufacture. Computer Methods in Applied Mechanics and Engineering. 2021;384:113959. DOI: 10.1016/j.cma.2021.113959
36. Cai S, Wang Z, Wang S, Perdikaris P, Karniadakis GE. Physics-informed neural networks for heat transfer problems. Journal of Heat Transfer. 2021;143:060801. DOI: 10.1115/1.4050542
37. Jeong J, Kwak E, Kim J-H, Oh K-Y. Prediction of thermal runaway for a lithium-ion battery through multiphysics-informed DeepONet with virtual data. eTransportation. 2024;21:100337. DOI: 10.1016/j.etran.2024.100337
38. Mao Z, Jagtap AD, Karniadakis GE. Physics-informed neural networks for high-speed flows. Computer Methods in Applied Mechanics and Engineering. 2020;360:112789. DOI: 10.1016/j.cma.2019.112789
39. Cai S, Mao Z, Wang Z, Yin M, Karniadakis GE. Physics-informed neural networks (PINNs) for fluid mechanics: A review. Acta Mechanica Sinica. 2021;37:1727-1738. DOI: 10.1007/s10409-021-01148-1
40. Son S, Lee H, Jeong D, Oh K-Y, Sun KH. A novel physics-informed neural network for modeling electromagnetism of a permanent magnet synchronous motor. Advanced Engineering Informatics. 2023;57:102035. DOI: 10.1016/j.aei.2023.102035
41. Kollmannsberger S, D’Angella D, Jokeit M, Herrmann L. Deep Learning in Computational Mechanics. Springer International Publishing; 2021. DOI: 10.1007/978-3-030-76587-3
42. Diao Y, Yang J, Zhang Y, Zhang D, Du Y. Solving multi-material problems in solid mechanics using physics-informed neural networks based on domain decomposition technology. Computer Methods in Applied Mechanics and Engineering. 2023;413:116120. DOI: 10.1016/j.cma.2023.116120
43. Faroughi SA, Pawar NM, Fernandes C, Raissi M, Das S, Kalantari NK, et al. Physics-guided, physics-informed, and physics-encoded neural networks and operators in scientific computing: Fluid and solid mechanics. Journal of Computing and Information Science in Engineering. 2024;24:040802. DOI: 10.1115/1.4064449
44. Zhang E, Dao M, Karniadakis GE, Suresh S. Analyses of internal structures and defects in materials using physics-informed neural networks. Science Advances. 2022;8:eabk0644. DOI: 10.1126/sciadv.abk0644
45. Lou Q, Meng X, Karniadakis GE. Physics-informed neural networks for solving forward and inverse flow problems via the Boltzmann-BGK formulation. Journal of Computational Physics. 2021;447:110676. DOI: 10.1016/j.jcp.2021.110676
46. Jagtap AD, Karniadakis GE. Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Communications in Computational Physics. 2020;28:5. DOI: 10.4208/cicp.oa-2020-0164
47. Raissi M. Forward–backward stochastic neural networks: Deep learning of high-dimensional partial differential equations. Peter Carr Gedenkschrift: Research Advances in Mathematical Finance. World Scientific; 2024. pp. 637-655
48. Sirignano J, Spiliopoulos K. DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics. 2018;375:1339-1364. DOI: 10.1016/j.jcp.2018.08.029
49. Shukla K, Jagtap AD, Karniadakis GE. Parallel physics-informed neural networks via domain decomposition. Journal of Computational Physics. 2021;447:110683. DOI: 10.1016/j.jcp.2021.110683
50. Wang S, Yu X, Perdikaris P. When and why PINNs fail to train: A neural tangent kernel perspective. Journal of Computational Physics. 2022;449:110768. DOI: 10.1016/j.jcp.2021.110768
51. Guan H, Dong J, Lee W-N. Towards Real-time Training of Physics-informed Neural Networks: Applications in Ultrafast Ultrasound Blood Flow Imaging. arXiv preprint arXiv:2309.04755. 2023. DOI: 10.48550/arXiv.2309.04755
52. He J, Koric S, Kushwaha S, Park J, Abueidda D, Jasiuk I. Novel DeepONet architecture to predict stresses in elastoplastic structures with variable complex geometries and loads. Computer Methods in Applied Mechanics and Engineering. 2023;415:116277. DOI: 10.1016/j.cma.2023.116277
53. Koric S, Abueidda DW. Data-driven and physics-informed deep learning operators for solution of heat conduction equation with parametric heat source. International Journal of Heat and Mass Transfer. 2023;203:123809. DOI: 10.1016/j.ijheatmasstransfer.2022.123809
54. Goswami S, Yin M, Yu Y, Karniadakis GE. A physics-informed variational DeepONet for predicting crack path in quasi-brittle materials. Computer Methods in Applied Mechanics and Engineering. 2022;391:114587. DOI: 10.1016/j.cma.2022.114587
55. Wang S, Wang H, Perdikaris P. Improved architectures and training algorithms for deep operator networks. Journal of Scientific Computing. 2022;92:35. DOI: 10.1007/s10915-022-01881-0
56. He J, Kushwaha S, Park J, Koric S, Abueidda D, Jasiuk I. Sequential deep operator networks (S-DeepONet) for predicting full-field solutions under time-dependent loads. Engineering Applications of Artificial Intelligence. 2024;127:107258. DOI: 10.1016/j.engappai.2023.107258
57. Rahaman N, Baratin A, Arpit D, Draxler F, Lin M, Hamprecht F, et al. On the spectral bias of neural networks. International Conference on Machine Learning. PMLR; 2019. pp. 5301-5310
58. Wang S, Wang H, Perdikaris P. On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering. 2021;384:113938. DOI: 10.1016/j.cma.2021.113938
59. Hesthaven JS, Gottlieb S, Gottlieb D. Spectral Methods for Time-Dependent Problems. Cambridge University Press; 2007;21. DOI: 10.1017/CBO9780511618352
60. Jabbar M, Liu Z, Dong J. Time-stepping finite-element analysis for the dynamic performance of a permanent magnet synchronous motor. IEEE Transactions on Magnetics. 2003;39:2621-2623. DOI: 10.1109/TMAG.2003.816500
61. Gherabi Z, Toumi D, Benouzza N, Boudinar AH, Koura MB. Discrimination between demagnetization and eccentricity faults in PMSMs using real and imaginary components of stator current spectral analysis. Journal of Power Electronics. 2021;21:153-163. DOI: 10.1007/s43236-020-00169-6
62. Lee H, Son S, Jeong D, Sun KH, Jeon BC, Oh K-Y. High-fidelity multiphysics model of a permanent magnet synchronous motor for fault data generation. Journal of Sound and Vibration. 2024;589:118573. DOI: 10.1016/j.jsv.2024.118573
63. Lee H, Son S, Jeong D, Sun KH, Jeon BC, Oh K-Y. A Finite Element Model of an Electric Motor with an Unbalanced Rotor for Vibration Data Generation. International Journal of Precision Engineering and Manufacturing-Smart Technology. 2024;2(1):47-56. DOI: 10.57062/ijpem-st.2023.0122
64. Kwak E, Kim J-H, Jeong J, Oh K-Y. Multiphysics-informed thermal runaway model for estimating the pressure evolution induced by the gas formation in a lithium-ion battery. Applied Thermal Engineering. 2024;246:122941. DOI: 10.1016/j.applthermaleng.2024.122941
65. Guo G, Long B, Cheng B, Zhou S, Xu P, Cao B. Three-dimensional thermal finite element modeling of lithium-ion battery in thermal abuse application. Journal of Power Sources. 2010;195:2393-2398. DOI: 10.1016/j.jpowsour.2009.10.090
66. Liu B, Jia Y, Yuan C, Wang L, Gao X, Yin S, et al. Safety issues and mechanisms of lithium-ion battery cell upon mechanical abusive loading: A review. Energy Storage Materials. 2020;24:85-112. DOI: 10.1016/j.ensm.2019.06.036