Open access peer-reviewed chapter

Integrated Multiscale Modeling-Simulation (MMS) and Machine Learning (ML)-Based Design and Development of Novel Technologies, Systems, and Processes

Written By

Seçkin Karagöz

Submitted: 02 June 2023 Reviewed: 03 June 2023 Published: 07 August 2023

DOI: 10.5772/intechopen.1002381

From the Edited Volume

Simulation Modeling - Recent Advances, New Perspectives, and Applications

Abdo Abou Jaoudé

Chapter metrics overview

77 Chapter Downloads

View Full Metrics

Abstract

The development of novel technologies, systems, and processes is conventionally complemented by experimental testing. However, experimental tools for testing and examining the results are expensive, and their use is time-consuming. In this context, to accelerate the development, commercialization, utilization, and problem solutions of novel technologies, systems, and processes, the simultaneous use of computational and experimental tools such as hierarchical integrated machine learning (ML)-assisted multi-scale modeling-simulation (MMS) and experimental approaches is essential. These approaches greatly improve the entire technology development process by reducing cost and time and allow us to tackle problems that cannot be solved using theoretical or experimental methods alone. In this chapter, we describe ways in which integrated multiscale modeling-simulation and machine learning have been leveraged to facilitate the design and development of novel technologies, systems, and processes. We first provide a taxonomy of multiscale modeling-simulation and machine learning paradigms and techniques, along with a discussion of their strengths and limitations. We then provide an overview of opportunities and existing research using multiscale modeling-simulation and machine learning for the design and development of novel technologies, systems, and processes. Finally, we propose future research directions and discuss important considerations for deployment.

Keywords

  • machine learning
  • multiscale modeling
  • multiscale simulation
  • integrated MMS and ML
  • integrated MMS and ML-based design

1. Introduction

Traditionally, the development of novel technologies, systems, and processes is complemented by testing. Experimental tools for testing and examining the results are expensive, and their use is time-consuming. In recent years, thanks to the rapid growth in computational speed, it has been seen that the use of computer-aided methods in the systems’ design phase contributes greatly to the reduction of cost and time for the entire technology development process. Moreover, the simultaneous use of computational and experimental tools allows us to tackle problems that cannot be solved using theoretical or experimental methods alone. Experimental methods can improve the computational model by incorporating new data, while computational tools can use this advanced model to analyze structure, properties, and optimization by examining a wide set of possible configurations.

To accelerate the development, commercialization, utilization, and problem solutions of novel technologies, systems, and processes, hierarchical integrated theoretical approaches such as machine learning (ML)-assisted multi-scale modeling-simulation (MMS), optimization, techno-economic analysis, and experimental approaches (Figure 1) are essential.

Figure 1.

Hierarchical integrated theoretical and experimental approaches to the technology-system-process development and problem solutions.

Multiscale modeling-simulation is a rapidly growing subject of study that encompasses a variety of disciplines, including physics, biology, chemistry, mathematics, statistics, chemical engineering, mechanical engineering, and materials science [1]. The most typical objective of multi-scale modeling and simulation is to anticipate the macroscopic behavior of an engineering system process from fundamental principles by computing information at a smaller scale and passing it to a model at a larger scale by leaving out degrees of freedom as one moves from smaller to larger scales (upscaling or bottom-up approach) [2, 3]. Machine learning (ML) is becoming widely acknowledged as a promising technology in a wide range of scientific applications. ML is a potent method for integrating multimodality and multi-fidelity data and revealing correlations between interconnected phenomena presents a unique opportunity in this regard [4].

Integrated multiscale modeling-simulation (MMS) and machine learning (ML) approach allows many systems and processes to be accurately simulated and many system and process parameters to be calculated without the need for experiments before performing expensive and time-consuming experiments and tests. This strategy significantly improves the entire technological development process by saving costs and time and also assists us tackle problems that cannot be solved using traditional theoretical or experimental approaches alone. Thus, in this chapter, we describe ways in which integrated multiscale modeling-simulation and machine learning have been leveraged to facilitate the design and development of novel technologies, systems, and processes. We begin with a taxonomy of multiscale modeling-simulation and machine learning paradigms and methodologies, followed by a discussion of their strengths and limitations. Following that, we present an overview of prospects and existing research employing multiscale modeling-simulation and machine learning for the design and development of novel technologies, systems, and processes. Finally, we offer future research areas and highlight critical deployment factors.

Advertisement

2. Modeling

The development of models is driven by the necessity to comprehend and anticipate the functioning and performance of particular component(s) of the system/process or the overall system/process behavior. The study of physical processes on models is known as simulation. In the most basic form, a model reproduces the studied system (SS=Original) while retaining its physical nature and geometric similarity to the original and differing only in parameters and variables. Depending on the objective of modeling, the modeling process takes into account a number of variables while rejecting others [5]. A model can be described as the following:

MSSSOGILSSSOGTCL

where SS is the studied system (Original) and the source of information; SO is the subject/observer who requires information on the SS for study or decision-making purposes; G is(are) goal(s); I is the infrastructure employed in the process of modeling and spans a set of technologies such as T (methods, means, algorithms etc.) and C (conditions of modeling: external and internal factors affect model development and functioning) and L is the language for describing gnoseological features of model-original relationships.

To create a model that is “fit for purpose,” a wide range of tasks, knowledge, and skills need to be practiced. The steps in mathematical modeling are as follows [6, 7]: (1) Definitions: Problem(s) and models (model end-uses, model types, etc.) are identified. (2) Model conceptualization: The process of generating an initial notion through brainstorming with multiple stakeholders. It is crucial that the stakeholders comprise those who are intimately knowledgeable with the process or product, the modeling team, and an experienced modeler with the capacity to determine the veracity or relevance of stakeholder perspectives. (3) Model data requirements: In this stage of model development, several data concerns must be recognized for use in model development, model validation, and model deployment/use/reuse. (4) Model construction: Model construction is the process of expressing an abstracted description of a system so as to generate a simulation or model of its actual behavior. (5) Model simulation: Solution of the model equations. Modeling and simulation tools are used routinely to solve the set of model equations. Integrated conceptualization, modeling, and simulation systems are used by many modeling tools. (6) Model verification: Model verification jobs are distinct from model validation tasks in that they are fundamentally debug activities. Model verification is the process of comparing model implementation or equation code to the conceptual description or original model equations. It must be established whether the model accurately represents the conception. (7) Model validation: Model validation is a distinct activity from model verification, yet it is inextricably tied to it. A model that fails the verification test may have major validation issues. The following inquiry is referred to as validation: Is the model a reasonable representation of the real system? A variety of factors may play a role in answering this question. 
(8) Model deployment and maintenance: The process of acquiring the model for the application is only the beginning of the model’s existence. The processes originally used to build the model continue to play an ongoing role in the model’s long-term application. Several model classification approaches can be proposed for various purposes and applications.

2.1 Mathematical modeling approach classification

As depicted in Figure 2, mathematical modeling can be divided into three categories: mechanistic-white box, empirical-black box, and hybrid-gray box, and these, in turn, have subcategories [7, 8, 9, 10].

Figure 2.

Mathematical modeling approach classification.

Mechanistic-white box models provide some level of comprehension or explanation of the represented phenomena. A well-constructed mechanistic model is clear and extensible, essentially without limits. Mechanistic models are built on theories regarding how the system functions, the key components, and their interrelationships. These models make it possible to understand the input and output variables, as well as the factors involved in the modeling procedure. Mechanistic models tend to be more focused on research than on practical applications, though this is changing as our mechanistic models get more trustworthy. The evaluation of such models is crucial, albeit frequently and unavoidably subjective. Conventional mechanistic models are intricate and unfriendly.

Empirical-black box models describe a system’s behaviors using mathematical or statistical equations without scientific content, constraints, or principles. This type may be the ideal form of model to develop depending on your goals. Its construction is based exclusively on experimental data and does not explain dynamic mechanisms, therefore its process is unknown. It is difficult to estimate an unknown function from observations of its values. The general recommendation in this regard is to estimate models of varying complexity and evaluate them using validation data. A regularized fit criterion is an effective approach to limit the flexibility of certain types of models. Finding a suitably flexible parameterization model is a critical challenge. Researchers typically use algorithms to predict physiological characteristics, such as support vector machines (SVM), back-propagation neural network (BPNN), artificial neural network (ANN), deep neural network (DNN), and the combination of wide and deep neural network (WDNN).

Hybrid-gray box models are intermediate models, defined as semi-empirical or semi-mechanistic. These models are also known as gray box or hybrid models since they combine empirical and mechanistic models. Both mechanical and empirical models can be deterministic or stochastic. Determinists make specific quantitative forecasts without regard for the accompanying probability distribution. This is appropriate in many circumstances; however, it may not be adequate for highly variable quantities or processes. Stochastic models, on the other hand, include a random element as part of the model so that the predictions have a distribution. Stochastic models are difficult to create, test, and falsify.

Both the deterministic and stochastic models, on the other hand, can take either a continuous or discrete form. It is referred to as a time-continuous model when a mathematical model is used to represent the relationship between continuous signals in time. Differential equations are a common tool for describing relationships of this kind. A discrete model is a type of model that establishes a direct relationship between the values of the signals at the times when they are sampled. Differential equations are the standard way to describe a model of this kind. Continuous models are categorized as dynamic and static. Dynamic models anticipate how quantities change over time; hence, a dynamic model is typically expressed as a set of ordinary differential equations with the independent variable time (t). On the other hand, static models do not produce predictions that are time-dependent. Finally, dynamic models can be categorized as grouped or distributed.

2.2 Discipline-domain approach classification of models

Two distinct academic disciplines study systems: process systems engineering (PSE) and energy economics (EE). Different factors such as theoretical foundations, technology aggregation, spatial and temporal scales, and model purposes all contribute to the categorization in the Discipline-Domain Approach of Models (seen in Figure 3). PSE typically simulates systems at the unit operation, processing plant, or supply chain scale. Each of these scales reflects a level of technology aggregation: At the plant size, different unit operations are aggregated to provide an overall conversion process, and the conversion process, together with the feedstock supply and product distribution network, are aggregated at the supply chain scale. The objective of modeling systems in PSE is to gain insight into their technological performance in order to facilitate optimal decision-making at the design, operations, and control levels. Consequently, the technological properties of the system’s components are modeled endogenously (e.g., temperature, pressure, enthalpy, Gibbs free energy, process size). Depending on the optimization goal, economic, environmental, or social aspects (e.g., raw material prices, equipment prices, product demand, interest rates, global warming potential, resource depletion) can be modeled exogenously. At low levels of technology aggregation, modeling, and simulation have traditionally been utilized in PSE for strategic, tactical, and operational decision-making [8, 11, 12].

Figure 3.

Discipline-domain approach to the classification of models.

In contrast, energy economics approaches employ models with a high level of technology aggregation: All technologies including the entire energy sector on a regional, national, or worldwide scale may be analyzed (e.g., using bottom-up models), as well as other economic sectors including manufacturing, mining, and construction, etc. (e.g., in top-down models). Models of EE are grounded in economic principles including supply and demand, as well as market equilibrium. Thus, technological, environmental, or social elements may be treated exogenously, while the economic characteristics of the system’s components are modeled endogenously. Modeling systems in EE have traditionally served to support policymaking at the regional, national, or global scales. Capital intensiveness, long gestation periods, and long payback periods all encourage long-term thinking, and addressing sustainability issues requires the planning of energy transformation pathways that may take decades to mature. These factors all contribute to the common use of long-time scales in EE models. However, the advent of intermittent-supply renewable technologies, as well as energy market deregulation, has resulted in the recent use of EE models for operational decision-making. The PSE models include supply chain component spatial distribution, while the EE model simply abstracts economic features. The PSE models just cover the supply chain, but the EE model often incorporates all national energy sources. Energy supply corporations and enterprises utilize PSE supply chain models for strategic, tactical, and operational decision-making, whereas national planners and public policy authorities use EE models [8, 13, 14].

2.3 Multi-scale approach classification of models

Multi-scale modeling and simulation is a rapidly growing subject of study that encompasses a variety of disciplines, including physics, biology, chemistry, mathematics, statistics, chemical engineering, mechanical engineering, and materials science. Multi-scale simulation allows for the connection of phenomena at multiple scales, including the quantum, molecular, mesoscopic, device, and plant scales. The primary challenge of multi-scale modeling is to integrate information from multiple simulation models in a consistent manner so that the behavior of the entire system can be predicted from its constituent models. Recent modeling advancements have made it possible to predict behavior extending from the quantum- and nano-scales to the macroscopic and continuum scales. Given the ability to predict the behavior of systems (such as materials) at multiple scales, the natural next step is to utilize these models to design systems at several scales, a process known as multiscale simulation-based design (MSBD). MSBD’s primary challenge, in contrast to multi-scale modeling, is to effectively and efficiently utilize information generated by a diverse set of models that predict system behavior at multiple scales. System-level design is carried out concurrently with parts, sub-assemblies, and the corresponding material [15, 16].

Systems are designed using micro- to nano-scale analytical, experimental, and computational approaches to assess performance. Molecular-based approach to product and process engineering has been made possible by the significant rise in computer speed over the previous decades. Molecular and quantum modeling/simulations, such as molecular dynamics (MD), quantum Monte Carlo (QMC), and quantum mechanical (QM-ab initio and density functional theory (DFT) computations, have emerged as premier computing tools for science and engineering research. These calculations provide information to mesoscale continuum simulation system(s). In order to analyze the behavior of a system(s), bridges are constructed between models at multiple lengths and time scales in close conjunction with companion experiments. Properties at micro-scale surfaces depend on nanoscale interfaces. When building systems on different levels, this hierarchy must be taken into account. This increases the design’s coupling, thereby increasing the problem’s complexity. Although design complexity is a challenge in multiscale design, the benefit of designing systems at multiple scales is increased design freedom (i.e., greater flexibility in configuring the system to accomplish the desired behavior), which enables designers to achieve improved performance.

Figure 4 depicts the multiscale systems engineering concept, which focuses on the analysis and evaluation of systems across various scales of time, length, or both. Multi-scale systems analysis and evaluation necessitate the sequential or simultaneous use of key engineering components (modeling, design, synthesis, simulation, and optimization) across different time and length scales. This theme is made clear through (a) the application of coarse-grained models at higher levels, such as the plant or supply chain level, that capture phenomena occurring at lower molecular scales, (b) the incorporation of results obtained from smaller scales that are used as information in larger spatial and length scales, or (c) the construction of a holistic model that incorporates data from multiple temporal and spatial dimensions.

Figure 4.

Machine learning-assisted multiscale modeling across characteristic length vs time scale and level of technological aggregation vs decision making.

There has been a significant rise in the number of articles published about multi-scale research during the past two decades. Figure 5 shows the trend over time for the percentage of articles in Web of Science with the terms “multi-scale” or “multiscale” in the title. Based on the available data, a 100%” increase in total publication is projected by 2030.

Figure 5.

The existing (black) and projected (gray) trend for research articles featuring the words multi-scale or multiscale in the title of the published literature.

2.3.1 Challenges

Successful modeling of multiscale systems must overcome the following challenges [17, 18]:

  1. Accuracy-computational cost balance: One of the most difficult challenges in multiscale systems is balancing the need for accuracy with the computational expense. In general, employing a smaller-scale model to anticipate system performance provides a more accurate depiction of the system. Running these models at smaller scales in a broad enough domain to capture greater impacts, on the other hand, is computationally expensive.

  2. Appropriate system components to accurately model component interactions: Multiple components are frequently present in multiscale systems. All of these system components interact with one another. By reducing the number of modeled system components and interactions, the number of required calculations decreases, but the accuracy must also be improved. Therefore, modelers at each scale must take into account the appropriate system components in order to accurately simulate component interactions.

  3. Simulating necessary physical phenomena pertinent to the system: Existing multiscale systems are predominantly multi-physical in nature. Different physical principles and mathematical equations govern these physical phenomena. These phenomena may be dependent on or interdependent with one another. The influence of diverse phenomena on the precision of the prediction of the overall system behavior varies. Consequently, in order to obtain a reasonable understanding of the system, it is necessary to model phenomena that are interrelated.

  4. Modeling how scales interact with each other and connecting them so that they can be compatible physically: Multiscale modeling requires distinct physical principles, mathematical equations, and parameters to simulate phenomena at different length and time scales. Assumptions vary for each level. These models provide distinct insights into system behavior and must be merged to produce consistent system behavior. Model integration involves consistent mathematical and physical connectivity between scales. Because scales depend on each other, it’s crucial to understand their relationships. Therefore, the success of a multiscale model depends on accurately simulating the relationships among the various scales, components, and physical processes involved.

  5. Selection of appropriate models and model parameters at each scale: Modifying the model’s scope and assumptions allows for the creation of simulation models with varying degrees of authenticity. A material system, for instance, could be modeled in one, two, or three dimensions. Different models may be appropriate for forecasting the behavior of the system under investigation. It is critical to choose the appropriate collection of models and assumptions. However, for multiscale systems, the condition is crucial since it must be examined at several scales, with information from one model feeding into another. The applicability of models is also determined by the compatibility of assumptions made in various models. Appropriate model selection has a significant impact on model correctness and execution time.

  6. Managing large amounts of data at multiple abstraction levels: Each finer scale necessitates a more sophisticated theory and a deeper understanding of the system. Integrating information at various scales necessitates the use of software infrastructure as well as mathematical issues. Multiscale modeling entails issues such as synchronization of information generated by models at various scales, long run times, load balancing, capturing information at various levels of abstraction in a consistent database, and integrating distributed computational models and hardware resources.

  7. Uncertainty quantification and management, as well as its dissemination: Uncertainty quantification is essential for efficiently utilizing simulation model data. It is also critical to have information on the degree to which models may be trusted. Uncertainty is extremely prominent in multiscale models due to interactions between events at various scales, and quantifying this uncertainty in the models is challenging. Multiscale simulation models communicate uncertainty and information between models. The system-level simulation model may not be acceptable even though individual model uncertainty bounds are acceptable.

  8. Targeted refinement of models: The accuracy of the overall multiscale simulation model is determined by the quality of the comprising models at different scales and how uncertainty is magnified owing to information flow from one model to another. As a result, in order to increase the entire model’s accuracy, it is critical to identify the model with the greatest impact on total uncertainty and then refine that model in a focused manner.

  9. Adaptive selection of details and resolution: Although uncertainty is an important part of multiscale modeling and should be managed, many multiscale models can be greatly reduced, lowering model execution time while maintaining accuracy. The goal of multiscale modeling is to take advantage of such scenarios and to choose appropriate levels of detail in the models.

2.3.2 Approaches

Parameterization and concurrent coupling are some of the approaches currently used for multiscale modeling to address some of these challenges.

Parameterization is a method of capturing information from lower-level models in the form of a set of parameters and their values. The parameters can be empirical or semi-empirical, and they can be used to estimate average physics behavior on a smaller scale. At some level, all models contain parameterization. No model can completely simulate a phenomenon using the first principles. The advantage of parameterization is that it is simple to account for occurrences at lesser scales; the downside is that it is inaccurate.

Coupling is the method of “on-the-fly” utilizing a model at one scale while performing calculations at another one. Because of this, the total simulation needs to make dynamic use of a hierarchy of increasingly detailed models at decreasing scales. As a direct result of this requirement, high-performance computer options are required. As a result, the availability of suitable computational tools is the primary determinant of the capability to match models across a wide range of sizes. Improved overall model accuracy comes at the tradeoff of increased complexity and computing cost when coupling multiscale models together.

There is a practical limit to the complexity of problems that can be addressed, even by the most powerful computers now in existence. Therefore, it is important to fulfill a balance between accuracy and computing cost when integrating multiscale models through the use of an optimal combination of parameterization and coupling.

Advertisement

3. Integration of multi-scale modeling with machine learning

3.1 Machine learning

Machine learning (ML) is becoming widely acknowledged as a promising technology in a wide range of scientific applications. The idea of machine learning (ML) refers to a type of data-driven programming that can automatically learn programs by observing existing ones. Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on developing methods to simulate human intelligence in machines by simulating perception, speech, movement, and logical reasoning. ML has a close relationship with statistics, while statistics are more concerned with finding the truth in the underlying data, ML is more concerned with task performance. ML is also closely related to optimization and control theory [4].

Although there are numerous types of ML techniques, the majority of ML algorithms are founded on only three elements: (1) A model or hypothesis class that defines the ML algorithm’s functionalities. This can be thought of informally as the skeleton of the program that the algorithm generates. Frequently, these models have unrestricted parameters that can be tailored to the task at hand. (2) An objective function or loss function that outlines the behavior that the model should be aiming for and (3) A method for training or optimizing a model that defines how its parameters should be selected or modified to maximize performance on a given target.

Supervised-unsupervised-reinforcement learning, online learning, and transfer learning are the notable ML paradigms applicable to various contexts in which ML is utilized. Supervised learning describes methods that use a dataset of input/output pairs to learn a function. Supervised learning excels at image classification, automated speech recognition, and machine translation. In many circumstances, obtaining the labeled data is too expensive to employ supervised learning, or the system of interest has a decision-making process that cannot be well described by single input/output pairs. Unsupervised learning refers to techniques that seek structure within a set of inputs. Unsupervised methods are useful for analyzing and partitioning data, but the algorithm designer usually chooses key attributes, so the learned outputs may be artifacts of the algorithm rather than data attributes. The term “reinforcement learning” is used to describe a set of methods that, through the use of a sequential environment, seek to improve the performance of an agent. Instead of working over a predetermined dataset, as is the case in supervised and unsupervised learning paradigms, RL algorithms are deployed in a context where the method can influence the future state of some systems. Adaptive control and this setup share a lot of common features, but their structural assumptions about the system differ. Agent-based modeling (ABM) is similar to RL, except ABM requires manual specification of behavioral norms, while RL learns them automatically [19, 20].

In the online (or streaming) context, data points arrive one at a time, necessitating a prediction from the algorithm before the next data point is received. Because online learning models’ parameters are updated as data is analyzed, they frequently necessitate different assessment criteria than offline algorithms (discussed above). The transfer learning paradigm, as well as the related disciplines of multi-task and meta-learning, focuses on the ability of machine learning models to adapt to new tasks. Modern ML algorithms are data-hungry, meaning they require a prohibitively huge amount of data for training to function well. Transfer learning techniques use a limited amount of data from the new task to fine-tune pre-trained ML models for low-data scenarios [4, 21].

On the other hand, deep learning, decision trees, support vector machines, ensemble learning, Bayesian models, and Gaussian processes are the major algorithmic approaches used within ML. One of the most popular ML methods, deep learning, uses layers of linear and nonlinear functions and gradient-based backp ropagation methods optimize these layers. Back propagation-trained composable layers can express complex functions and generalize effectively to fresh data. Generic feed forward networks, convolutional networks, recurrent networks, and graph networks are also specialized deep learning models. Deep learning is used in supervised, unsupervised, and reinforcement learning ML paradigms. Decision trees are a type of model that iteratively divides incoming data into subsets based on the relative relevance of certain features. Decision tree approaches evaluate patterns in input data to create trees with decision nodes, which can be followed to forecast a data point based on feature values. These optimization strategies use multiple different loss functions. One of the earliest ML techniques, decision trees have recently gained popularity because of its predictive accuracy and ease of interpretation when combined with ensemble methods. Another sort of traditional ML model, support vector machines (SVMs) use either a linear model class or its nonlinear extension, the kernel hypothesis class, to make predictions. These strategies often employ a (regularized) hinge loss as their loss function, making them a so-called max-margin classifier in terms of geometry. The resulting optimization problem can be efficiently handled on a global scale, with different approaches taken in practice for different instances of the problem. Although many of the aforementioned methods can fulfill the requirements, using a group, or ensemble, of ML algorithms is a frequent approach in situations where a superior level of performance is required for a given task. 
The outputs of the ensemble members are then combined, typically through (weighted) averaging. Bayesian learning, in the context of machine learning, is a set of techniques for modeling uncertainty in data and model parameters. Bayesian approaches are prevalent in many areas of ML, and Bayesian versions of most of the aforementioned methods are feasible; Gaussian processes (GPs) are the standard approach in this class. GPs are nonlinear regression models that use covariance or kernel functions to measure the similarity between points. Because GPs both make predictions and estimate their uncertainty from previously seen data, they are particularly useful for Bayesian optimization [22].
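The weighted-averaging idea behind ensembles can be shown with a deliberately tiny sketch. The three member "models" below are toy functions with fabricated biases, chosen only to illustrate how averaging can cancel individual errors; real ensembles would combine trained learners.

```python
def ensemble_predict(models, weights, x):
    """Weighted average of the member models' predictions."""
    total = sum(weights)
    return sum(w * m(x) for m, w in zip(models, weights)) / total

# Three deliberately biased approximations of the true function f(x) = x**2
models = [lambda x: x**2 + 0.3,   # constant positive bias
          lambda x: x**2 - 0.2,   # constant negative bias
          lambda x: 0.9 * x**2]   # multiplicative bias
weights = [1.0, 1.0, 1.0]

xs = [i / 10 for i in range(-10, 11)]
mse = lambda preds: sum((p - x**2) ** 2 for p, x in zip(preds, xs)) / len(xs)

single = [mse([m(x) for x in xs]) for m in models]
combined = mse([ensemble_predict(models, weights, x) for x in xs])
print(combined < min(single))   # the ensemble beats every individual member here
```

Because the members' biases partly oppose one another, the averaged predictor has a lower error than any single member; this cancellation effect is one reason ensembles such as random forests perform well in practice.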

3.2 Multiscale modeling meets machine learning

ML is a potent method for integrating multimodal and multifidelity data and revealing correlations between interconnected phenomena, which presents a unique opportunity in this regard. However, machine learning typically performs poorly with sparse data, disregards the fundamental laws of physics, and can lead to ill-posed problems or non-physical solutions. Classical physics-based simulation appears to be irreplaceable in this respect. Multiscale modeling is an effective method for integrating multiscale, multiphysics data and identifying the mechanisms that explain the emergence of function. However, multiscale modeling alone frequently fails to combine large datasets from various sources and resolutions effectively. In this context, machine learning and multiscale modeling naturally complement one another to produce robust predictive models that incorporate the underlying physics, manage ill-posed problems, and explore vast design spaces. Machine learning can incorporate physics-based knowledge in the form of governing equations, boundary conditions, or constraints to manage poorly defined problems and handle sparse and noisy data. Conversely, to bridge the scales and comprehend the evolution of function, multiscale modeling can employ machine learning to generate surrogate models, determine system dynamics and parameters, examine sensitivities, and quantify uncertainty [23].
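The idea of letting a governing equation regularize an otherwise ill-posed fit can be illustrated with a minimal sketch. Everything here is an illustrative assumption: a single observation u(1) = 3 cannot pin down both parameters of the line u(t) = a + b·t, but adding the "physics" du/dt = 2 as a penalty term makes the problem well-posed.

```python
def fit(steps=4000, lr=0.05, lam=1.0):
    """Gradient descent on data loss + lam * physics-residual loss."""
    a, b = 0.0, 0.0                      # model u(t) = a + b * t
    for _ in range(steps):
        data_err = (a + b * 1.0) - 3.0   # single observation: u(1) = 3
        phys_err = b - 2.0               # governing equation: du/dt = 2
        a -= lr * 2 * data_err
        b -= lr * (2 * data_err + lam * 2 * phys_err)
    return a, b

a, b = fit()
print(round(a, 2), round(b, 2))   # converges to a ≈ 1, b ≈ 2
```

Without the physics term, any (a, b) with a + b = 3 minimizes the data loss; the constraint selects the unique physically consistent solution. Physics-informed neural networks apply the same loss structure to far richer models.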

Figure 6 depicts the integration of machine learning and multiscale modeling at the parameter level through constraining their spaces, identifying values, and analyzing their sensitivity, and at the system level through exploiting the underlying physics, constraining design spaces, and identifying system dynamics. Machine learning provides the necessary tools for supplementing training data, avoiding overfitting, dealing with ill-posed problems, developing surrogate models, and measuring uncertainty. Multiscale modeling incorporates fundamental physics to discover relevant features, investigate their interactions, elucidate mechanisms, bridge scales, and comprehend the genesis of function.

Figure 6.

The integration of machine learning and multiscale modeling on the parameter level [23].


4. Applications of integrated MMS and ML for design and development of novel technologies, systems, and processes

In this section, we will list some application areas of integrated MMS and ML for the design and development of novel technologies, systems, and processes.

Biological, biomedical, and behavioral sciences: Thanks to recent technological advances, more information than ever before is being gathered in the biological, biomedical, and behavioral sciences. To improve people's health, we urgently need more efficient methods of analyzing and interpreting these data. A significant challenge in these fields is comprehending systems for which the underlying data are incomplete and the physics is not yet completely understood. By contrast, with a complete set of high-resolution data, machine learning could be used to explore design spaces and identify correlations; and with a validated and calibrated set of physics equations and material parameters, multiscale modeling could be used to predict system dynamics and identify causality [24].

The ultimate objective of integrating machine learning and multiscale modeling is to provide quantitatively predictive insight into biological systems. This integration has numerous applications in biological, biomedical, and behavioral systems, including metabolic networks, microbiology, immunology, cancer, neuroscience, biomechanics, and public health. Drug side effects, for instance, have been studied using machine learning and genome-scale models [25]. Additionally, a recent study used machine learning in conjunction with multi-omics data, including proteomics and metabolomics, to accurately predict pathway dynamics, yielding both qualitative and quantitative predictions that can be used to direct synthetic biology projects [26]. Supervised learning approaches are frequently employed to discover the metabolic dynamics, modeled by coupled nonlinear ordinary differential equations, that best fit the available time-series data. Using neural networks to map genotype to phenotype, thereby bridging scales in cancer progression in specific microenvironments, is another application [27]. Principal component analysis and neural networks have been used with increasing frequency to study aging, Alzheimer's disease, the chaotic dynamics of epileptic seizures, and memory formation [28], among other topics. Predicting response functions, such as stress-strain relations or cell-scale rules in continuum theories of development and remodeling [29], is where machine learning has the most potential to impact the field of biomechanics. For example, a recent study characterized the dynamic growth and remodeling during heart failure at the molecular, cellular, and cardiac levels by integrating machine learning and multi-scale modeling.
The ordinary differential equations of disease spreading have recently been applied to describe the prion-like spread of neurodegenerative diseases, where the model parameters were inferred via machine learning from magnetic resonance imaging [30].
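The parameter-identification pattern mentioned above, fitting ODE parameters to time-series data, can be sketched in miniature. The single-parameter exponential-decay model and its rate constant below are illustrative stand-ins for a real metabolic or disease-spreading model.

```python
import math

TS = [0.1 * i for i in range(20)]           # observation times

def simulate(k, u0=1.0):
    """Solution of the assumed dynamics du/dt = -k u."""
    return [u0 * math.exp(-k * t) for t in TS]

data = simulate(0.5)                        # synthetic "measured" time series

# Supervised fit: scan candidate rate constants, keep the best least-squares match
best_k = min((sum((u - d) ** 2 for u, d in zip(simulate(k), data)), k)
             for k in (0.01 * j for j in range(1, 200)))[1]
print(round(best_k, 3))
```

A grid scan suffices for one parameter; for the coupled nonlinear systems described above, the same loss would be minimized with gradient-based optimizers or Bayesian inference, and the data would be noisy multi-omics or imaging time series rather than a clean synthetic curve.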

Material sciences: Many of the twenty-first century's challenges, such as low-carbon energy and sustainability, are primarily material-related. Materials with precise chemical and physical properties for effective energy storage and conversion are urgently needed to achieve human society's long-term development. For a long time, the development of innovative materials was based on trial and error, implying a lengthy schedule and high cost that cannot match the need for more advanced materials. Thanks to advances in theoretical and computational chemistry, quantum mechanics (QM) and molecular mechanics have matured as methodologies for predicting quantitative structure-property relationships prior to experiments. With rapid advances in high-performance computing, high-throughput computational screening has greatly accelerated materials science research, allowing the properties of thousands of molecules to be computed. Density functional theory (DFT) is widely used for the computation of material structures and properties, and it has accelerated the development of materials databases with calculated properties for a large number of systems, including the Materials Project (MP) database, the AFLOWLIB consortium, the Open Quantum Materials Database (OQMD), and MaterialGo (MG) [31, 32, 33, 34]. Using cutting-edge supercomputers and algorithms, researchers can apply QM approaches to molecules with thousands of interacting ions and electrons. The enormous computational cost of QM-based approaches, however, precludes their application to large-scale complex systems. Furthermore, exhausting all potential systems with QM approaches is infeasible.

The Materials Genome Initiative (MGI) is producing vast amounts of materials data, and growing efforts have been made to collect materials properties and build materials databases. Accelerating materials design requires the management and use of massive datasets, and current materials research demands fast, effective analysis of these data to identify hidden rules. With the rapid growth of materials databases, the progressive adoption of ML toolkits such as TensorFlow, PyTorch, and scikit-learn, the creation of workflow toolkits such as Atomate, and the progress of algorithms, ML is being used increasingly in materials research [35, 36]. Using big data, ML approaches have produced several discoveries in energy storage and conversion materials such as catalysts and batteries (Figure 7) [37, 38, 39].

Figure 7.

Development in methods to accelerate new materials discovery.

Mathematical sciences: Models based on ordinary or partial differential equations range from a single equation to vast systems of coupled or stochastic differential equations, which means that the number of parameters is typically large and can easily reach hundreds or more. Many applications in the mathematical sciences integrate machine learning and multiscale modeling for such ordinary and partial differential equation systems.
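The numerical workhorse underneath such ODE-based models is a time integrator. As a self-contained sketch, a classical fourth-order Runge-Kutta scheme is shown below on the logistic equation, chosen only because its closed-form solution makes the accuracy easy to check.

```python
import math

def rk4(f, u0, t0, t1, n):
    """Classical fourth-order Runge-Kutta integration of du/dt = f(t, u)."""
    h = (t1 - t0) / n
    t, u = t0, u0
    for _ in range(n):
        k1 = f(t, u)
        k2 = f(t + h / 2, u + h / 2 * k1)
        k3 = f(t + h / 2, u + h / 2 * k2)
        k4 = f(t + h, u + h * k3)
        u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return u

# Logistic growth du/dt = r u (1 - u/K), which has an analytic solution to compare against
r, K, u0, T = 1.0, 1.0, 0.1, 5.0
numeric = rk4(lambda t, u: r * u * (1 - u / K), u0, 0.0, T, 200)
exact = K / (1 + (K / u0 - 1) * math.exp(-r * T))
print(abs(numeric - exact))   # fourth-order accuracy: error shrinks like h**4
```

In the ML-integrated setting, a solver like this sits inside the training loop: the learning algorithm proposes parameters, the integrator produces trajectories, and the mismatch with data drives the next parameter update.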

Energy systems: Integrated MMS and ML have enormous application potential for the development and operation of sustainable energy systems. The global challenges in this area can be divided into three broad categories: (1) energy security and increasing energy demand, (2) energy affordability, and (3) cleaner energy generation (reducing carbon emissions) [40, 41]. At present, meeting the increasing energy demand in a sustainable, stable, and safe manner makes the synergy between energy supply and demand increasingly prominent in economic development. Producing clean and cheap power while eliminating the economic and energy penalties requires either the development of new technologies or the improvement of existing energy systems based on sustainability standards and alternative fuels [42, 43]. The development of sustainable energy systems that do not negatively affect economic and societal benefits is thus the challenging global target of the energy sector. Incorporating sustainability into the design of energy systems helps to attain this goal by avoiding or limiting negative impacts. However, monitoring sustainability performance and decision-making are critical and challenging processes for defining sustainability levels. Thus, a methodological approach with proper sustainability indicators ensures the capability of gathering and abstracting complex process operations and provides straightforward analysis and communication [44].

As can be seen in Figure 8, each of the aforementioned global issues propels advancement in a variety of emerging and mature technology domains and research fields. Each of these domains and fields will necessitate the discovery and development of ideas at smaller scales, followed by an assessment of their commercialization potential, the integration of sustainability, and lastly supply chain optimization. This sequential flow displays the multi-scale nature of the described challenges, technology domains, and research topics, as well as the opportunity for multi-scale systems engineering to have a beneficial impact. In short, eliminating the existing problems and reaching the global goals requires a robust, efficient, and reliable methodological approach that spans multi-scale modeling and simulation (MMS)-based analysis, design, synthesis, and optimization, integrated with machine learning (ML). Powerful computational techniques have been developed to bridge the gap between atomistic, mesoscopic, and macroscopic systems of interest. As a result, multi-scale systems engineering studies have produced significant outcomes for the oil, natural gas, and coal industries, as well as the renewable energy and environmental fields.

Figure 8.

The summary of the sequential flow from global challenges to sustainable energy systems.

Electricity supply and demand, optimizing energy systems, maximizing renewable power generation, reducing fossil fuel impacts, predictive maintenance and fault detection, planning sustainable energy infrastructure, managing energy systems data, developing next-generation sustainable energy technologies, and informing policy are some of the application areas of integrated MMS and ML in energy systems.

Digital twin: The high-tech industry is growing rapidly, and new technologies such as Digital Twins (DT) are emerging. DT technology is an innovative interactive system that manages the interactions between the physical and virtual worlds and serves as a center of learning on a global scale. It is integrated with other technologies and used in various industries: smart factories in industrial production, digital models of life in the medical field, smart city construction, the aerospace field, immersive shopping in the commercial field, and so on. To date, Digital Twins have largely been described conceptually, with few practical applications. DT incorporates several disciplines and presently lacks a unified definition, which is still being developed. According to the NASA definition, however, “DT is a comprehensive multi-physical, multi-scale, probabilistic simulation system for vehicles or systems. It uses the best physical model to describe the historical use of equipment to reflect the life of its corresponding physical equipment.”

DT has the following characteristics: (1) Concentration: Digital mainframes retain all data collected throughout a physical system’s lifetime under centralized and unified control, streamlining data exchange in both directions. (2) Integrity: For complex systems, DT integrates all subsystems, which is the foundation for high-precision modeling, while real-time data monitoring can further enrich and expand the model, allowing the model to incorporate all system knowledge. (3) Dynamics: Models can be dynamically updated based on sensor data characterizing the physical system’s surroundings or status; updated models can dynamically direct actual operation; and the real-time interaction between physical systems and digital models allows the models to grow and evolve. Integrated MMS and ML are highly effective and promising tools that employ a variety of techniques to create a Digital Twin.
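The dynamic-update characteristic above can be sketched as a toy update loop. The first-order thermal model, its coefficients, and the constant assimilation gain are illustrative assumptions standing in for the physics model and data-assimilation scheme of a real twin.

```python
class ThermalTwin:
    """Toy digital twin of a first-order thermal system.

    The virtual model dT/dt = -a (T - T_amb) is advanced each cycle and then
    corrected by blending in the sensor reading (a constant-gain observer).
    """
    def __init__(self, a=0.1, t_amb=20.0, gain=0.3):
        self.a, self.t_amb, self.gain = a, t_amb, gain
        self.T = 20.0                          # current model state

    def step(self, dt, sensor=None):
        # advance the physics model one step (explicit Euler)
        self.T += dt * (-self.a * (self.T - self.t_amb))
        if sensor is not None:                 # assimilate data from the asset
            self.T += self.gain * (sensor - self.T)
        return self.T

twin = ThermalTwin()
# The "physical" asset runs hotter than the model believed; each reading
# pulls the twin's state toward reality between model steps.
for reading in [30.0, 29.5, 29.0, 28.6]:
    twin.step(dt=1.0, sensor=reading)
print(round(twin.T, 1))
```

Each cycle is the two-way exchange the definition describes: the model predicts, the sensor corrects, and the corrected state is what operators or controllers would then act on. Production twins replace the constant gain with Kalman or particle filters and the toy ODE with full multi-physics models.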

Of course, the application areas are not limited to those discussed above; the list can be extended and detailed further.


5. Conclusions

At the intersection of machine learning and multiscale modeling, a plethora of fascinating new applications are being created. Machine learning and multiscale modeling naturally complement one another: multiscale modeling can forecast system dynamics and thereby identify causality, while machine learning can search enormous design spaces to find correlations. Their combination has the potential for significant impact in a wide variety of fields, including the biological, biomedical, and behavioral sciences, the material sciences, the mathematical sciences, energy systems, digital twins, and related areas. As applications inevitably grow more complex over time, developers will need an increased awareness of the inherent constraints imposed by overfitting and data bias. The need for increased transparency, rigor, and repeatability will be a significant obstacle to overcome in order to make headway in this field.


Acknowledgments

The author acknowledges the support of the Qatar University grants QUST-1-CENG-2023-914 and QUST-1-CENG-2023-915.

References

  1. 1. Salciccioli M, Stamatakis M, Caratzoulas S, Vlachos DG. A review of multiscale modeling of metal-catalyzed reactions: Mechanism development for complexity and emergent behavior. Chemical Engineering Science. 2011;66(19):4319-4355. DOI: 10.1016/j.ces.2011.05.050
  2. 2. Floudas CA, Niziolek AM, Onel O, Matthews LR. Multi-scale systems engineering for energy and the environment: Challenges and opportunities. AIChE Journal. 2016;62(3):602-623. DOI: 10.1002/aic.15151
  3. 3. Vlachos DG. A review of multiscale analysis: Examples from systems biology, materials engineering, and other fluid–surface interacting systems. Advances in Chemical Engineering. 2005;30:1-61. DOI: 10.1016/S0065-2377(05)30001-9
  4. 4. Donti PL, Kolter JZ. Machine learning for sustainable energy systems. Annual Review of Environment and Resources. 2021;46:719-747. DOI: 10.1146/annurev-environ-020220-061831
  5. 5. Volkova VN, Kozlov VN, Mager VE, Chernenkaya LV. Classification of methods and models in system analysis. In: 2017 XX IEEE International Conference on Soft Computing and Measurements (SCM). St. Petersburg, Russia: IEEE; 2017. pp. 183-186
  6. 6. Heitzig M, Sin G, Sales-Cruz M, Glarborg P, Gani R. Computer-aided modeling framework for efficient model development, analysis, and identification: Combustion and reactor modeling. Industrial and Engineering Chemistry Research. 2011;50:5253-5265. DOI: 10.1021/ie101393q
  7. 7. García-Rodríguez del LC, Prado-Olivarez J, Guzmán-Cruz R, Rodríguez-Licea MA, Barranco-Gutiérrez AI, Perez-Pinal FJ, et al. Mathematical modeling to estimate photosynthesis: A state of the art. Applied Sciences. 2022;12:5537. DOI: 10.3390/app12115537
  8. 8. Subramanian ASR, Gundersen T, Adams TA. Modeling and simulation of energy systems: A review. Processes. 2018;6:238. DOI: 10.3390/pr6120238
  9. 9. Yoro KO, Daramola MO, Sekoai PT, Wilson UN, Eterigho-Ikelegbe O. Update on current approaches, challenges, and prospects of modeling and simulation in renewable and sustainable energy systems. Renewable and Sustainable Energy Reviews. 2021;150:111506. DOI: 10.1016/j.rser.2021.111506
  10. 10. Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. A survey of methods for explaining black box models. ACM Computing Surveys. 2019;51:1-42. DOI: 10.1145/3236009
  11. 11. Foley AM, Gallachóir BÓ, Hur J, Baldick R, McKeogh EJ. A strategic review of electricity systems models. Energy. 2010;35:4522-4530. DOI: 10.1016/j.energy.2010.03.057
  12. 12. Ventosa M, Baíllo Á, Ramos A, Rivier M. Electricity market modeling trends. Energy Policy. 2005;33:897-913. DOI: 10.1016/j.enpol.2003.10.013
  13. 13. Elia JA, Baliban RC, Floudas CA. Nationwide energy supply chain analysis for hybrid feedstock processes with significant CO2 emissions reduction. AICHE Journal. 2012;58:2142-2154. DOI: 10.1002/aic.13842
  14. 14. Lopion P, Markewitz P, Robinius M, Stolten D. A review of current challenges and trends in energy systems modeling. Renewable and Sustainable Energy Reviews. 2018;96:156-166. DOI: 10.1016/j.rser.2018.07.045
  15. 15. Olson GB. Computational design of hierarchically structured materials. Science. 1997;277(5330):1237-1242. DOI: 10.1126/science.277.5330.1237
  16. 16. Olson GB. Pathways of discovery: Designing a new material world. Science. 2000;288(5468):993-998
  17. 17. Miller RE. Direct coupling of atomistic and continuum mechanics in computational materials science. International Journal for Multiscale Computational Engineering. 2003;1(1):57-72. DOI: 10.1615/IntJMultCompEng.v1.i1.60
  18. 18. Rudd RE, Broughton JQ. Concurrent coupling of length scales in solid state systems. Physica Status Solidi (b). 2000;217:251-291. DOI: 10.1002/3527603107.ch11
  19. 19. Busoniu L, de Bruin T, Tolić D, Kober J, Palunko I. Reinforcement learning for control: Performance, stability, and deep approximators. Annual Reviews in Control. 2018;46:8-28. DOI: 10.1016/j.arcontrol.2018.09.005
  20. 20. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, et al. Mastering the game of go with deep neural networks and tree search. Nature. 2016;529(7587):484-489. DOI: 10.1038/nature16961
  21. 21. Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, et al. A comprehensive survey on transfer learning. Proceedings of the IEEE. 2020;109:43-76. DOI: 10.1109/JPROC.2020.3004555
  22. 22. Goodfellow I, Bengio Y, Courville A. Deep Learning. Cambridge, MA: MIT Press; 2016
  23. 23. Alber M, Buganza Tepole A, Cannon WR, De S, Dura-Bernal S, Garikipati K, et al. Integrating machine learning and multiscale modeling—Perspectives, challenges, and opportunities in the biological, biomedical, and behavioral sciences. NPJ Digital Medicine. 2019;2(1):115. DOI: 10.1038/s41746-019-0193-y
  24. 24. Lytton WW et al. Multiscale modeling in the clinic: Diseases of the brain and nervous system. Brain Informatics. 2017;4:219-230. DOI: 10.1007/s40708-017-0067-5
  25. 25. Shaked I, Oberhardt MA, Atias N, Sharan R, Ruppin E. Metabolic network prediction of drug side effects. Cell Systems. 2016;2:209-213. DOI: 10.1016/j.cels.2016.03.001
  26. 26. Costello Z, Martin HG. A machine learning approach to predict metabolic pathway dynamics from time-series multiomics data. NPJ Systems Biology Applications. 2018;4:19. DOI: 10.1038/s41540-018-0054-3
  27. 27. Gerlee P, Kim E, Anderson ARA. Bridging scales in cancer progression: Mapping genotype to phenotype using neural networks. Seminars in Cancer Biology. 2015;30:30-41. DOI: 10.1016/j.semcancer.2014.04.013
  28. 28. Ognjanovski N, Broussard C, Zochowski M, Aton SJ. Hippocampal network oscillations drive memory consolidation in the absence of sleep. Cerebral Cortex. 2018;28(10):1-13. DOI: 10.1093/cercor/bhy174
  29. 29. Ambrosi D, Ateshian GA, Arruda EM, Cowin SC, Dumais J, Goriely A, et al. Perspectives on biological growth and remodeling. Journal of the Mechanics and Physics of Solids. 2011;59:863-883. DOI: 10.1016/j.jmps.2010.12.011
  30. 30. Weickenmeier J, Jucker M, Goriely A, Kuhl E. A physics-based model explains the prion-like features of neurodegeneration in Alzheimer’s disease, Parkinson’s disease, and amyotrophic lateral sclerosis. Journal of the Mechanics and Physics of Solids. 2019;124:264-281. DOI: 10.1016/j.jmps.2018.10.013
  31. 31. Jain A, Ong SP, Hautier G, et al. Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Materials. 2013;1:011002. DOI: 10.1063/1.4812323
  32. 32. Calderon CE, Plata JJ, Toher C, et al. The AFLOW standard for high-throughput materials science calculations. Computational Materials Science. 2015;108:233-238. DOI: 10.1016/j.commatsci.2015.07.019
  33. 33. Kirklin S, Saal JE, Meredig B, et al. The open quantum materials database (OQMD): Assessing the accuracy of DFT formation energies. NPJ Computational Materials. 2015;1:15010. DOI: 10.1038/npjcompumats.2015.10
  34. 34. Jie J, Weng M, Li S, et al. A new MaterialGo database and its comparison with other high-throughput electronic structure databases for their predicted energy band gaps. Science China Technological Sciences. 2019;62:1423-1430. DOI: 10.1007/s11431-11019-19514-11435
  35. 35. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825-2830
  36. 36. Mathew K, Montoya JH, Faghaninia A, et al. Atomate: A high-level interface to generate, execute, and analyze computational materials science workflows. Computational Materials Science. 2017;139:140-152. DOI: 10.1016/j.commatsci.2017.07.030
  37. 37. Beck DAC, Carothers JM, Subramanian VR, Pfaendtner J. Data science: accelerating innovation and discovery in chemical engineering. AIChE Journal. 2016;62:1402-1416. DOI: 10.1002/aic.15192
  38. 38. Fujimura K, Seko A, Koyama Y, et al. Accelerated materials design of lithium superionic conductors based on first principles calculations and machine learning algorithms. Advanced Energy Materials. 2013;3:980-985. DOI: 10.1002/aenm.201300060
  39. 39. Sendek AD, Yang Q, Cubuk ED, et al. Holistic computational structure screening of more than 12000 candidates for solid lithium-ion conductor materials. Energy & Environmental Science. 2017;10:306-320. DOI: 10.1039/C6EE02697D
  40. 40. Pichardo PA, Karagöz S, Manousiouthakis VI, Tsotsis T, Ciora R. Techno-economic analysis of an intensified integrated gasification combined cycle (IGCC) power plant featuring a combined membrane reactor-adsorptive reactor (MR-AR) system. Industrial & Engineering Chemistry Research. 2019;59:2430-2440. DOI: 10.1021/acs.iecr.9b02027
  41. 41. Karagöz S, da Cruz FE, Tsotsis TT, Manousiouthakis VI. Multi-scale membrane reactor (MR) modeling and simulation for the water gas shift reaction. Chemical Engineering and Processing-Process Intensification. 2018;133:245-262. DOI: 10.1016/j.cep.2018.09.012
  42. 42. Karagöz S, Chen H, Cao M, Tsotsis TT, Manousiouthakis VI. Multiscale model based design of an energy-intensified novel adsorptive reactor process for the water gas shift reaction. AICHE Journal. 2019;65(7):e16608. DOI: 10.1002/aic.16608
  43. 43. Karagöz S, Tsotsis TT, Manousiouthakis VI. Multi-scale model based design of membrane reactor/separator processes for intensified hydrogen production through the water gas shift reaction. International Journal of Hydrogen Energy. 2020;45(12):7339-7353. DOI: 10.1016/j.ijhydene.2019.05.118
  44. 44. Karagöz S. A methodological sustainability assessment to process intensification (MSAtoPI) by reactive-separation systems. Fuel. 2023;348:128562. DOI: 10.1016/j.fuel.2023.128562
