Open access peer-reviewed chapter - ONLINE FIRST

Beyond the Blue Skies: A Comprehensive Guide for Risk Assessment in Aviation

Written By

Leila Halawi, Mark Miller and Sam Holley

Submitted: 25 July 2024 Reviewed: 08 August 2024 Published: 29 August 2024

DOI: 10.5772/intechopen.1006687

Aeronautics - Characteristics and Emerging Technologies Edited by Longbiao Li

From the Edited Volume

Aeronautics - Characteristics and Emerging Technologies [Working Title]

Dr. Longbiao Li

Abstract

Risk assessment in aviation is a critical process that safeguards the safety and reliability of operations. Aviation operations carry inherent risks, from mechanical failures to human errors and environmental factors. The consequences of these risks may be severe, leading to accidents, injuries, and loss of life. Recognizing and mitigating risks is paramount in this dynamic environment, where emerging technologies and innovation constantly reshape the industry. This chapter includes an in-depth explanation of risk management and analysis, leading to the core elements of risk assessment specifically for aviation operations. We describe the process and explore some of the applications and methodologies for risk assessment. Lastly, we discuss safety management systems, followed by proactive risk analysis using explainable artificial intelligence (XAI), which can enhance aviation safety and inform engineering design decisions.

Keywords

  • risk assessment
  • risk mitigation and management
  • safety management systems [SMS]
  • explainable artificial intelligence [XAI]
  • aviation operations
  • human factors

1. Introduction

Aviation operations combine different specialties and actions to guarantee the safety, efficiency, and reliability of air transport systems. By methodically applying risk assessment, aviation operations proactively control and reduce hazards, thus improving overall safety and efficiency. In this context, operations may include government regulators, company operation centers, equipment consortia, training for operators, and related parties with vested interests in the safety and efficiency of associated operations.

Modern risk management theory embraces a holistic approach, recognizing the importance of addressing all risks, regardless of their familiarity or ease of quantification.

Risk assessment, sometimes described as risk management analysis (FAA Risk Management Handbook), is typically regarded as the second step in the risk management process. It is the systematic process of recognizing, evaluating, and mitigating risks to minimize their impact on operations, which is essential in aviation to protect passengers, crew, and aircraft. It applies to flight operations, maintenance, safety management systems, regulatory compliance, and emergency preparedness. It encompasses the five steps highlighted in Figure 1.

Figure 1.

Risk assessment steps.

Operational risk assessment encompasses the comprehensive process of identifying, analyzing, and evaluating risks during the operational phase.

This chapter provides a solid foundation for risk assessment concepts, components, current and emerging approaches, and processes. It focuses only on operational risks.

The remainder of this chapter follows a structured organization. Section 2 delves into understanding and explaining fundamental concepts and terminology. Section 3 explores risk and hazard identification. Section 4 delves into risk assessment. Section 5 focuses on risk mitigation, monitoring, and review. Section 6 covers Safety Management Systems (SMS), Section 7 explores the integration of Human Factors into SMS, and Section 8 examines proactive risk analysis using Explainable Artificial Intelligence (XAI). Section 9 serves as the culmination of the chapter, summarizing key insights and findings.

2. Fundamental concepts and terminology

In the 1930s, the aviation industry led in technological advancements but relied on trial-and-error safety protocols, known as “fly–fix–fly” [1]. Overall, from the 1930s to 1999, significant advancements in aviation safety were driven by technology, an increasing safety focus, and the establishment of regulations and risk assessment practices. As air travel expanded, efforts to prevent accidents intensified. Investigations into major incidents spurred improvements in aircraft design, pilot training, and maintenance. Regulatory bodies like the Civil Aeronautics Administration (CAA) and later the Federal Aviation Administration (FAA) were established to set comprehensive safety standards. A significant application of risk management in aviation operations was implemented by the US Army Rotary Wing with Operational Risk Management in 1990, which was later adopted by the Department of Defense in 2000 and eventually the FAA. In 2001, a shift to proactive safety occurred, with ICAO initiating the Safety Management Systems approach recommendation for use in international aviation for specific aviation sectors. In 2006, ICAO required most commercial aviation service providers to implement SMS, with the FAA adopting SMS for airports in 2010 and commercial airlines in 2018. The FAA has become proactive with risk assessment through the growth of operational risk management and SMS risk assessment methodologies to identify and address potential hazards before they materialize, emphasizing preventive measures.

Key organizations in aviation risk assessment include the International Civil Aviation Organization (ICAO), the International Air Transport Association (IATA), the Flight Safety Foundation (FSF), Civil Aviation Authorities (CAAs), the European Aviation Safety Agency [EASA], the US Federal Aviation Administration [FAA], the UK Civil Aviation Authority [CAA], and the European Organization for the Safety of Air Navigation [EUROCONTROL].

ICAO Doc 9859 [2], the Safety Management Manual, is ICAO’s guidance material for SMS implementation. It emphasizes safety principles and data-driven decision-making, focusing on safety intelligence, hazard identification through past incidents, and safety performance management. It also discusses accident causation and safety culture. Safety risk index ranges determine the recommended actions for risk mitigation. The approach is similar to the FAA’s (see below), but a fifth category of likelihood, extremely improbable, is added. The severity categories also differ from the FAA version and are named catastrophic, hazardous, major, minor, and negligible.

IATA published the Integrated Risk and Resilience Management Manual (IRRM), offering comprehensive guidance for aviation entities to integrate risk management and enhance emergency capabilities [3].

The FAA Risk Management Handbook offers guidance for pilots on managing risk, workload, and safety in the aviation environment and describes the severity levels of outcomes. The Handbook describes assessment as risk management analysis. A flight risk assessment tool (FRAT) results in a score that falls into one of three ranges: low (green), medium (yellow), and high (red). FRAT is a much simpler version of an Operational Readiness Evaluation. Risk assessment is characterized as the composite of likelihood (probability) and severity (consequences) of an outcome, expressed as RA = P + S. The levels of likelihood are probable, occasional, remote, and improbable. For severity, the levels are catastrophic, critical, marginal, and negligible [4].
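
To make the composite concrete, the following minimal sketch ranks the four likelihood levels and the four severity levels from 1 to 4 and combines them as RA = P + S. The numeric ranks, the function names, and the green/yellow/red cut-offs are illustrative assumptions; they are not values taken from the Handbook or from any published FRAT.

```python
# Illustrative only: the ordinal ranks and colour cut-offs below are
# assumptions, not values published in the FAA Risk Management Handbook.
LIKELIHOOD = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4}
SEVERITY = {"negligible": 1, "marginal": 2, "critical": 3, "catastrophic": 4}


def risk_score(likelihood: str, severity: str) -> int:
    """Combine ordinal likelihood and severity ranks as RA = P + S."""
    return LIKELIHOOD[likelihood.lower()] + SEVERITY[severity.lower()]


def risk_band(score: int) -> str:
    """Map a score (2..8) to a colour band using hypothetical cut-offs."""
    if score <= 3:
        return "green"
    if score <= 5:
        return "yellow"
    return "red"


if __name__ == "__main__":
    for p, s in [("remote", "marginal"), ("probable", "catastrophic")]:
        score = risk_score(p, s)
        print(f"{p}/{s}: RA={score} -> {risk_band(score)}")
```

An operator would replace the cut-offs with the ranges defined in its own risk assessment table, so that every user reads a given score the same way.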

The EUROCONTROL safety regulatory requirement [ESARR4] addresses risk assessment and mitigation in air traffic management (ATM). It ensures systematic hazard identification and management within approved safety levels and addresses software safety objectives for risk mitigation. It was prepared by the Safety Regulation Commission and applies to ATM system changes [5].

The Airport Cooperative Research Program (ACRP) Report 50, sponsored by the FAA, provides guidance on improved models for risk assessment of runway safety areas. It highlights variables that affect the consequences of overruns and other accidents and provides rational, risk-based assessment tools for airport safety improvements. The report aims to solve aviation operating problems and introduce innovations. It reviews operational experience to develop a functional hazard analysis and assesses runway safety risks based on obstacle proximity and speed [6].

The FAA’s comprehensive website on SMS covers ICAO policy, US policy, regulatory requirements, commercial aviation, air traffic control, and airports. The FAA Office of Airports Safety Management Systems Desk Reference, version 2.1 (2023), focuses on risk assessment in airport construction projects and highlights safety measures and project coordination for aviation facilities [7]. It adapts the operational safety assessment [OSA] from the FAA System Safety Handbook and adds mitigation measures; the OSA is a two-step process that starts with identifying an airport system’s ground and air elements and continues with an operational hazard assessment. OSA and comparative safety assessment [CSA] are the main methods discussed in the reference. CSA is a risk assessment tool that defines the severity and likelihood of risks associated with each alternative under consideration. The reference also describes airport systems and facilities for safety risk management assessments.

2.1 Definition and terminology

This section delineates several frequently used terms in risk assessment. The objective is to establish a standardized terminology that facilitates describing the various components involved in a risk analysis.

Risk Definition: The traditional definition of risk is the possibility of loss or injury. The Risk and Insurance Management Society defines risk as an uncertain future outcome that can improve or worsen your position.

According to the Committee of Sponsoring Organizations of the Treadway Commission (COSO), risk is the possibility that an event will occur and adversely affect the achievement of objectives. According to ISO 31000:2015 from the International Organization for Standardization, risk is defined as “the effect of uncertainty on objectives” [8]. This definition reflects contemporary perspectives on risk, acknowledging that it can lead to positive and negative outcomes. Risk is the combined answer to three critical questions: (1) What can go wrong? (2) How likely is it to happen? and (3) What are the potential consequences?

2.2 Related terms

Other common relevant terms are listed below:

  • Accident: an abrupt, undesired, and unanticipated event or sequence that harms people, the environment, or other tangible assets.

  • Barriers: measures implemented to mitigate risks and prevent unwanted events from occurring or escalating; also called safeguards, protective layers, defenses, controls, or countermeasures.

  • Consequence: the significance of an event’s impact.

  • Event: a forthcoming happening.

  • Hazard: A resource or precondition that can instigate damage, alone or in combination with other elements.

  • Incident: an unexpected, undesirable, and unintentional occurrence or series of events that could realistically have caused damage to one or more resources but ultimately did not cause sizable damage.

  • ISO 31000: a global standard established by the International Organization for Standardization (ISO) for risk management, offering instructions and rules for creating, applying, and sustaining efficient risk management processes within companies [8].

  • Likelihood: a synonym for the probability of an event, a number between 0 and 1 (or between 0% and 100%) that indicates how likely the event is to occur in a specific situation. This probability is written as P. If P = 1, the event is certain to happen, while if P = 0, the event is certain not to occur.

  • Operational Risk: “the risk of loss stemming from lacking or failed internal processes, people, systems, or external events.” ISO 31000 defines it as “any event that affects an organization’s objectives.” Operational risk is a pure risk that arises from people, processes, systems, and controls. This includes a broad spectrum of risks attributable to human factors, IT risks, management oversight, and business processes [8].

  • Potential Cause: an element, circumstance, or incident that can generate or influence a risk event.

  • Potential Outcome: the probable effects or impacts of a risk event. This comprises both negative and positive outcomes.

  • Prevention: actions, tactics, and schemes to moderate and lessen the probability of a risk event.

  • Recovery: activities and procedures started to reestablish usual tasks and diminish the impact following a risk event.

  • Safety Management Systems (SMS): a formal, structured process that helps organizations manage safety risk in the workplace.

  • Threat: A broad classification encompassing actions or events that can inflict harm on an asset.

3. Risk and hazards identification

Hazard risks are distinguished by a key characteristic: they are pure risks with only negative potential outcomes, such as fire, theft, or catastrophe. Hazard risks arise from property, liability, or personnel loss exposure. Identifying these risks within an organization is crucial for adequately preparing for and mitigating any possible negative consequences. Operational risks are also pure risks, arising from people, processes, systems, and controls.

Risk identification entails identifying the assets, threats, controls, vulnerabilities, and consequences. This step aims to answer what can go wrong. It aims to identify all relevant hazards and hazardous events during intended use, foreseeable misuse, and interactions with the study object while also describing their characteristics, presence within the object, and potential initiating events. Once identified, the probability and severity of the events must be assessed.

Hazard identification and risk assessment methods are crucial in aviation. Once assets, threats, and vulnerabilities are identified, organizations assess the risks quantitatively or qualitatively, prioritizing those with the greatest potential impact. Several approaches for conducting a risk assessment are available.

Some methods include checklists and brainstorming, which use lists of generic hazards or hazardous events to determine their potential occurrence, location, and impact on the study object and to provide initial insights.

Preliminary Hazard Analysis (PHA) is a straightforward method used in design phases to identify hazards, with results updated as more detailed risk analyses progress; it is often sufficient for basic systems and is also referred to as Hazard Identification (HAZID).

Job Safety Analysis (JSA), similar to PHA, enhances safety awareness among personnel by reviewing work procedures just before tasks, ensuring readiness, and mitigating risks effectively.

Failure Modes, Effects, and Criticality Analysis (FMECA) focuses on identifying potential failure modes of system components, their causes, and the resulting impacts on the overall system reliability.

Hazard and Operability (HAZOP) studies, developed for process plants, employ structured brainstorming with guidewords to detect deviations and hazardous scenarios and are widely adopted in initial design phases and subsequent system modifications.

Structured What-If Technique (SWIFT) involves experts posing and answering structured “what-if” questions in a brainstorming format, serving as a simplified approach akin to HAZOP for various system types.

Additional hazard identification techniques are prevalent in aviation. Predictive safety analysis uses machine learning techniques and algorithms to analyze vast amounts of data from various sources, including flight recorders, weather systems, maintenance logs, and even pilot reports. It looks for subtle patterns or anomalies that might indicate potential safety issues. It is a data-driven, proactive approach that can identify issues before they become critical.

Flight data monitoring continuously records numerous aircraft system parameters during flight operations. These data are then analyzed to identify trends, exceedances, or deviations from standard procedures. Flight data monitoring offers objective data on flight operations, helps identify systemic issues or trends, and can be used for targeted training and procedure improvements.
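
The exceedance step of flight data monitoring can be sketched as a scan over one recorded parameter. The example below is a minimal illustration, not an operator's actual event definition: the parameter name, the limit, the persistence requirement, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Exceedance:
    parameter: str
    start_index: int
    end_index: int  # inclusive
    peak: float


def find_exceedances(parameter, samples, limit, min_samples=2):
    """Return runs of at least `min_samples` consecutive samples above `limit`."""
    events = []
    run_start = None
    for i, value in enumerate(samples):
        if value > limit:
            if run_start is None:
                run_start = i  # a new run above the limit begins here
        elif run_start is not None:
            if i - run_start >= min_samples:
                peak = max(samples[run_start:i])
                events.append(Exceedance(parameter, run_start, i - 1, peak))
            run_start = None
    # Close a run that is still open when the recording ends.
    if run_start is not None and len(samples) - run_start >= min_samples:
        peak = max(samples[run_start:])
        events.append(Exceedance(parameter, run_start, len(samples) - 1, peak))
    return events


# Hypothetical descent-rate samples (ft/min) and a hypothetical 1000 ft/min limit.
descent_rate = [900, 1050, 1100, 950, 1200, 980, 1010, 1020, 1030]
events = find_exceedances("descent_rate_fpm", descent_rate, limit=1000)

if __name__ == "__main__":
    for event in events:
        print(event)
```

Requiring a minimum run length is one simple way to ignore single-sample spikes; the isolated reading of 1200 in the sample data is skipped for that reason.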

Safety management systems (SMS) are an organized, comprehensive approach to managing safety that includes systematic procedures, practices, and policies covering risk assessment, hazard reporting, and safety assurance. SMS emphasizes continuous improvement and a proactive approach to safety rather than just reacting to incidents after they occur. It typically includes four components: safety policy, risk management, safety assurance, and safety promotion. It encourages a positive safety culture and facilitates continuous improvement in safety performance. However, implementing it fully can be resource-intensive, requires commitment at all organizational levels, and may meet resistance due to perceived bureaucracy.

Human Factors Analysis and Classification System (HFACS) is a comprehensive framework for investigating and analyzing human factors in aviation accidents. It categorizes human error at four levels: organizational influences, unsafe supervision, preconditions for unsafe acts, and unsafe acts. It helps investigators and safety analysts understand the full context of human errors in aviation incidents. It provides a structured approach to analyzing human factors, helps identify systemic issues that contribute to human error, and facilitates the development of targeted interventions. However, it requires thorough training to be used effectively, can be time-consuming to apply in complex incidents, and may oversimplify some aspects of human behavior.

Advanced simulation and virtual reality training may also identify potential hazards in various scenarios without real-world risk.

Bow-tie analysis is a visual risk assessment tool that helps identify potential causes and consequences of hazards. It is valuable for conceptualizing and analyzing risks. A recognized hazardous event is positioned at the center of the diagram, with its causes depicted on the left side and its consequences on the right side. It is widely applied to assess aviation risks by combining two analyses: fault tree analysis and event tree analysis. The fault tree portion deals with the possible causes of the system’s top event, and the event tree portion covers post-event analysis. It helps identify where additional safety measures might be needed. It also helps communicate complex risk scenarios to various stakeholders.
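
When probabilities are available, the two halves of a bow-tie can be quantified with a few lines of code. The sketch below assumes statistically independent events, and every cause, barrier, and probability in it is hypothetical; it illustrates the arithmetic rather than any real assessment.

```python
from math import prod


def or_gate(probabilities):
    """P(at least one input occurs), assuming independent inputs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """P(all inputs occur), assuming independent inputs."""
    return prod(probabilities)


def event_tree(p_top, barriers):
    """Split the top-event probability across the possible outcomes.

    `barriers` is an ordered list of (name, probability_of_failure).
    A barrier is only challenged if every earlier barrier has failed.
    """
    outcomes = {}
    p_reaching = p_top
    for name, p_fail in barriers:
        outcomes[f"stopped by {name}"] = p_reaching * (1.0 - p_fail)
        p_reaching *= p_fail
    outcomes["unmitigated consequence"] = p_reaching
    return outcomes


# Left side (fault tree): the top event occurs if cause A occurs,
# or if causes B and C occur together. All values are hypothetical.
p_top = or_gate([0.01, and_gate([0.1, 0.2])])

# Right side (event tree): two hypothetical recovery barriers in sequence.
outcomes = event_tree(p_top, [("crew procedure", 0.1), ("automated protection", 0.5)])

if __name__ == "__main__":
    print(f"top event: {p_top:.4f}")
    for label, p in outcomes.items():
        print(f"{label}: {p:.5f}")
```

Because the outcome probabilities always sum to the top-event probability, the split shows directly which barrier carries most of the risk reduction and where an additional safety measure would have the largest effect.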

4. Risk analysis, evaluation: Risk assessment

Risk assessment is a process that involves planning, preparing, executing, and reporting a risk analysis, followed by evaluating the results against established risk acceptance criteria. As illustrated in Figure 2, the process begins with conducting a risk analysis to identify and describe relevant accident and incident scenarios and their likelihoods, which collectively define the risk. The second phase, risk evaluation, involves comparing the risk determined by the analysis with the risk acceptance criteria.

Figure 2.

Risk assessment.

Risk assessment, according to ISO 31000 [8], encompasses three essential steps:

  1. Risk Identification: This initial phase involves pinpointing potential organizational risks.

  2. Risk Analysis: This step assesses the likelihood of each identified risk occurring and its potential impact.

  3. Risk Evaluation: The risk analysis results are compared against the organization’s criteria to determine if further action is required.
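
The three steps can be sketched as a small risk register. This is a minimal illustration under stated assumptions: the `Risk` class, the 1 to 5 scales, the additive scoring (which mirrors the likelihood-plus-severity composite described earlier in the chapter), the acceptance threshold, and the example risks are all hypothetical and are not prescribed by ISO 31000.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str     # step 1: output of risk identification
    likelihood: int = 0  # step 2: 1 (rare) .. 5 (frequent), assumed scale
    impact: int = 0      # step 2: 1 (negligible) .. 5 (catastrophic), assumed scale

    @property
    def level(self) -> int:
        """Assumed scoring: likelihood rank plus impact rank."""
        return self.likelihood + self.impact


def analyse(risk: Risk, likelihood: int, impact: int) -> Risk:
    """Step 2: record the assessed likelihood and impact."""
    risk.likelihood, risk.impact = likelihood, impact
    return risk


def evaluate(register, acceptance_threshold):
    """Step 3: return risks above the organization's criterion, highest first."""
    flagged = [r for r in register if r.level > acceptance_threshold]
    return sorted(flagged, key=lambda r: r.level, reverse=True)


register = [
    analyse(Risk("bird strike on departure"), likelihood=3, impact=4),
    analyse(Risk("ground vehicle runway incursion"), likelihood=4, impact=5),
    analyse(Risk("late catering delivery"), likelihood=3, impact=1),
]
needs_action = evaluate(register, acceptance_threshold=6)

if __name__ == "__main__":
    for risk in needs_action:
        print(risk.level, risk.description)
```

Keeping the acceptance threshold as an explicit argument makes the organization's criterion visible and easy to review, separate from the analysis itself.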

Risk analysis is a systematic study that identifies and describes potential issues, their causes, probabilities, and possible consequences. While various approaches proliferate, there are four common risk assessment techniques: a risk matrix, a decision tree, failure modes and effects analysis, and the bowtie method. The assessment should be undertaken only by competent staff who understand the process, the hazard and risk, and the activity that forms the risk.

Before initiating the risk analysis, it is essential to identify the relevant assets to consider. These assets, such as people, the environment, and reputation, are outlined within the study’s scope. Risk analysis, a crucial part of a risk assessment, involves understanding the nature of the risks and determining their levels.

Risk assessment includes risk analysis and risk evaluation. A broad array of information is essential as input for any risk assessment, especially for the data requirements of quantitative risk assessments, ranging from descriptive data (technical, operational, production, maintenance, environmental, meteorological, safety, and shareholder) to probabilistic data (accident data, hazard data, natural event, human error, reliability data, and consequence data). Under Regulation (EU) No 376/2014, civil aviation incident and accident data in the EU must be collected, reported, and analyzed, supported by ECCAIRS 2, to enhance safety. In the USA, the FAA manages this through the ASIAS program for sharing and analyzing safety data, while the NTSB investigates incidents to improve aviation safety. The International Civil Aviation Organization (ICAO) also maintains a database and publishes accident statistics on its website at http://www.icao.int. Reliability and availability data in aviation are collected and analyzed through industry-specific solutions, regulation-driven reporting, and internal programs.

Risk evaluation follows the risk analysis. In many cases, the results of a risk assessment are presented in a traditional report, but other formats are also used, including brochures, simplified versions of the report, or presentations. The chosen format should be tailored to the stakeholders and the significance of the decisions the risk assessment is intended to support. A potential structure for a risk assessment report includes a title page, a disclaimer, an executive summary [with introduction, objectives, and limitations], the analysis approach, main conclusions and recommendations, document references, acronyms and glossary, the study team, appendices, and a closing section for comments.

5. Risk mitigation, monitoring and review

Risk mitigation, a subset of risk treatment, refers to actions taken to reduce the likelihood or impact of a risk. It is one of the strategies that can be employed within the broader scope of risk treatment. Risk treatment alternatives include risk reduction, modification, retention, avoidance, sharing, and acceptance. Risk avoidance involves eliminating a risk by changing business practices or operations.

Risk monitoring and review are crucial elements of the risk management process as defined by ISO 31000 [8]. These components ensure that the organization’s risk management practices remain effective and adaptable to changes in the risk environment. Risk monitoring is an ongoing process that tracks the status of identified risks and the progress of risk treatment options. Risk review entails a periodic and systematic evaluation of the risk management process to assess its suitability, adequacy, and effectiveness.

Monitoring and Reviewing Risk involves ensuring the effectiveness and continued functionality of risk reduction measures, such as verifying procedures, equipment placement, and personnel training. It includes regular inspections and retraining to maintain technical systems and emergency response capabilities. Additionally, it monitors safety performance and risk levels through performance indicators to detect and address negative trends early.
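
Detecting a negative trend in a safety performance indicator can be as simple as fitting a straight line to the indicator over time. The sketch below assumes a monthly event rate per 1,000 flights and an alert slope chosen by the organization; the indicator, the figures, and the alert value are hypothetical.

```python
def event_rate(events, flights, per=1000):
    """Normalize an event count to a rate per `per` flights."""
    return per * events / flights


def trend_slope(values):
    """Ordinary least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def is_worsening(values, alert_slope):
    """Flag the indicator when its rate rises faster than `alert_slope` per period."""
    return trend_slope(values) > alert_slope


# Hypothetical monthly counts of unstable approaches and flights flown.
monthly = [(3, 1500), (4, 1600), (6, 1500), (8, 1600)]
rates = [event_rate(events, flights) for events, flights in monthly]

if __name__ == "__main__":
    print([round(r, 2) for r in rates])
    print("worsening:", is_worsening(rates, alert_slope=0.5))
```

Normalizing by flights flown before fitting the line keeps a busy month from being mistaken for a deteriorating one.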

6. Safety management systems [SMS]

SMS has become the standard for applying risk management in the aviation industry over the last 20 years. This is because it is scientifically accurate through a standard risk assessment chart (using mathematical probability), can be flexibly tailored for any type and size of aviation-related organization, and, most importantly, allows for data-driven proactive design. While consisting of the four SMS pillars of Risk Management, Assurance, Promotion, and Policy, SMS is truly centered around the Risk Management Pillar. Assurance, Promotion, and Policy are key areas to support Risk Management. The Risk Management process in any SMS needs to include hazard identification, risk assessment, and mitigation.

The Risk Management process, in turn, is monitored and updated through the Assurance pillar, which evaluates the effectiveness of risk mitigation control strategies and helps identify new hazards. Policy shows management’s support for SMS while defining the methods, processes, and organizational structure required to use SMS. Promotion is about establishing an SMS culture through leadership and through the training and communication required to implement the SMS and keep it running well.

As a new-millennium safety program, SMS uses risk management as its cornerstone. This is important to any organization because it places safety and risk management in a scientific realm, using accurate risk assessment to convert the hazard into a mathematical probability standardized for the whole organization. This avoids the pitfalls of different people in the organization interpreting data differently and avoids guessing how severe a hazard is. Through safety policy, top management and the rest of the organization use the same risk management process and standard risk assessment table. This forces everyone in the organization, from top management to front-line employees, to view the hazard, the hazard analysis, the hazard assessment, the hazard mitigation, and the hazard reevaluation against the same standard. While SMS enables the hazard risk to be evaluated more accurately, it also allows the hazard to be reevaluated more accurately after mitigation strategies are implemented, using the same risk assessment table to confirm that the mitigation controls have lowered the hazard risk. If mitigation controls cannot manage the hazard within safe operational limits, the organization has the ethical responsibility to stop the operation until they can. This scientific, data-driven, proactive approach to risk management is a vast improvement over the safety programs of the past, which used many different methods and standards to manage risk. SMS also provides a great deal of freedom in implementing the risk management process: an organization can tailor the size and type of its SMS by tailoring the different SMS pillar elements.
This type of customizable SMS safety program centered around risk management has allowed governing agencies like the FAA in the US to mandate and regulate SMS safety programs for different kinds of aviation organizations in the US where, in the past, safety programs were voluntary and cost-prohibitive. Being a proactive data-driven approach means that SMS is designed in each organization to identify hazards proactively through different types of data collection devices. In the past, aviation safety relied on reactive data to track incidents and investigate accidents. Then, after a thorough analysis, it identified the hazards and made recommendations to fix them. Unfortunately, in this reactive process, economic affordability could get in the way of prioritizing the hazards, and delays in mitigating many hazards could take years to resolve. With SMS, an airline can continuously monitor and analyze recorded flight parameter data from each revenue flight through a Flight Operations Quality Assurance (FOQA) program for the safety of flight operations. This does two significant things in fulfilling the data-driven SMS role. It first can identify individual crew mistakes or flight safety risks that, if deemed a hazard by the airline’s FOQA committee, could be immediately mitigated through the SMS risk management process. More importantly, the tremendous amount of data being collected and analyzed over time could be instrumental in identifying a hazardous trend experienced by many crews and revenue flights within that airline and possibly outside that airline as well, and would also call for the SMS risk management and mitigation process to be utilized.

7. Integrating human factors into SMS

While SMS and Risk Management have proliferated throughout the aviation industry and have become common in the US, human error still plays a direct role in upwards of 80% of commercial aviation accidents in the US. With human error playing such a major role in aviation safety, aviation human factors are also starting to play a significant role in combating it. We are currently at a juncture in aviation history where human factors and risk management are beginning to merge. Current accident and incident analysis clearly shows that multiple human factors are usually involved as hazards that trigger the human errors behind accidents and incidents. The big question for aviation safety experts is how best to integrate human factors into SMS. This can be seen through three examples: Threat and Error Management (TEM) on the flight deck as an extension of Crew Resource Management (CRM), a Fatigue Risk Management (FRM) system for airline maintenance, and a PSAP/FOQA database of the human factors root causes of incidents.

As TEM becomes more commonplace on airline flight decks as an extension of human factors and CRM, it could also be considered an SMS integration into the modern operational flight deck. For 40 years, CRM has brought human factors solutions to improve how the crew works together on the flight deck and with flight attendants, dispatch, maintenance, and ATC, all through teamwork tenets such as leadership, followership, task management, automation management, communications, assertiveness, situational awareness, and decision making. TEM is a proactive add-on to these CRM tenets: it identifies threats or hazards to anticipate before the flight and briefs how they are to be handled, and how to handle errors quickly when they do occur during the flight, before they get out of control. TEM can also help a crew recover from an unanticipated threat to the flight and be resilient. These threats and errors are hazards that have been identified and quickly analyzed. The only difference between TEM and an official SMS Risk Management system is that the latter requires the threats and errors, as hazards, to be thoroughly assessed with a risk assessment chart, which is not done on the flight deck. The pilots nevertheless implement mitigation and then reassess and reevaluate the threats and errors as hazards. TEM could become more accurate and be embedded in the airline’s SMS simply by having aircrews voluntarily add the threats and errors from each flight to an anonymous database. Pilot safety experts from that airline, and potentially AI, could then assess and rate the threats and errors as hazards, along with how they were mitigated during the flight. This data would close the loop of the current TEM model by keeping the existing model going in CRM while infusing it with feedback from ongoing analyzed data to improve the airline’s TEM.

FRM has been growing as a human factors, SMS-related method over the last two decades, since aviation experts in the US identified fatigue in the early 1990s as a significant safety threat to aviation. As commercial pilots have had their crew rest regulations vastly improved over the last decade, the FAA has also encouraged airlines to use FRM programs. FRM can currently take an employee’s fatigue assessment and rate that employee on a scale from a hazardous fatigue level to a low level that is acceptable for work. Managers can also use it to recommend mitigation strategies that bring fatigued employees back to a level acceptable for work. While FRM fits the SMS program requirements quite nicely as far as Risk Management, Safety Assurance, Policy, and Promotion are concerned, the biggest challenge to making such systems commonplace is making them more accurate and more efficient to run, especially for other employee groups in the industry, like Aviation Maintenance Technicians and ramp workers, that have very little in the form of rest regulations. The first problem with FRM systems is accuracy: how to account for the many human factors variables that influence each individual’s fatigue, and how best to measure that person’s fatigue through daily physiological measurement. The second problem is mitigation for the personnel who have been identified as a fatigue hazard. Keeping a worker’s fatigue within acceptable working levels could be quite challenging for any manager responsible for that worker. It is quite possible that AI-human teaming with management could prove invaluable to both data collection and mitigation management for fatigued aviation personnel in the future, if AI can be integrated into both the initial fatigue assessment and the fatigue mitigation parts of the FRM process [9].

The ideal example of a proactive, data-driven, human factors integrated SMS solution would be a modern US airline Pilot Safety Action Program (PSAP) combined with the airline’s FOQA program. If the respective PSAP and FOQA committee members broke every PSAP and FOQA event down into a human factors root cause analysis, the root causes would help each committee mitigate events more accurately for each crew involved. Each human factors root cause involved in an event could be rated with the company’s standard risk assessment chart and then sent to an anonymous company database. Over time, commonly identified root cause hazards with the highest safety risk ratings could be identified as hazardous trends and assigned to a special airline committee that develops a mitigation strategy for each trend. Once the mitigation strategy has been implemented, the data would continue to be monitored to ensure that the mitigation is indeed lowering the occurrence of the hazardous trend and that its risk is managed. The key to this example is good data collection in the PSAP reporting system combined with the data analysis of FOQA to form a strong and proactive event database. The human factors root cause analysis of each PSAP and FOQA event could then be expanded into a human factors root cause database used to identify hazardous trends. That database serves the SMS and its risk management process by proactively identifying, analyzing, and mitigating hazardous flight trends.
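As a rough sketch of how such a root cause database might surface hazardous trends, the Python below groups rated events by root cause and keeps the causes that recur with a high average rating. The event records, the `hazardous_trends` and `mitigation_is_working` helpers, and both thresholds are hypothetical; they stand in for whatever an airline's committee and risk chart would actually define.

```python
from collections import defaultdict

# Each record: (program, human factors root cause, risk score from the company chart).
# All values are invented for illustration.
events = [
    ("PSAP", "fatigue", 12),
    ("FOQA", "automation mode confusion", 15),
    ("PSAP", "fatigue", 16),
    ("FOQA", "fatigue", 9),
    ("PSAP", "communication breakdown", 6),
    ("FOQA", "automation mode confusion", 10),
]


def hazardous_trends(records, min_count=2, min_mean_risk=10.0):
    """Group events by root cause and keep recurring, high-risk causes.

    Returns (cause, count, mean_risk) tuples, highest mean risk first.
    The thresholds are placeholders for whatever a safety committee would set.
    """
    scores = defaultdict(list)
    for _program, cause, risk in records:
        scores[cause].append(risk)
    trends = [
        (cause, len(vals), sum(vals) / len(vals))
        for cause, vals in scores.items()
        if len(vals) >= min_count and sum(vals) / len(vals) >= min_mean_risk
    ]
    return sorted(trends, key=lambda t: t[2], reverse=True)


def mitigation_is_working(count_before, count_after):
    """Crude monitoring check: did occurrences drop after the mitigation?"""
    return count_after < count_before


trends = hazardous_trends(events)
for cause, count, mean_risk in trends:
    print(f"{cause}: {count} events, mean risk {mean_risk:.1f}")
```

A real monitoring step would compare occurrence rates over equal exposure (for example, per thousand flights) rather than raw counts; the simple comparison here only marks where that check belongs in the loop.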

The primary goal of identifying the human factors elements that influence SMS objectives is to recognize which issues invite or result in human or system error and how error makes degraded performance more or less likely. Human factors principles and elements are prominent in SMS, FRM, and most risk management and assessment efforts. It is important to understand that human factors span a wide range of settings and kinds of performance. Ergonomics, anthropometric measures, cognitive processing, automation interfaces, and many other considerations come under the human factors umbrella. Typically, these factors are dynamic, subject to individual variations and permutations, and often difficult to predict or calculate. For these reasons, it is sound practice to include a human factors specialist in risk assessment planning and procedures.

8. Proactive risk analysis with XAI

Explainable Artificial Intelligence (XAI) is a set of methods that helps humans understand how machine learning or deep learning algorithms generate their outputs. It helps characterize model correctness, fairness, and transparency, enhancing AI-assisted decision-making.

Integrating artificial intelligence (AI) and explainable AI (XAI) into aviation risk assessment is transforming safety and managerial procedures. Proactive risk analysis with XAI is a ground-breaking method that blends advanced AI methods with the need for transparency and interpretability in safety-critical practices. This approach seeks to predict and mitigate risks before occurrences or mishaps happen. It represents a radical advance in aviation safety, leveraging the capability of AI while preserving the critical role of human control and understanding in safety-critical decisions.

As we can see in Figure 3, Explainable Artificial Intelligence (XAI) is at the heart of operations. XAI employs methods and techniques to generate clear explanations about how the AI operates and the rationale behind its decisions, thereby fostering human trust. XAI in aviation provides transparent risk assessments with clear justifications.

Figure 3. XAI risk assessment.

This approach has many advantages, from enhanced predictive capabilities in identifying subtle patterns and potential risks [10] to real-time analysis that enables proactive risk management [11]. It improves decision-making, supports comprehensive risk assessments [12], and increases efficiency. It also enables standardization of different safety protocols across the industry [13].

This approach encompasses multiple stages to ensure comprehensive risk management. In the first stage, aviation-specific data is collected from different sources [12]. In the second, advanced AI and machine learning algorithms analyze the data and identify potential risks [10]. In the third, the XAI layer applies techniques such as decision trees or rules-based explanations [14, 15]. In stage four, the AI system flags possible risks or unusual patterns within operations. In stage five, aviation safety experts evaluate the AI’s results and the XAI-generated explanations, begin the risk assessment evaluation, and craft mitigation strategies. Lastly, continuous monitoring checks the effectiveness of the strategies while the predictions continue to be refined.
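To make the flagging and explanation stages concrete, here is a minimal Python sketch of a rules-based explanation: every flag is returned together with the rules that produced it, so a safety expert can see the rationale. The rules, point values, thresholds, and flight fields are all invented for illustration and do not come from this chapter or from any regulation. Post-hoc explainers for learned models, such as those in [14, 15], work differently; this sketch deliberately swaps in a hand-written rule set because it is the simplest transparent case.

```python
# Hypothetical rules: (name, predicate, points). None of these thresholds come
# from the chapter or from any regulation; they only make the sketch runnable.
RULES = [
    ("crosswind above 25 kt", lambda f: f["crosswind_kt"] > 25, 3),
    ("crew duty time above 12 h", lambda f: f["duty_hours"] > 12, 2),
    ("deferred maintenance items open", lambda f: f["open_defects"] > 0, 1),
]

FLAG_THRESHOLD = 3  # assumed cut-off for sending a flight to expert review


def assess(flight):
    """Score one flight and return the reasons alongside the score."""
    fired = [(name, pts) for name, test, pts in RULES if test(flight)]
    score = sum(pts for _name, pts in fired)
    return {
        "flight": flight["id"],
        "score": score,
        "flagged": score >= FLAG_THRESHOLD,
        "explanation": [name for name, _pts in fired],
    }


# Invented input records standing in for the data collection stage.
flights = [
    {"id": "A101", "crosswind_kt": 30, "duty_hours": 9, "open_defects": 0},
    {"id": "B202", "crosswind_kt": 10, "duty_hours": 13, "open_defects": 2},
    {"id": "C303", "crosswind_kt": 8, "duty_hours": 7, "open_defects": 0},
]

results = [assess(f) for f in flights]
for result in results:
    if result["flagged"]:
        print(result["flight"], "flagged because:", "; ".join(result["explanation"]))
```

The expert review and continuous monitoring stages remain human tasks in this sketch: the output is a list of flagged flights with reasons, not a decision.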

This approach presents challenges that require ongoing attention and refinement. It depends on the quality of the data and a significant data infrastructure. AI and predictive models are complex and need expertise in aviation and AI/ML; some stakeholders still have trust issues with the system [11, 14]. Moreover, there is potential bias [16] combined with high initial costs [17] and some legal and compliance issues [12].

More debate is needed on the potential influence of AI on risk assessment practices. One factor, though, remains constant: the outcome depends on the willingness of management to accept and implement the recommendations of XAI. Delivering a more incisive and comprehensive evaluation will confront management with the same dilemma that has persisted for decades: disruption of established patterns, added cost, and the need for reliable implementation of the changes identified as necessary. Acting upon XAI recommendations is still within the province of management responsibilities and is, consequently, a separate risk category altogether.

An area of rapid growth, through applications with Safety Management Systems models and processes, is the use of AI and machine learning to capture risk-related elements in several arenas, such as organizational practices, operator performance, and adherence to engineered standards. The computational statistics that form the basis of machine learning are not regarded as representative of general human intelligence or as capable of human cognitive operations. The primary interest is prediction, derived from data analysis, in support of the decision-making process [18]. As data aggregation methods and filtering processes become more available, organizations can experience substantial upgrades in risk management analyses. Some aircraft systems upload information regularly, and this information can be parsed into the appropriate elements of the risk assessment algorithms.

A factor related to trust in AI decisions is the boundary conditions for the perceptions they generate. Fairness in AI-generated decisions has also been studied [19] to determine when recommendations by algorithms are preferred over those by humans, an effect known as algorithmic appreciation. Generally, this occurs when AI decisions are compared with human expert decisions, although the same does not hold true when non-experts are involved. An added feature is the perception of risk associated with AI decisions, which makes human decisions more likely to take these considerations into account.

9. Conclusions

This chapter has reviewed risk assessment in aviation operations, from risk management fundamentals through SMS and human factors to proactive risk analysis with XAI. Large-scale decision-making is essential to daily activities in aviation operations and risk assessment. When these decisions are made within opaque systems, people must be able to understand how specific decisions are reached. Additionally, policymakers in aviation operations are increasingly concerned about the loss of explainability in AI-based system models, which is a primary issue affecting the acceptability of AI-based decisions.

SMS is having a major impact by allowing risk management to be applied accurately across the industry. The industry will become safer and more efficient as human factors are integrated further into risk management and SMS to address the high percentage of accidents involving human error. Meanwhile, implementing AI in various SMS risk management systems looks promising for increasing data gathering and analysis while vastly improving the decision-making capability of the human-machine team.

Explainable AI (XAI) plays a crucial part in aviation operations. Understanding the mathematical foundations of existing machine learning architectures may offer insights into how and why a result was obtained but not into the inner workings of the models themselves. Explainability is essential for fostering trust and confidence in AI models within organizations. Moreover, it allows organizations to adopt a responsible approach to AI development.

Conflict of interest

The authors declare no conflict of interest.

References

  1. Ericson CA. Hazard Analysis Techniques for System Safety. 2nd ed. Wiley; 20 July 2015. p. 640. ISBN-10: 1118940385. ISBN-13: 978-1118940389
  2. ICAO Safety Management Manual Doc 9859. Available from: https://skybrary.aero/sites/default/files/bookshelf/5863.pdf
  3. IATA Integrated Risk and Resilience Management Manual (IRRM). Available from: https://www.iata.org/en/publications/store/integrated-risk-resilience-management-manual/
  4. US Department of Transportation Federal Aviation Administration (FAA) Flight Standard Service. Risk Management Handbook. 2022. Available from: https://www.faa.gov/sites/faa.gov/files/202206/risk_management_handbook_2A.pdf
  5. ESARR4 Risk Assessment and Management in ATM (Eurocontrol). 2001. Available from: https://www.eurocontrol.int/sites/default/files/2019-06/esarr4-e10.pdf
  6. FAA Airport Cooperative Research Program. ACRP Report 50: Improved Models for Risk Assessment of Runway Safety Areas. Available from: https://www.icao.int/sam/documents/2011/agaaserostudies/acrp%2004-08%2050-draft%20final%20report.pdf
  7. FAA Office of Airports Safety Management Systems. 2023. Available from: https://www.faa.gov/sites/faa.gov/files/ARP-SMS-Desk-Ref-v2-0.pdf
  8. ISO 31000. Risk Management – Guidelines, International Standard. Geneva: International Organization for Standardization; 2018
  9. Miller M, Mrusek B, Herbic J, Holley S, Halawi L. Implementing an AI fatigue risk management system for aviation maintenance SMS: A technology enhanced critical process human factors safety plan. In: Proceedings of the AHFE. Honolulu, Hawaii; 2024. Available from: https://www.hawaii.ahfe.org/program.html
  10. Degas A et al. A survey on artificial intelligence (AI) and eXplainable AI in air traffic management: Current trends and development with future research trajectory. Applied Sciences. 2022;12(3):1295. DOI: 10.3390/app12031295
  11. Ayhan S, Pesce J, Comitz P, Sweet D, Bliesner S, Gerberick G. Predictive analytics with aviation big data. In: 2013 Integrated Communications, Navigation and Surveillance Conference (ICNS), Herndon, VA, USA. 2013. pp. 1-13. DOI: 10.1109/ICNSurv.2013.6548556
  12. Kistan T, Gardi A, Sabatini R. Machine learning and cognitive ergonomics in air traffic management: Recent developments and considerations for certification. Aerospace. 2018;5(4):103. Available from: https://www.proquest.com/scholarly-journals/machine-learning-cognitive-ergonomics-air-traffic/docview/2582792906/se-2. DOI: 10.3390/aerospace5040103
  13. Christine B et al. Hazards identification and analysis for unmanned aircraft system operations. In: 17th AIAA Aviation Technology, Integration, and Operations Conference. Denver, Colorado; 5-9 June 2017. DOI: 10.2514/6.2017-3269. Available from: https://arc.aiaa.org/doi/book/10.2514/MATIO17
  14. Ribeiro MT, Singh S, Guestrin C. ‘Why should I trust you?’: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM; 2016. pp. 1135-1144
  15. Lundberg S, Lee S-I. A unified approach to interpreting model predictions. arXiv. 2017. DOI: 10.48550/arxiv.1705.07874
  16. Mehrabi N et al. A survey on bias and fairness in machine learning. ACM Computing Surveys. 2021;54(6):1-35. DOI: 10.1145/3457607
  17. Gürbüz F, Özbakır L, Yapıcı H. Explainable artificial intelligence for aviation safety: A review. Expert Systems with Applications. 2023;213:118877
  18. Agrawal A et al. Exploring the impact of artificial intelligence: Prediction versus judgment. Information Economics and Policy. 2019;47:1-6. DOI: 10.1016/j.infoecopol.2019.05.001
  19. Araujo T et al. In AI we trust? Perceptions about automated decision-making by artificial intelligence. AI and Society. 2020;35(3):611-623. DOI: 10.1007/s00146-019-00931-w
