What is FTA?
Fault tree analysis (FTA) offers one approach to root cause analysis, identifying and analyzing the root of asset issues before equipment breaks down. FTA helps in manufacturing facilities, where understanding the potential causes of system failures is crucial to preventing them.
Fault tree analysis is a deductive, top-down approach to determining the cause of a specific undesired event within a complex system. It involves breaking down the root cause of a failure into its contributing factors and representing it through a graphical model called a fault tree, which helps managers and engineers identify potential failure modes—and the probability of each failure mode—for safety and reliability analyses.
First developed in the early 1960s by Bell Laboratories to help the US Air Force understand potential flaws in the Minuteman missile system, FTA has been widely used across various industries, including the aerospace, nuclear power, chemical and automotive sectors, among others.
Maintenance managers might use fault tree analysis to:
As manufacturing environments become more complex, effective risk management tools like FTA become increasingly important. Incorporating fault tree analyses into your safety analyses and reliability engineering practices can give your organization deeper insight into potential causes of system failure. FTA can also help improve overall performance and reduce the likelihood of costly and potentially catastrophic incidents.
Performing a fault tree analysis
Performing a fault tree analysis is a complex process that involves seven key steps.
Step 1: Define the undesired event
Before running your analysis, you should clearly define the undesired event you want to analyze. This event should be specific and measurable, like a component failure or a system malfunction. It’s also important to define the event in clear, consistent terms, since it serves as the starting point for your fault tree diagram.
Step 2: Identify the contributing events and factors
Once you define the undesired event, you should start to identify the factors and events that might contribute to its occurrence. Contributing factors tend to fall into two broad categories: basic events and intermediate events.
Basic events, which cannot be broken down into simpler events, are the most fundamental events in a fault tree and represent the lowest level of events you can analyze. A basic event in a fault tree for a car accident, for example, might be "tire blows out".
Intermediate events are located between the lower-level basic events and the top event (the primary undesired event being analyzed). Intermediate events are caused by other events in the fault tree and, in turn, cause other events. They represent higher-level events that can be analyzed further. Using the same car accident as an example, an intermediate event in the fault tree might be "the driver loses control of the vehicle".
Be sure to consider both internal and external events, like component failures, human error and environmental conditions. You might need to consult with subject matter experts and/or review historical data, incident reports and maintenance records at this stage of the analysis.
Step 3: Construct the fault tree
Using standard gate symbols and event symbols, construct a graphical representation of the relationships between the undesired (or output) event and its contributing factors (also called input events). The fault tree should be organized hierarchically, with the undesired event at the top and the contributing factors branching out below it.
Laying out basic events is straightforward, since basic events cannot be broken down any further. However, including intermediate events is a bit more complex, as intermediate events require Boolean logic gates that indicate the relationships between top-level, intermediate and basic events.
There are two main types of logic gates used in fault trees: AND gates and OR gates.
Though less commonly used, NOT gates, XOR gates, K/N gates and INHIBIT gates can also help identify specific relationships between input and output events.
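The gate types named above can be read as Boolean functions of their input events. The following is a minimal sketch of those semantics, not a standard library: the function names and the pump and power events are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def and_gate(inputs: Sequence[bool]) -> bool:
    """Output event occurs only if every input event occurs."""
    return all(inputs)


def or_gate(inputs: Sequence[bool]) -> bool:
    """Output event occurs if at least one input event occurs."""
    return any(inputs)


def not_gate(value: bool) -> bool:
    """Output event occurs only if the input event does not occur."""
    return not value


def xor_gate(inputs: Sequence[bool]) -> bool:
    """Output event occurs if exactly one input event occurs."""
    return sum(inputs) == 1


def k_of_n_gate(inputs: Sequence[bool], k: int) -> bool:
    """Output event occurs if at least k of the n input events occur."""
    return sum(inputs) >= k


def inhibit_gate(input_event: bool, condition: bool) -> bool:
    """Output event occurs if the input occurs while the condition holds."""
    return input_event and condition


# Hypothetical states of three basic events (True means the event occurred).
pump_a_failed, pump_b_failed, power_lost = True, False, False

# Intermediate event: both redundant pumps must fail.
both_pumps_failed = and_gate([pump_a_failed, pump_b_failed])

# Top event: coolant flow is lost if both pumps fail or power is lost.
no_coolant_flow = or_gate([both_pumps_failed, power_lost])
```

In this example one pump failure alone does not trigger the top event, because the AND gate models the redundancy between the two pumps.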
Intermediate events can also include undeveloped events, which are events that aren’t fully understood or haven’t been fully analyzed.
Using the various available gates will help you create a comprehensive fault tree that captures the complex interactions between the various events and factors that precipitated the undesired event.
Building a fault tree is an iterative process, so you continue to break down contributing events into their basic sub-events until the events cannot be parsed out any further. As you get new information and/or system conditions change, you might need to make several adjustments to refine the fault tree.
Step 4: Gather failure data
In order to quantify the risks associated with the undesired event, you need to gather failure data (from historical records, industry databases, expert opinions, etc.) for the basic events in the fault tree. The failure data should be expressed as failure probabilities or failure rates, depending on the type of analysis you’re conducting.
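Failure rates and failure probabilities are related but not interchangeable. As a rough sketch, and assuming a constant failure rate (the exponential model, which is an assumption rather than something the records guarantee), a rate estimated from maintenance history can be converted into a probability of failure over a chosen time window. The function names and the figures below are hypothetical.

```python
import math


def failure_rate_from_history(failures: int, operating_hours: float) -> float:
    """Point estimate of a constant failure rate (failures per hour)."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return failures / operating_hours


def failure_probability(rate_per_hour: float, mission_hours: float) -> float:
    """Probability of at least one failure within mission_hours.

    Assumes a constant failure rate: P = 1 - exp(-rate * time).
    """
    if rate_per_hour < 0 or mission_hours < 0:
        raise ValueError("rate and time must be non-negative")
    # expm1 keeps precision when rate * time is very small.
    return -math.expm1(-rate_per_hour * mission_hours)


# Hypothetical record: 3 failures observed over 60,000 operating hours.
rate = failure_rate_from_history(3, 60_000)   # 5e-05 failures per hour
p_1000h = failure_probability(rate, 1_000)    # roughly 0.049
```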
Step 5: Perform the analysis
Once you construct the fault tree and gather the failure data, you perform the analysis, wherein you calculate the probability of the undesired event occurring and identify the most critical contributing factors. Use either a qualitative or a quantitative data analysis method.
A qualitative analysis focuses on understanding the structure of the fault tree, the relationships between events, and the identification of critical paths and minimal cut sets (the smallest set of events that can create the undesired event). Qualitative analysis can help prioritize remedial actions and identify areas for further investigation.
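To make the idea of minimal cut sets concrete, here is a small sketch that expands a tree of AND and OR gates into its cut sets and then discards any set that contains a smaller one. The tuple-based tree representation and the event names are assumptions made for this example. It is a simple expansion suitable for small trees, not an optimized algorithm, and it handles only AND and OR gates.

```python
from itertools import product

# A node is either a basic event (a string) or a gate: (gate_type, children).


def cut_sets(node):
    """Return every cut set of the tree as a set of frozensets."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any single child's cut set is enough to cause the output event.
        return set().union(*child_sets)
    if gate == "AND":
        # One cut set from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unsupported gate: {gate}")


def minimal_cut_sets(node):
    """Keep only cut sets that have no proper subset that is also a cut set."""
    all_sets = cut_sets(node)
    return {s for s in all_sets if not any(other < s for other in all_sets)}


# Hypothetical tree: flow is lost if both pumps fail, or power is lost,
# or power is lost while pump A has failed (a redundant branch).
tree = ("OR", [
    ("AND", ["pump A fails", "pump B fails"]),
    "power lost",
    ("AND", ["power lost", "pump A fails"]),
])

mcs = minimal_cut_sets(tree)
for events in sorted(sorted(s) for s in mcs):
    print(events)
```

Here the redundant branch drops out because "power lost" alone is already sufficient, leaving two minimal cut sets: the single event "power lost" and the pair of pump failures. Single-event cut sets are often a natural first target for remedial action.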
A quantitative methodology, on the other hand, involves calculating the probability of the undesired event occurring based on the failure probabilities of the basic events. Quantitative analysis can help inform risk management decisions and evaluate the effectiveness of proposed improvements.
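As an illustration of the quantitative calculation, the sketch below propagates basic-event probabilities up through AND and OR gates. It assumes the basic events are statistically independent and that each appears only once in the tree. When events repeat across branches, this simple bottom-up product is no longer exact. The tree layout, event names and probabilities are hypothetical.

```python
from math import prod

# A node is either a basic event (a string) or a gate: (gate_type, children).


def top_event_probability(node, probabilities):
    """Probability of the node's event, assuming independent basic events."""
    if isinstance(node, str):
        return probabilities[node]
    gate, children = node
    child_probs = [top_event_probability(c, probabilities) for c in children]
    if gate == "AND":
        # All inputs must occur: multiply their probabilities.
        return prod(child_probs)
    if gate == "OR":
        # At least one input occurs: complement of none occurring.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unsupported gate: {gate}")


tree = ("OR", [
    ("AND", ["pump A fails", "pump B fails"]),
    "power lost",
])

# Hypothetical failure probabilities over the same time window.
probabilities = {
    "pump A fails": 0.05,
    "pump B fails": 0.05,
    "power lost": 0.01,
}

p_top = top_event_probability(tree, probabilities)  # about 0.0125
```

With these numbers the redundant pumps contribute 0.0025 and the top event comes out near 0.0125, so most of the risk comes from the single power-loss event.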
Step 6: Interpret the results
After performing the analysis, it’s time to interpret your results and communicate any relevant information to the necessary stakeholders.
The results of a fault tree analysis depend on the quality of the input data and the assumptions made during the analysis. As such, you should view the results as a starting point for further investigation and validation, rather than a definitive conclusion.
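One simple way to see how much a result leans on uncertain inputs is to perturb each basic-event probability in turn and watch how the top-event probability moves. The sketch below does this for a small hypothetical model. The `top_probability` function, the event names, the probabilities and the doubling factor are all assumptions made for illustration, and the model treats the events as independent.

```python
def sensitivity(model, base, factor=2.0):
    """Change in the model output when each input is scaled by `factor`."""
    baseline = model(base)
    changes = {}
    for name, p in base.items():
        perturbed = dict(base)
        perturbed[name] = min(1.0, p * factor)  # a probability cannot exceed 1
        changes[name] = model(perturbed) - baseline
    return changes


def top_probability(p):
    """Hypothetical tree: (pump A AND pump B) OR power lost."""
    both_pumps = p["pump A fails"] * p["pump B fails"]
    return 1.0 - (1.0 - both_pumps) * (1.0 - p["power lost"])


base = {"pump A fails": 0.05, "pump B fails": 0.05, "power lost": 0.01}

changes = sensitivity(top_probability, base)
most_sensitive = max(changes, key=changes.get)
```

In this example, doubling the power-loss probability moves the result far more than doubling either pump probability, which suggests where better data would be most valuable before acting on the conclusion.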
Step 7: Implement improvements and monitor progress
Based on the findings of the fault tree analysis, you implement preventive measures and improvements as necessary to eliminate or decrease the likelihood of an undesired event. Afterward, be sure to monitor the performance of these improvements and continually update the fault tree to reflect any changes in system design, operating conditions or component performance, so that your tree remains accurate and useful to your organization.