10 Complex components

10.1 Introduction

The complex component model has been developed in order to support the calculation of basic parameters for components characterized by multiple, time-variant failure modes (the component events). Thus the “bath tub” curve of failure rates can be modeled. Each failure mode can be defined to be either safe or dangerous.

In the previous sections on fault trees and Markov models, the mean value $\bar{h}$ has been calculated as the arithmetic mean $\bar{h} = h_{avg} = \frac{1}{T}\int_0^T h(t)\,dt$ over the given system lifetime T. This arithmetic mean can be used only if the unavailability of the system is small. For all practical systems for which the mean failure rate (or hazard rate) $\bar{h}$ is of interest at all, this condition is fulfilled, because these systems are restored (repaired or replaced) when they are defective. Therefore, even if the system is not used continuously and a failure might not be detected immediately, the arithmetic mean is a good approximation of the mean hazard frequency over the system lifetime. The arithmetic mean can also be used if the system lifetime T is short compared to the mean lifetime of each component, i. e. if the MTTF of each component is greater than the system lifetime.
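The arithmetic mean above can be checked numerically. The following is a minimal sketch, not the tool's implementation; the function name `mean_hazard`, the midpoint rule and the parameter values are assumptions chosen for illustration. The hazard used is the Weibull form that section 10.4.2 assumes for component events.

```python
def mean_hazard(h, T, steps=10_000):
    """Arithmetic mean (1/T) * integral_0^T h(t) dt, using the midpoint rule."""
    dt = T / steps
    return sum(h((i + 0.5) * dt) for i in range(steps)) * dt / T

# Hypothetical wear-out mode: Weibull hazard with lam = 1e-5 /h and k = 3
lam, k = 1e-5, 3.0

def h(t):
    return lam * k * (lam * t) ** (k - 1)

T = 50_000.0                      # assumed system lifetime in hours
h_avg = mean_hazard(h, T)

# For this hazard the integral has the closed form (lam*T)**k, so the
# arithmetic mean can be cross-checked against (lam*T)**k / T.
h_avg_closed_form = (lam * T) ** k / T
```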

In order to understand the necessity of the complex component model, have a look at figure 67.



Figure 67: A component with several failure modes.


This figure shows typical variations of the failure rate h(t), the failure density f(t) and the unreliability F(t) over time. If you have such components in your system, you have to determine either a mean failure rate λ = $\bar{h}$ that can be used in a fault tree or Markov model in steady-state evaluation, or a time-dependent failure rate h(t) that can be used for transient evaluation. If the component is very “good”, i. e. if it will probably not fail during the system lifetime, you can use the arithmetic mean failure rate $\bar{h}$ in other models. For transient evaluation, you can use the time-dependent failure rate h(t) directly in that case.

If the component is likely to fail during the system lifetime, its mean failure rate $\bar{h}$ has to be calculated via its MTTF (footnote 5). For all failure distributions, the MTTF is given by

\[
\mathrm{MTTF} = \int_0^\infty t \cdot f(t)\,dt \tag{54}
\]

With the general relationship between any failure density and failure rate function

\[
f(t) = h(t) \cdot R(t) = h(t) \cdot e^{-\int_0^t h(\tau)\,d\tau} \tag{55}
\]

the MTTF can be calculated based on the failure rate function h(t) by

\[
\mathrm{MTTF} = \int_0^\infty t \cdot h(t) \cdot e^{-\int_0^t h(\tau)\,d\tau}\,dt \tag{56}
\]
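Equation (56) can be approximated with simple quadrature. The sketch below is an illustration under stated assumptions, not the algorithm used by the tool: the function name `mttf_from_hazard`, the truncation of the upper limit at `t_max` and the midpoint rule are choices made here. The inner integral is accumulated while the outer integral is summed, so one pass is enough.

```python
import math

def mttf_from_hazard(h, t_max, steps=100_000):
    """Approximate equation (56) on [0, t_max] with the midpoint rule.

    t_max must be large enough that R(t_max) is negligible.
    """
    dt = t_max / steps
    H = 0.0        # cumulative hazard up to the left edge of the current step
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        ht = h(t)
        H_mid = H + 0.5 * ht * dt          # cumulative hazard at the midpoint
        total += t * ht * math.exp(-H_mid) * dt
        H += ht * dt
    return total

# Hypothetical constant failure rate: the result should be close to 1/lam.
lam = 1e-4
mttf_const = mttf_from_hazard(lambda t: lam, t_max=20.0 / lam)

# Hypothetical Weibull wear-out mode (lam_w, k): the known closed form is
# gamma(1 + 1/k) / lam_w, which serves as a cross-check.
lam_w, k = 1e-5, 3.0
mttf_weibull = mttf_from_hazard(lambda t: lam_w * k * (lam_w * t) ** (k - 1),
                                t_max=5.0 / lam_w)
```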

This is the complete MTTF or natural MTTF, which can be found by experiment if you operate many of these components until they fail: the arithmetic mean of all times to failure is the MTTF. For the component shown in figure 67, the natural MTTF is about 25 000 h. If you use this component in a system that you intend to operate for 200 000 h, you will need 7 (maybe 8) spare components per system throughout its lifetime. The mean failure rate is $\bar{h} = 1/\mathrm{MTTF} \approx$ 4E-5 /h (footnote 6).

If a failure of this component is associated with significant damage to other parts of the system (for example the rupture of the timing belt of a Diesel engine), or if the failure is even safety critical, and the failure rate function has a considerably increasing part (similar to that shown in figure 67), preventive change makes sense. In case of preventive change at time T = Tchange, the incomplete MTTF(T) or effective MTTF(T) is given by



\[
\begin{aligned}
\mathrm{MTTF}(T)
&= \frac{\int_0^T t \cdot f(t)\,dt + T \cdot R(T)}{F(T)}
 = \frac{\int_0^T t \cdot f(t)\,dt + T \cdot (1 - F(T))}{F(T)}
 = \frac{\int_0^T t \cdot f(t)\,dt + T}{F(T)} - T \\
&= \frac{\int_0^T t \cdot h(t) \cdot e^{-\int_0^t h(\tau)\,d\tau}\,dt + T}{1 - e^{-\int_0^T h(t)\,dt}} - T
\end{aligned}
\tag{57}
\]
For T → ∞ the incomplete MTTF(T) converges to the complete MTTF.
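The effective MTTF(T) of equation (57) can be evaluated numerically in the same way. This is a minimal sketch under assumptions (function name `effective_mttf`, midpoint rule, hypothetical parameter values), not the tool's own procedure.

```python
import math

def effective_mttf(h, T, steps=100_000):
    """Equation (57), first form: (integral_0^T t*f(t) dt + T*R(T)) / F(T)."""
    dt = T / steps
    H = 0.0            # cumulative hazard up to the left edge of the step
    integral = 0.0     # integral_0^T t * h(t) * R(t) dt
    for i in range(steps):
        t = (i + 0.5) * dt
        ht = h(t)
        integral += t * ht * math.exp(-(H + 0.5 * ht * dt)) * dt
        H += ht * dt
    R_T = math.exp(-H)
    F_T = -math.expm1(-H)      # 1 - exp(-H), computed without cancellation
    return (integral + T * R_T) / F_T

# Hypothetical Weibull wear-out mode with lam = 4e-5 /h and k = 3
lam, k = 4e-5, 3.0

def wear_out(t):
    return lam * k * (lam * t) ** (k - 1)

mttf_short = effective_mttf(wear_out, T=10_000.0)   # early preventive change
mttf_long = effective_mttf(wear_out, T=50_000.0)    # late preventive change
```

For a constant failure rate the result does not depend on T and equals 1/λ, so preventive change only pays off for failure modes with an increasing rate; for the wear-out mode above, the shorter interval yields the larger effective MTTF.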

Finally imagine some kind of a normally closed electromagnetic valve with the following main failure modes:

Let’s assume that failures to open are safe and failures to close are dangerous in a given application. Let’s further assume that some failures are early failures (production faults), some are late failures (due to wear), and some have a nearly constant failure rate. Obviously you have to distinguish between safe and dangerous failure modes in some way, and thus between an overall component failure rate h or h(t) and a dangerous component failure rate hd or hd(t). Finally, let’s assume that the valve is supposed to be replaced at certain intervals. How will this action affect the overall dangerous failure rate, and what is the optimal replacement interval?

The effective dangerous MTTF(T) is given by



\[
\begin{aligned}
\mathrm{MTTF}_d(T)
&= \frac{\int_0^T t \cdot f(t)\,dt + T \cdot R(T)}{\int_0^T \varphi_d(t)\,dt}
 = \frac{\int_0^T t \cdot h(t) \cdot R(t)\,dt + T \cdot R(T)}{\int_0^T h_d(t) \cdot R(t)\,dt} \\
&= \frac{\int_0^T t \cdot h(t) \cdot e^{-\int_0^t h(\tau)\,d\tau}\,dt + T \cdot e^{-\int_0^T h(t)\,dt}}{\int_0^T h_d(t) \cdot e^{-\int_0^t h(\tau)\,d\tau}\,dt}
\end{aligned}
\tag{58}
\]
where hd(t) denotes the dangerous failure rate, h(t) the overall failure rate and φd(t) = hd(t) ⋅ R(t) the dangerous failure density. Obviously, this formula cannot be evaluated with general-purpose spreadsheet tools or calculators, because the nested integrals cannot be expressed in closed form.
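Although no closed form exists, the nested integrals of equation (58) can be approximated numerically in a single pass. The following is a sketch under assumptions, not the algorithm of the tool: the function name `effective_mttf_dang`, the midpoint rule and the rates in the example are all hypothetical.

```python
import math

def effective_mttf_dang(h_all, h_dang, T, steps=100_000):
    """Equation (58): numerator as in (57), denominator integral_0^T hd(t)*R(t) dt."""
    dt = T / steps
    H = 0.0       # cumulative overall hazard up to the left edge of the step
    num = 0.0     # integral_0^T t * h(t) * R(t) dt
    den = 0.0     # integral_0^T hd(t) * R(t) dt
    for i in range(steps):
        t = (i + 0.5) * dt
        h = h_all(t)
        R_mid = math.exp(-(H + 0.5 * h * dt))   # reliability at the midpoint
        num += t * h * R_mid * dt
        den += h_dang(t) * R_mid * dt
        H += h * dt
    num += T * math.exp(-H)                     # term T * R(T)
    return num / den

# Hypothetical valve: constant safe rate 3e-5 /h, constant dangerous rate 1e-5 /h
h_safe, h_d = 3e-5, 1e-5
mttf_d = effective_mttf_dang(lambda t: h_safe + h_d, lambda t: h_d, T=20_000.0)
```

With constant rates the result reduces to 1/hd regardless of T, which is a convenient plausibility check before time-variant failure modes are added.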

In contrast to all other kinds of models, complex components make no use of generic basic events, since their failure modes are stated directly in the model. Each component event (failure mode) can be specified to be safe or dangerous, see section 10.4 below for details.

Summary

The complex component model calculates the following values from the component events and the component properties:

Mean time to failure:
The (natural, complete) MTTF is the mean life time of the component. It is relevant if it is short compared to the (intended) system lifetime and the component is not exchanged in certain intervals, but operated until it fails.
Mean time to dangerous failure:
The (natural, complete) MTTFdang is relevant if the component’s overall MTTF is short compared to the (intended) system lifetime and the component is not exchanged in certain intervals, but operated until it fails.
Mean time to dangerous failure with preventive exchange:
If the actual component life time T in the application is much shorter than the MTTF, the failure modes with increasing failure rates will occur rarely, thus the reliability and safety of the component tends to be determined by the failure modes with decreasing or constant failure rates. Therefore in that case the MTTFdang is calculated for the actual mean component life time T as well. Often T is defined by a preventive exchange interval, so this value is named MTTFdang,prev(T). Nevertheless T might also be given by the overall system life time (stated in the project properties dialog) or by the change interval or the MTTF of the assembly group that contains the component.
Mean failure rate:
The mean failure rate λ is the reciprocal of the MTTF.
Mean dangerous failure rate:
The mean dangerous failure rate λdang is the reciprocal of the MTTFdang.
Mean dangerous failure rate with preventive exchange:
The mean dangerous failure rate for shortened life time λprev,dang is the reciprocal of the MTTFdang,prev(T) (if it exists).
Time to 10% failed components:
The time T(B10) at which 10% of a large number of components have failed, either safely or dangerously. This is equivalent to the time at which the overall reliability R(T(B10)) = 0.9 or the overall unreliability F(T(B10)) = 0.1.
Time to 10% dangerously failed components:
The time T(B10d) at which 10% of a large number of components have failed dangerously. This is equivalent to the time at which the dangerous unreliability Fd(T(B10d)) = 0.1. Note that this value exists only if at least 10% of the components can fail dangerously at all, i. e. if not more than 90% of the components have already failed safely before.

The complex component model can be linked to all other models, such as fault trees and Markov models. If the complex component is referenced from other models by links, the mean occurrence rate h, the mean unavailability Q, the unreliability F(T) or the time-variant values h(t), Q(t) or F(t) are transferred as described in section 2.4 and section 10.3 below.

10.2 The Component Properties Panel

The component properties are shown if no event is selected (see figure 68).



Figure 68: A component with several failure modes (component events) with time-variant failure rates.


All properties of the complex component are stored in the component file (extension .cmp). A complex component that has not been saved after the latest modification is marked with an asterisk ‘*’ in its title.

10.2.1 General Properties

Description: A user defined description of the complex component.

10.2.2 Component Properties

Max component life time: If the component life time is defined by preventive exchange in certain intervals or by the life time of the assembly group of which it is part, this can be stated here.

If the stated value is greater than zero, less than the system life time defined in the project properties dialog, and less than the complete MTTF of the component, the stated value will be used for the calculation of the effective MTTF and some other values (see below). Otherwise if the complete MTTF is greater than the system life time defined in the project properties dialog, the system life time is used for the calculation of these values. If both conditions are false, the MTTF is used for further calculations.
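The selection rule in the paragraph above can be written as a small decision function. This is a sketch of the rule as described in the text; the function and parameter names are hypothetical and not taken from the tool.

```python
def effective_lifetime(t_life_max, t_sys, mttf):
    """Pick the time T used for the effective MTTF and related values.

    t_life_max: stated max component life time (0 means "not stated")
    t_sys:      system life time from the project properties
    mttf:       complete (natural) MTTF of the component
    """
    if 0 < t_life_max < t_sys and t_life_max < mttf:
        return t_life_max      # preventive exchange limits the life time
    if mttf > t_sys:
        return t_sys           # component usually outlives the system
    return mttf                # component usually fails first

# Hypothetical values in hours
t_a = effective_lifetime(10_000.0, 200_000.0, 25_000.0)   # stated value applies
t_b = effective_lifetime(0.0, 200_000.0, 1_000_000.0)     # system life time applies
t_c = effective_lifetime(0.0, 200_000.0, 25_000.0)        # MTTF applies
```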

Repairable or replaceable: If the component is tested in certain intervals (and replaced if a fault is detected), select this checkbox and set the following values accordingly.

Test interval: The interval in which the component is tested for hidden (“sleeping”) faults.

Test time offset: If tests of several components are performed with a defined time difference, this offset (related to system start time) can be stated here. Overall system unavailability will become less if tests are performed at different times. This value is only used in transient evaluation.

Repair time: The time needed to repair the component in case a fault has been detected in the test.

10.2.3 Evaluation mode

Note: In this section, whenever h or h(t) is written, the (dangerous) failure rates handed over to any higher level model are meant.

Select which value is of interest. The algorithm is the same in every case; only some warnings and the value(s) displayed in the graphics tab may differ.

As for fault trees and Markov models you can select between steady-state or transient evaluation mode. In case of transient evaluation, the time interval must be set as well.

In case of steady-state evaluation, the mean values h or Q or the maximum unavailability Qmax or the final unreliability F(T) will be handed over to any higher level model (fault tree or Markov model). In case of transient evaluation, if the higher level model is evaluated in transient mode as well, the values h(t), Q(t) or F(t) will be handed over to the higher level model.

10.3 Values handed over to higher level models

The calculation of unavailability and unreliability depends on the expected event that most probably determines the end of life of the component. In reality, the life of a component will end when one of the following events occurs:

Case 1:
The life of the overall system ends (as defined in the project properties dialog).
Case 2:
The life of the subsystem, which contains the component, ends, or the component is replaced in predefined intervals (defined by the parameter component life time, see section 10.2.2.1 above).
Case 3:
The component fails safely (so that the system goes to a safe state). The component must be replaced by a new component before the system can be used again.
Case 4:
The component fails dangerously. This case has three sub-cases:
Case 4a:
The component is part of a safety barrier or a redundancy. Thus its failure doesn’t directly lead to an accident. Those components are normally checked in certain intervals. If a fault is detected, the component is replaced by a new one.
Case 4b:
The dangerous failure of the component directly leads to an accident (or at least to a hazard, that is immediately detected), but not to the loss of the overall system. The component will be replaced after all and the system will continue operation.
Case 4c:
The dangerous failure of the component directly leads to a loss of the system (e. g. a severe accident, that destroys the system).

Depending on the design of the component and the system, one of these cases may be by far the most probable, or it may not be clear which case will usually end the component’s life. In Functional Safety Suite the following definition applies:

corresponding to case 1:
If the (natural) MTTF is greater than the system lifetime Tsys and there is no lifetime limitation set (see section 10.2.2.1 above), it is assumed that the component’s life usually ends with the overall system life. Thus the mean (dangerous) occurrence rate is given by
\[
\bar{h} = 1 / \mathrm{MTTF}_{dang}(T_{sys}) \tag{59}
\]

The dangerous occurrence rate hdang(t) is equal to the sum of the time variant failure rates of the dangerous failure modes

\[
h(t) = h_{dang}(t) = \sum_{i=1}^{n_{dang}} h_{i,dang}(t) \tag{60}
\]

corresponding to case 2:
Else if there is a lifetime limitation Tlife,max (see section 10.2.2.1 above) and the (natural) MTTF is greater than this limitation, it is assumed that the component’s life ends at Tlife,max. Thus the mean (dangerous) occurrence rate is given by
\[
\bar{h} = 1 / \mathrm{MTTF}_{dang}(T_{life,max}) \tag{61}
\]

The dangerous occurrence rate hdang(t) is equal to the sum of the time variant failure rates of the dangerous failure modes

\[
h(t) = h_{dang}(t) = \sum_{i=1}^{n_{dang}} h_{i,dang}(t \bmod T_{life,max}) \tag{62}
\]

corresponding to cases 3 and 4:
Else it is assumed that the component’s life usually ends with component failure. Thus the mean (dangerous) occurrence rate is given by
\[
\bar{h} = 1 / \mathrm{MTTF}_{dang} \tag{63}
\]

Since in these cases the exchange times cannot be predicted, the time-variant (dangerous) occurrence rate hdang(t) cannot be predicted. Therefore the mean occurrence rate h is used also in transient evaluation.
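The case distinction for the time-variant occurrence rate can be sketched as follows. The function name `occurrence_rate` and its parameters are hypothetical; the mean rate for cases 3 and 4 is passed in as a precomputed value (1/MTTFdang), and the sketch only mirrors the rules stated above.

```python
def occurrence_rate(t, dang_modes, mean_rate, mttf, t_sys, t_life_max=None):
    """Dangerous occurrence rate h(t) handed over in transient evaluation.

    dang_modes: list of callables h_i,dang(t), one per dangerous failure mode
    mean_rate:  1/MTTFdang, used when exchange times cannot be predicted
    """
    if t_life_max is None and mttf > t_sys:
        # Case 1: life ends with the system, equation (60)
        return sum(h(t) for h in dang_modes)
    if t_life_max is not None and mttf > t_life_max:
        # Case 2: periodic preventive exchange restarts the clock, equation (62)
        return sum(h(t % t_life_max) for h in dang_modes)
    # Cases 3 and 4: life ends with a failure, the mean rate is used instead
    return mean_rate

# Two hypothetical dangerous failure modes: one constant, one increasing
modes = [lambda t: 1e-6, lambda t: 2e-10 * t]

h_case1 = occurrence_rate(1000.0, modes, 1e-5, mttf=5e5, t_sys=2e5)
h_case2 = occurrence_rate(1200.0, modes, 1e-5, mttf=5e5, t_sys=2e5, t_life_max=500.0)
h_case34 = occurrence_rate(1000.0, modes, 1e-5, mttf=1e4, t_sys=2e5)
```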

Regarding the unavailability Q and Q(t), it is assumed that safe failures don’t contribute to the (safety related) unavailability. This is reasonable, since “safe failure” means that the overall system goes to a “safe state”, so it doesn’t matter for safety how frequently this state is entered and how long the system stays in it. Therefore Q and Q(t) depend on the frequency of dangerous failures and the mean time to detect them. If a dangerous failure cannot be detected (which includes that it doesn’t directly lead to an accident), it remains in the component, and thus in the system, until the component is exchanged due to another event (safe failure of the component, preventive exchange, end of life of the system or assembly group).

10.3.1 Unavailability of periodically tested components

Note: In this section, whenever h or h(t) is written, the (dangerous) failure rates calculated as described in section 10.2.3 are meant.

Components that can fail dangerously without immediately causing an accident are usually periodically tested.

For components that are tested periodically, Q, Qmax and Q(t) are calculated based on the mean occurrence rate similar to generic basic events of type repairable

\[
\bar{Q} = \frac{e^{-\bar{h} \cdot T_{check}} - 1}{\bar{h} \cdot T_{check} + \bar{h} \cdot T_{repair} \cdot (1 - e^{-\bar{h} \cdot T_{check}})} + 1 \tag{64}
\]

In transient evaluation, if Tcheck is greater than 10 times the step time tstep, the current unavailability Q(t) is given by

\[
Q(t) = 1 - (1 - Q_{repair}) \cdot e^{-(1 - Q_{repair}) \cdot \bar{h} \cdot ((t - t_0) \bmod T_{check})} \tag{65}
\]

with t0 being the time to the first test (the “phase shift” of the test) and Qrepair the (mean) unavailability due to the repair time, $Q_{repair} = \frac{\bar{h} \cdot T_{repair}}{\bar{h} \cdot T_{repair} + 1}$:

\[
Q(t) = 1 - \frac{e^{-\frac{\bar{h} \cdot ((t - t_0) \bmod T_{check})}{\bar{h} \cdot T_{repair} + 1}}}{\bar{h} \cdot T_{repair} + 1} \tag{66}
\]

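The maximum unavailability Qmax is reached immediately before a test. It is given by equation (66) with the elapsed time since the last test set to Tcheck:

\[
Q_{max} = 1 - \frac{e^{-\frac{\bar{h} \cdot T_{check}}{\bar{h} \cdot T_{repair} + 1}}}{\bar{h} \cdot T_{repair} + 1} \approx 1 - e^{-\bar{h} \cdot (T_{check} + T_{repair})} \tag{67}
\]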

The return rate used in Markov models in steady-state evaluation is given by

\[
\mu = \frac{\bar{h}}{\bar{Q}} - \bar{h} \tag{68}
\]

In transient evaluation, if Tcheck is greater than 10 times the step time tstep, the return to the origin state is performed cyclically at times $t_n = n \cdot T_{check} + t_0 + T_{repair}$.
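Equations (64) to (68) translate directly into code. The sketch below is an illustration only; the function names and the example values are assumptions, and `h` stands for the mean dangerous occurrence rate.

```python
import math

def q_mean(h, t_check, t_repair):
    """Mean unavailability of a periodically tested component, equation (64)."""
    e_minus_1 = math.expm1(-h * t_check)          # exp(-h*Tcheck) - 1
    return e_minus_1 / (h * t_check + h * t_repair * (-e_minus_1)) + 1.0

def q_repair(h, t_repair):
    """Mean unavailability due to the repair time."""
    return h * t_repair / (h * t_repair + 1.0)

def q_of_t(t, h, t_check, t_repair, t0=0.0):
    """Current unavailability in transient evaluation, equations (65)/(66)."""
    avail = 1.0 - q_repair(h, t_repair)
    elapsed = (t - t0) % t_check                  # time since the last test
    return 1.0 - avail * math.exp(-avail * h * elapsed)

def q_max(h, t_check, t_repair):
    """Maximum unavailability just before a test, exact form of equation (67)."""
    avail = 1.0 - q_repair(h, t_repair)
    return 1.0 - avail * math.exp(-avail * h * t_check)

def return_rate(h, q):
    """Return rate for Markov models in steady-state evaluation, equation (68)."""
    return h / q - h

# Hypothetical component: h = 1e-5 /h, yearly test, 8 h repair time
h, t_check, t_repair = 1e-5, 8760.0, 8.0
q_bar = q_mean(h, t_check, t_repair)
mu = return_rate(h, q_bar)
```

The return rate is chosen so that a two-state Markov model with failure rate h and return rate μ reproduces the mean unavailability, i. e. h / (h + μ) equals the value from equation (64).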

10.3.2 Unavailability for non-tested components

Note: In this section, whenever h or h(t) is written, the (dangerous) failure rates calculated as described in section 10.2.3 are meant.

If the maximum lifetime of the component is limited by the preventive exchange interval Tlife,max, the (dangerous) unavailability Q(t) is given by the (dangerous) unreliability considering the replacements:

\[
Q(t) = F_{dang}(t \bmod T_{life,max}) \tag{69}
\]

The mean unavailability Q is conservatively defined as the maximum unavailability which is equal to the unreliability at the end of the exchange interval Tlife,max:

\[
\bar{Q} \overset{!}{=} Q_{max} = F_{dang}(T_{life,max}) \tag{70}
\]

The same applies if the lifetime is limited by the assembly group (in case the lifetime of the assembly group is not clearly defined, enter a time significantly bigger than the mean lifetime).

If Tlife,max is not given, the maximum lifetime of the component is given by the overall system lifetime Tsys. In that case the dangerous unavailability Q(t) is equal to the unreliability due to the dangerous failure modes:

\[
Q(t) = F_{dang}(t) \tag{71}
\]

The mean unavailability Q is conservatively defined as the maximum unavailability which is equal to the unreliability at the end of the system’s lifetime Tsys:

\[
\bar{Q} \overset{!}{=} Q_{max} = F_{dang}(T_{sys}) \tag{72}
\]
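The rules of this section can be sketched as follows. Note the assumption: the dangerous unreliability is taken here as Fdang(t) = ∫ hd(τ) ⋅ R(τ) dτ from 0 to t, i. e. the integral of the dangerous failure density that appears in the denominator of equation (58). The text does not spell this definition out, and the function names and example rates are hypothetical.

```python
import math

def f_dang(t, h_all, h_dang, steps=20_000):
    """Assumed dangerous unreliability: integral_0^t hd(tau) * R(tau) dtau."""
    if t <= 0.0:
        return 0.0
    dt = t / steps
    H = 0.0        # cumulative overall hazard up to the left edge of the step
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * dt
        h = h_all(tau)
        total += h_dang(tau) * math.exp(-(H + 0.5 * h * dt)) * dt
        H += h * dt
    return total

def q_untested(t, h_all, h_dang, t_life_max=None):
    """Equation (69) with a preventive exchange interval, else equation (71)."""
    if t_life_max is not None:
        return f_dang(t % t_life_max, h_all, h_dang)
    return f_dang(t, h_all, h_dang)

def q_mean_untested(h_all, h_dang, t_sys, t_life_max=None):
    """Equations (70)/(72): conservative mean = value at the end of life."""
    horizon = t_life_max if t_life_max is not None else t_sys
    return f_dang(horizon, h_all, h_dang)

# Hypothetical constant rates: overall 4e-5 /h, of which 1e-5 /h dangerous
h_all = lambda t: 4e-5
h_d = lambda t: 1e-5
q_end_of_interval = q_mean_untested(h_all, h_d, t_sys=200_000.0, t_life_max=20_000.0)
```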

10.4 The component failure mode properties panel

Figure 68 shows a complex component with six component events (different failure modes of the component). If a component event is selected, its properties are shown on the left, see figure 69.



Figure 69: One out of several failure modes of the component.


10.4.1 General Properties

Name: A user defined identifier of the component event. Every component event should have a different name.

Description: A user defined description of the component event.

10.4.2 Probability Values

Weibull distribution parameters: In Functional Safety Suite version 6.0 each component event is assumed to be Weibull distributed. Thus the failure density function f(t) is given as

\[
f(t) = \lambda \cdot k \cdot (\lambda \cdot (t - t_0))^{k-1} \cdot e^{-(\lambda \cdot (t - t_0))^k} \tag{73}
\]

with k being the Weibull exponent.

The failure rate h(t) of the single component event is therefore given by

\[
h(t) = \lambda \cdot k \cdot (\lambda \cdot (t - t_0))^{k-1} \tag{74}
\]

Note that a delay time t0 > 0 only makes sense for decreasing failure rates and thus is allowed only if k 1.
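Equations (73) and (74) can be written down directly, and summing several component events reproduces the “bath tub” curve mentioned in the introduction. The sketch below makes two assumptions that the text does not state: the failure rate is taken as zero before the delay time t0, and the three component events and their parameters are invented for illustration.

```python
import math

def weibull_hazard(t, lam, k, t0=0.0):
    """Failure rate of one component event, equation (74).

    Assumption: the rate is zero before the delay time t0.
    """
    if t < t0:
        return 0.0
    x = lam * (t - t0)
    if x == 0.0 and k < 1.0:
        return math.inf            # decreasing rates diverge at the start
    return lam * k * x ** (k - 1)

def weibull_density(t, lam, k, t0=0.0):
    """Failure density of one component event, equation (73): f = h * R."""
    if t < t0:
        return 0.0
    x = lam * (t - t0)
    return weibull_hazard(t, lam, k, t0) * math.exp(-(x ** k))

# Hypothetical component events as (lam in 1/h, k)
modes = [
    (2e-6, 0.5),   # early failures, decreasing rate
    (5e-6, 1.0),   # random failures, constant rate
    (3e-5, 3.0),   # wear-out, increasing rate
]

def total_hazard(t):
    """Overall failure rate as the sum over all component events."""
    return sum(weibull_hazard(t, lam, k) for lam, k in modes)

early, middle, late = total_hazard(10.0), total_hazard(5_000.0), total_hazard(50_000.0)
```

With these parameters the overall rate is high at the start, lowest in the middle of the life time and rises again towards the end.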

T(B10): For convenience you can also specify the B10 time T(B10) for this failure mode, if you have this value. Nevertheless the Weibull exponent k must be defined as well, with 3.0 being the default.

Specification by T(B10) is only supported for t0 = 0.
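For t0 = 0, the relation between T(B10) and λ follows from setting the Weibull unreliability F(T) = 1 − exp(−(λ ⋅ T)^k) to 0.1. The sketch below shows this conversion; the function names are hypothetical, and whether the tool converts the value in exactly this way is an assumption.

```python
import math

def lam_from_b10(t_b10, k):
    """Solve 1 - exp(-(lam * T_B10)**k) = 0.1 for lam (t0 = 0)."""
    return (-math.log(0.9)) ** (1.0 / k) / t_b10

def b10_from_lam(lam, k):
    """Inverse relation: time at which 10% of the components have failed."""
    return (-math.log(0.9)) ** (1.0 / k) / lam

# Hypothetical failure mode: T(B10) = 10 000 h with the default exponent k = 3
lam = lam_from_b10(10_000.0, 3.0)
unreliability_at_b10 = 1.0 - math.exp(-(lam * 10_000.0) ** 3.0)
```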

Is safe: Each component event can be either safe or dangerous. Note that safe failures need to be modeled as well (if they exist), since they significantly affect the calculated values, including MTTFdang etc.

Note: If you’re performing non-safety related calculations (e. g. operational reliability or availability calculations), you’ll feel no need to distinguish between “safe” and “dangerous” – it’s just a failure mode. In that case, you’ll have to define all failure modes to be “dangerous”, because only the “dangerous” values will be handed over to any higher level model.

10.4.3 Background Color

The background color can be selected separately for each component event. By default, safe failure modes have light green background, dangerous failure modes have light red background. If you select white background, these default values will be used instead.

10.5 Editing of complex components

Create a complex component by File – New Component.

In order to add or delete events, select an event with the mouse, then press a button in the tool bar or select a command in the menu bar.

Changing properties of component events is done in the properties window.

10.6 The Component Chart Window

After performing a calculation of a complex component, the temporal variation of the calculated values can be visualized in a separate window.



Figure 70: A component chart window


By default, only the dangerous values are displayed: the unreliability Fd(t), the failure rate hd(t), the failure density fd(t) and the effective mean time to dangerous failure MTTFdang(T) for a component life time T. Optionally the overall values can be displayed as well; just select the corresponding check boxes in the ‘Display’ menu. They are drawn as dashed lines.

All axes can be scaled and zoomed.

The presented graphics can be exported to a vector graphic (.svg) or a bitmap (.png). Note that in vector graphics format, the graph data is exported with original resolution, so a later printout will have a very high quality (if not reduced by the later processing).