11.8. Resource Outages
Resources can become unavailable for various reasons: machines may fail or require scheduled maintenance, and nurses or doctors may get sick, be off duty, or be off shift. Consequently, both planned and unexpected resource outages may have to be modeled.
While human resources being off duty or off shift can be modeled with the help of service time calendars, scheduled maintenance may be modeled in the form of maintenance activities recurring at a fixed interval. Unexpected resource outages are treated as resource failures.
Modeling Resource Failures
Resource failures can be modeled with the help of two types of events, failure and recovery, and two related random variables, time-to-next-failure and failure time. In this approach, the next resource failure event occurs x time units after a recovery event, where x is obtained by invoking the random variable sampling function timeToNextFailure(). Each failure event triggers a recovery event, which is scheduled with a delay of y time units, where y is obtained by invoking the random variable sampling function failureTime().
This simple approach applies both to non-human and human resources.
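The failure/recovery cycle described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation of any particular simulator: the function name simulate_failures, its parameter names, and the sample distributions are assumptions made for this example, and the first failure is simply scheduled with the time-to-next-failure function.

```python
import heapq
import random

def simulate_failures(time_to_next_failure, failure_time, end_time):
    """Return the (time, event type) pairs processed up to end_time."""
    # Future Event List as a min-heap; the first failure is scheduled with
    # time_to_next_failure() (an assumption of this sketch).
    fel = [(time_to_next_failure(), "failure")]
    log = []
    while fel:
        now, kind = heapq.heappop(fel)
        if now > end_time:
            break
        log.append((now, kind))
        if kind == "failure":
            # Each failure triggers a recovery after y = failure_time() units.
            heapq.heappush(fel, (now + failure_time(), "recovery"))
        else:
            # The next failure occurs x = time_to_next_failure() units
            # after the recovery.
            heapq.heappush(fel, (now + time_to_next_failure(), "failure"))
    return log

# Example run with assumed distributions (hypothetical parameter values).
rng = random.Random(42)
log = simulate_failures(
    time_to_next_failure=lambda: rng.expovariate(1 / 100),  # mean 100
    failure_time=lambda: rng.uniform(5, 15),
    end_time=1000,
)
for time, kind in log:
    print(f"{time:8.2f}  {kind}")
```

Passing the two sampling functions as parameters keeps the event logic independent of the chosen distributions, so the same loop can be used for human and non-human resources.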
Further modeling options:
- Allow specifying a time-to-first-failure, in addition to the time-to-next-failure.
- Allow counting only the busy time of a resource, instead of the total time, for determining its next failure time.
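The second option, counting only busy time, can be illustrated with a small helper that converts a sampled amount of busy time into the simulation time at which the failure would occur. This is a sketch under simplifying assumptions: the function name failure_instant is hypothetical, and the busy intervals are assumed to be known in advance, sorted, and non-overlapping, whereas a simulator would accumulate busy time as activities start and end.

```python
def failure_instant(busy_intervals, busy_time_to_failure):
    """Return the time at which the accumulated busy time reaches the
    given budget, or None if the intervals do not contain enough busy time.

    busy_intervals: sorted, non-overlapping (start, end) pairs.
    """
    remaining = busy_time_to_failure
    for start, end in busy_intervals:
        if end - start >= remaining:
            # The budget is used up within this busy interval.
            return start + remaining
        remaining -= end - start
    return None

# The resource is busy during [0, 4], [10, 15] and [20, 30].
intervals = [(0, 4), (10, 15), (20, 30)]
print(failure_instant(intervals, 7))    # 4 units in the first interval, 3 more in the second
print(failure_instant(intervals, 100))  # not enough busy time available
```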
Modeling the Repair of Failed Resources
In activity-based simulation, the basic failure modeling approach can be refined by modeling the repair of a failed non-human resource as an activity that requires a repair person as a resource and whose duration is given by the random variable repair time. The total failure time is then the sum of two random variables: repair lead time (the time needed for getting a repair person to start the repair) and repair time.
Then, when a failure event occurs, it triggers a repair activity to start with a delay of x time units and a duration of y time units, where x = repairLeadtime() and y = repairTime(). The next failure event is scheduled to occur z time units after a repair end event, where z is obtained by invoking the random variable sampling function timeToNextFailure().
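The refined cycle (failure, repair start, repair end, next failure) can be sketched as follows. The sketch makes several assumptions: the names simulate_repairs and its parameters are hypothetical, there is a single resource, and a repair person is taken to be always available, so that the repair lead time is the only delay before the repair starts.

```python
import heapq

def simulate_repairs(time_to_next_failure, repair_lead_time, repair_time, end_time):
    """Return the processed (time, event type) pairs and the total downtime."""
    fel = [(time_to_next_failure(), "failure")]  # Future Event List (min-heap)
    log, downtime, failed_at = [], 0.0, None
    while fel:
        now, kind = heapq.heappop(fel)
        if now > end_time:
            break
        log.append((now, kind))
        if kind == "failure":
            failed_at = now
            # The repair activity starts after the repair lead time x.
            heapq.heappush(fel, (now + repair_lead_time(), "repair start"))
        elif kind == "repair start":
            # The repair activity lasts for the repair time y.
            heapq.heappush(fel, (now + repair_time(), "repair end"))
        else:  # "repair end"
            # Total failure time = repair lead time + repair time.
            downtime += now - failed_at
            # The next failure occurs z time units after the repair end.
            heapq.heappush(fel, (now + time_to_next_failure(), "failure"))
    return log, downtime

# Deterministic values chosen only to make the trace easy to follow.
log, downtime = simulate_repairs(lambda: 20, lambda: 3, lambda: 5, end_time=60)
for time, kind in log:
    print(time, kind)
print("total downtime:", downtime)
```

With these fixed values, each outage lasts 3 + 5 = 8 time units, which is the sum of the repair lead time and the repair time.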
Modeling the Scheduled Maintenance of Resources
Non-human resources, such as machines, may undergo periodic maintenance to prevent them from failing in the near future. This can be modeled by scheduling periodic maintenance activities every x time units, where x = maintenanceIntervall(), along with recurring failure events.
When a maintenance activity starts before the next scheduled failure event has occurred, the simulator has to retract this event from the Future Event List, since the maintenance activity prevents the failure. The next failure event is scheduled to occur x time units after the maintenance end event, where x is obtained by invoking the random variable sampling function timeToNextFailure().
When a resource fails before its next scheduled maintenance has started, the scheduled maintenance is cancelled (the simulator has to retract the scheduled maintenance start event) and the failure event triggers a repair activity instead. Only when the repair activity ends is the next maintenance activity scheduled, with a delay of y time units, where y is obtained by invoking the random variable sampling function maintenanceIntervall().
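The interplay of maintenance and failure events, including the retraction of events from the Future Event List, can be sketched as follows. All names here (FutureEventList, simulate_outages, and the parameter names) are hypothetical. The sketch assumes a single resource, folds the repair lead time into a single repair time, and schedules the next maintenance one maintenance interval after the end of a maintenance or repair activity. Retraction is implemented by marking an entry as invalid and skipping it later, which is one of several possible techniques.

```python
import heapq
import itertools

class FutureEventList:
    """Min-heap of events that supports retraction by lazy deletion."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times

    def schedule(self, time, kind):
        entry = [time, next(self._counter), kind, True]  # True = still valid
        heapq.heappush(self._heap, entry)
        return entry

    def retract(self, entry):
        entry[3] = False  # the entry is skipped when it reaches the top

    def pop(self):
        while self._heap:
            time, _, kind, valid = heapq.heappop(self._heap)
            if valid:
                return time, kind
        return None

def simulate_outages(time_to_next_failure, repair_time,
                     maintenance_interval, maintenance_time, end_time):
    fel = FutureEventList()
    next_failure = fel.schedule(time_to_next_failure(), "failure")
    next_maintenance = fel.schedule(maintenance_interval(), "maintenance start")
    log = []
    while (event := fel.pop()) is not None:
        now, kind = event
        if now > end_time:
            break
        log.append((now, kind))
        if kind == "maintenance start":
            fel.retract(next_failure)  # maintenance prevents the pending failure
            fel.schedule(now + maintenance_time(), "maintenance end")
        elif kind == "failure":
            fel.retract(next_maintenance)  # the failure cancels the maintenance
            fel.schedule(now + repair_time(), "repair end")
        else:  # "maintenance end" or "repair end"
            next_failure = fel.schedule(now + time_to_next_failure(), "failure")
            next_maintenance = fel.schedule(
                now + maintenance_interval(), "maintenance start")
    return log

# Maintenance every 30 time units always pre-empts a failure due after 50.
for time, kind in simulate_outages(lambda: 50, lambda: 10,
                                   lambda: 30, lambda: 5, end_time=90):
    print(time, kind)
```

Lazy deletion avoids searching the heap for the event to be removed. A simulator that keeps its Future Event List in a sorted list could instead remove the entry directly.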
Further modeling options:
- Allow specifying a time-to-first-maintenance, in addition to the maintenance interval (time-to-next-maintenance).
- Allow specifying the recurrence of maintenance activities with a schedule instead of a (typically fixed) maintenance interval.
Defining General Elements for Modeling Resource Outages
A simulation framework/language should support resource outage modeling by allowing modelers to define, for each (non-human) resource object type, the three pairs of (typically random variable) time functions introduced above and summarized in the following class diagram.
These pairs of time functions have to be used incrementally: specifying maintenance time functions requires specifying repair time functions, which, in turn, requires specifying failure time functions.
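The incremental dependency between the three pairs of time functions can be made explicit in a small data class. This is only a sketch of one possible representation: the class name ResourceOutageModel, its field names, and the choice to raise ValueError are assumptions of this example and are not taken from a specific framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional

TimeFunction = Callable[[], float]

@dataclass
class ResourceOutageModel:
    """Hypothetical container for the three pairs of time functions."""
    time_to_next_failure: Optional[TimeFunction] = None
    failure_time: Optional[TimeFunction] = None
    repair_lead_time: Optional[TimeFunction] = None
    repair_time: Optional[TimeFunction] = None
    maintenance_interval: Optional[TimeFunction] = None
    maintenance_time: Optional[TimeFunction] = None

    def __post_init__(self):
        failure = self._pair_given(
            "failure", self.time_to_next_failure, self.failure_time)
        repair = self._pair_given(
            "repair", self.repair_lead_time, self.repair_time)
        maintenance = self._pair_given(
            "maintenance", self.maintenance_interval, self.maintenance_time)
        # The pairs have to be used incrementally.
        if repair and not failure:
            raise ValueError("repair time functions require failure time functions")
        if maintenance and not repair:
            raise ValueError("maintenance time functions require repair time functions")

    @staticmethod
    def _pair_given(name, first, second):
        # Both functions of a pair must be given, or neither.
        if (first is None) != (second is None):
            raise ValueError(f"{name} time functions must be specified as a pair")
        return first is not None

# A resource type with failures and repairs, but without maintenance
# (the constant values are placeholders for random variable samplers).
model = ResourceOutageModel(
    time_to_next_failure=lambda: 100.0, failure_time=lambda: 8.0,
    repair_lead_time=lambda: 3.0, repair_time=lambda: 5.0,
)
```

Validating at construction time means that an inconsistent specification, such as maintenance functions without repair functions, is reported before a simulation run starts.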
Resource Outage Modeling in AnyLogic
AnyLogic is a state-of-the-art DES modeling tool/framework. It allows defining recurrent failures (not with follow-up recovery events, but only with follow-up repair activities) and maintenance for each Resource Pool. There is an option to use only the busy time of a resource for computing its next failure time. For repair activities, it is not possible to distinguish between repair lead time and repair time. For maintenance activities, it can be defined which task priority they have and whether they are preemptive. Performer resources cannot be assigned to either repair or maintenance activities.