Incident Management:: Difference between revisions

From HORSE - Holistic Operational Readiness Security Evaluation.
(New page: ==Incident Management== '''Incident Management (ITSM)''' is an IT Service Management process area. The first goal of the incident management process is to restore a normal service ope...)
 
No edit summary
Line 1: Line 1:
==Incident Management==
=='''Incident Management==


'''Incident Management (ITSM)''' is an [[IT Service Management]] process area. The first goal of the incident management process is to restore a normal service operation as quickly as possible and to minimize the impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. 'Normal service operation' is defined here as service operation within [[Service Level Agreement]] (SLA). It is one process area within the broader [[ITIL]] environment.
'''Incident Management (ITSM)''' is a [[Service_Level_Management: | Service Level Management]] process area. The first goal of the incident management process is to restore a normal service operation as quickly as possible and to minimize the impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. 'Normal service operation' is defined here as service operation within [[Service Level Agreement]] (SLA). It is one process area within the broader [[ITIL]] environment.<br>
 
<br>
[[ITIL]] terminology defines an ''incident'' as:
'''[[ITIL]] terminology defines an ''incident'' as:'''<br>
:''Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service''. The stated ITIL objective is to ''restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price''
<br>
 
:"Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service". The stated ITIL objective is to "restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price".
Incidents may match with existing 'Known Problems' (without a known root cause) or 'Known Errors' (with a root cause) under the control of Problem Management and registered in the Known Error Database (KeDB). Where existing work-arounds have been developed, it is suggested that accessing these will allow the Service Desk to provide a quick first-line fix. Where an incident is not the result of a Known Problem or Known Error, it may either be an isolated or individual occurrences or may (once the initial issue has been addressed) require that Problem Management become involved, possibly resulting in a new problem record being raised.
<br>
 
Incidents may match with existing 'Known Problems' (without a known root cause) or 'Known Errors' (with a root cause) under the control of Problem Management and registered in the Known Error Database (KeDB.) Where existing work-arounds have been developed, it is suggested that accessing these will allow the Service Desk to provide a quick first-line fix. Where an incident is not the result of a Known Problem or Known Error, it may either be an isolated or individual occurrences or may (once the initial issue has been addressed) require that Problem Management become involved, possibly resulting in a new problem record being raised.<br>
The main incident management processes are the following:
<br>
'''The main incident management processes are the following:'''<br>
<br>
*Incident detection and recording
*Incident detection and recording
*Classification and initial support
*Classification and initial support
Line 15: Line 17:
*Incident closure
*Incident closure
*Incident ownership, monitoring, tracking and communication
*Incident ownership, monitoring, tracking and communication
 
<br>
Incidents should be classified as they are recorded, Examples of Incidents by classification:
'''Incidents should be classified as they are recorded.'''<br>
 
<br>
'''Examples of Incidents by classification:'''<br>
<br>
* Application
* Application
** service not available
** service not available
** application bug
** application bug
** disk-usage threshold exceeded
** disk-usage threshold exceeded
* Hardware
* Hardware
** system-down
** system-down
** automatic alert
** automatic alert
** printer not printing
** printer not printing
* Service requests
* Service requests
** request for information/advice/documentation
** request for information/advice/documentation
** forgotten password
** forgotten password
 
<br>
The incidents that cannot be resolved quickly by the Help desk will be assigned to specialist Technical Support groups. A resolution or work-around should be established as quickly as possible in order to restore the service.
The incidents that cannot be resolved quickly by the Help desk will be assigned to specialist Technical Support groups. A resolution or work-around should be established as quickly as possible in order to restore the service.<br>
 
<br>
Incidents are the result of failures or errors in the IT infrastructure . The cause of Incidents may be apparent and the cause may be addressed without the need for further investigation, resulting in a repair, a Work-around or a request for change (RFC) to remove the error.  
Incidents are the result of failures or errors in the IT infrastructure . The cause of Incidents may be apparent and the cause may be addressed without the need for further investigation, resulting in a repair, a Work-around or a request for change (RFC) to remove the error.<br>
 
<br>
Where an incident is considered to be serious in nature, or multiple occurrences of similar incidents are observed, a problem record might be created as a result but it is certainly possible that the Problem will not be recorded until several incidents have occurred. The management of a problem varies from the process of managing an incident and is typically performed by different staff and therefore is controlled by the problem management process. When a problem has been properly identified and a work-around is known, the problem becomes a 'known problem', when its 'root cause' has been identified, it becomes a 'known error'. Finally a request for change (RFC) may be raised to modify the system by resolving the known error, this process is covered by the Change Management (ITSM) process.  
Where an incident is considered to be serious in nature, or multiple occurrences of similar incidents are observed, a problem record might be created as a result (it's possible that the problem will not be recorded until several incidents have occurred). The management of a problem varies from the process of managing an incident and is typically performed by different staff and therefore is controlled by the problem management process. When a problem has been properly identified and a work-around is known, the problem becomes a 'known problem', when its 'root cause' has been identified, it becomes a 'known error'. Finally a request for change (RFC) may be raised to modify the system by resolving the known error, this process is covered by the [[Change_Management:|Change Management]] process.<br>
 
<br>
A request for new additional service is not regarded as an incident but as a Request for Change (RFC).
'''A request for new additional service is not regarded as an incident but as a Request for Change (RFC).'''<br>
<br>


==References==
==References==
Line 45: Line 48:


[[Category:Information technology management]]
[[Category:Information technology management]]
[[Category:Standards]]
[[Category:Method engineering]]

Revision as of 17:03, 22 March 2007

Incident Management

Incident Management (ITSM) is a Service Level Management process area. The first goal of the incident management process is to restore normal service operation as quickly as possible and to minimize the impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. 'Normal service operation' is defined here as service operation within the limits of the Service Level Agreement (SLA). Incident Management is one process area within the broader ITIL environment.

ITIL terminology defines an incident as:

"Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service". The stated ITIL objective is to "restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price".


Incidents may match existing 'Known Problems' (without a known root cause) or 'Known Errors' (with a root cause), which are under the control of Problem Management and registered in the Known Error Database (KeDB). Where work-arounds have already been developed, accessing them allows the Service Desk to provide a quick first-line fix. Where an incident is not the result of a Known Problem or Known Error, it may be an isolated occurrence, or it may (once the initial issue has been addressed) require Problem Management to become involved, possibly resulting in a new problem record being raised.
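
As a rough illustration of the matching step described above, the sketch below looks up an incident's symptom in a small in-memory list standing in for the KeDB. The names KnownRecord, find_match and first_line_response, the field names, and the sample entries are all hypothetical; they are not part of ITIL or of any particular service desk tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KnownRecord:
    """Hypothetical KeDB entry; the field names are assumptions of this sketch."""
    symptom: str
    workaround: Optional[str]   # None if no work-around has been developed
    root_cause: Optional[str]   # None for a Known Problem, set for a Known Error

    @property
    def kind(self) -> str:
        return "Known Error" if self.root_cause else "Known Problem"


# Illustrative sample data only.
KEDB = [
    KnownRecord("printer not printing", "restart the print spooler", None),
    KnownRecord("service not available", "fail over to standby node",
                "memory leak in release 2.1"),
]


def find_match(symptom: str) -> Optional[KnownRecord]:
    """Return the first record with the same symptom, ignoring case and outer whitespace."""
    wanted = symptom.strip().lower()
    for record in KEDB:
        if record.symptom == wanted:
            return record
    return None


def first_line_response(symptom: str) -> str:
    """Suggest what the Service Desk could do for a reported symptom."""
    record = find_match(symptom)
    if record is None:
        return "no match: investigate; Problem Management may need to raise a new problem record"
    if record.workaround:
        return f"{record.kind}: apply work-around '{record.workaround}'"
    return f"{record.kind}: no work-around recorded yet"
```

A real KeDB would match on far richer data than an exact symptom string; the exact-match lookup is used here only to keep the example short.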

The main incident management processes are the following:

  • Incident detection and recording
  • Classification and initial support
  • Investigation and diagnosis
  • Resolution and recovery
  • Incident closure
  • Incident ownership, monitoring, tracking and communication
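
The processes listed above can be sketched as a simple ordered lifecycle. This is a minimal sketch, not a definitive model: it assumes the first five processes run in sequence, and it treats ownership, monitoring, tracking and communication as an ongoing activity represented by an owner and a log. The names Stage, next_stage and Incident are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    """The five sequential processes, in the order listed above."""
    DETECTION_AND_RECORDING = 1
    CLASSIFICATION_AND_INITIAL_SUPPORT = 2
    INVESTIGATION_AND_DIAGNOSIS = 3
    RESOLUTION_AND_RECOVERY = 4
    CLOSURE = 5


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows, or raise if the incident is closed."""
    if stage is Stage.CLOSURE:
        raise ValueError("incident is already closed")
    return Stage(stage.value + 1)


class Incident:
    """Hypothetical incident record with an owner and a tracking log."""

    def __init__(self, summary: str, owner: str) -> None:
        self.summary = summary
        self.owner = owner  # ownership stays with one party throughout
        self.stage = Stage.DETECTION_AND_RECORDING
        self.log = [f"recorded by {owner}"]

    def advance(self, note: str) -> None:
        """Move to the next stage and record a note for tracking."""
        self.stage = next_stage(self.stage)
        self.log.append(f"{self.stage.name}: {note}")
```

For example, creating Incident("printer not printing", "service desk") and calling advance four times takes the record from detection through to closure, with one log entry per step.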


Incidents should be classified as they are recorded.

Examples of Incidents by classification:

  • Application
    • service not available
    • application bug
    • disk-usage threshold exceeded
  • Hardware
    • system-down
    • automatic alert
    • printer not printing
  • Service requests
    • request for information/advice/documentation
    • forgotten password
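
The classification examples above could be held in a simple lookup table, as in the sketch below. The dictionary mirrors the list in this article; the classify function and the "Unclassified" fallback are assumptions made for illustration.

```python
# Categories and example incidents, copied from the list above.
CLASSIFICATION = {
    "Application": [
        "service not available",
        "application bug",
        "disk-usage threshold exceeded",
    ],
    "Hardware": [
        "system-down",
        "automatic alert",
        "printer not printing",
    ],
    "Service requests": [
        "request for information/advice/documentation",
        "forgotten password",
    ],
}


def classify(description: str) -> str:
    """Return the category whose examples include the description.

    Matching is exact apart from case and outer whitespace; anything
    else falls back to "Unclassified" (a hypothetical default).
    """
    text = description.strip().lower()
    for category, examples in CLASSIFICATION.items():
        if text in examples:
            return category
    return "Unclassified"
```

In practice the classification would be chosen by the person recording the incident rather than derived from free text; the lookup only shows how the categories and their examples relate.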


Incidents that cannot be resolved quickly by the Help desk are assigned to specialist technical support groups. A resolution or work-around should be established as quickly as possible in order to restore the service.

Incidents are the result of failures or errors in the IT infrastructure. The cause of an incident may be apparent, in which case it can be addressed without further investigation, resulting in a repair, a work-around or a request for change (RFC) to remove the error.

Where an incident is considered serious, or multiple occurrences of similar incidents are observed, a problem record may be created (although the problem may not be recorded until several incidents have occurred). Managing a problem differs from managing an incident; it is typically performed by different staff and is controlled by the Problem Management process. When a problem has been properly identified and a work-around is known, it becomes a 'known problem'; when its root cause has been identified, it becomes a 'known error'. Finally, a request for change (RFC) may be raised to modify the system and resolve the known error; this is covered by the Change Management process.
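
The progression from problem to known problem to known error described above can be sketched as a status derived from what has been recorded so far. The ProblemRecord class and its fields are hypothetical, and restricting raise_rfc to known errors is an assumption of this sketch rather than a rule stated by ITIL.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProblemRecord:
    """Hypothetical problem record linked to one or more incidents."""
    title: str
    incident_ids: List[str] = field(default_factory=list)
    workaround: Optional[str] = None
    root_cause: Optional[str] = None

    @property
    def status(self) -> str:
        """Derive the status from what is known about the problem."""
        if self.root_cause is not None:
            return "known error"
        if self.workaround is not None:
            return "known problem"
        return "problem"

    def raise_rfc(self) -> str:
        """Return a request-for-change summary once the root cause is known."""
        if self.status != "known error":
            raise ValueError("root cause not yet identified")
        return f"RFC: resolve '{self.root_cause}' ({self.title})"
```

Deriving the status from the recorded fields, instead of storing it separately, means the record cannot claim to be a known error without a root cause.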

A request for a new or additional service is not regarded as an incident but as a Request for Change (RFC).
