Problem Management:

From HORSE - Holistic Operational Readiness Security Evaluation.

The goal of Problem Management is to resolve the root cause of incidents, and thus to minimize the adverse impact on the business of incidents and problems caused by errors within the IT infrastructure, and to prevent the recurrence of incidents related to these errors.

A 'problem' is an unknown underlying cause of one or more incidents, and a 'known error' is a problem that has been successfully diagnosed and for which a work-around has been identified. The CCTA defines problems and known errors as follows:

A problem is a condition often identified as a result of multiple Incidents that exhibit common symptoms. Problems can also be identified from a single significant Incident, indicative of a single error, for which the cause is unknown, but for which the impact is significant.
A known error is a condition identified by successful diagnosis of the root cause of a problem, and the subsequent development of a Work-around.
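
The distinction between the two definitions can be made concrete with a small data model. The following is only a minimal sketch in Python, not a CCTA-defined schema: the class name `ProblemRecord`, its fields, the `Status` values, and the `promote_to_known_error` method are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PROBLEM = "problem"          # underlying cause still unknown
    KNOWN_ERROR = "known error"  # root cause diagnosed, work-around identified


@dataclass
class ProblemRecord:
    """Hypothetical record; the field names are illustrative assumptions."""
    summary: str
    incident_ids: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    status: Status = Status.PROBLEM

    def promote_to_known_error(self, root_cause: str, workaround: str) -> None:
        # A known error requires both a diagnosed root cause and a work-around.
        if not root_cause or not workaround:
            raise ValueError("a known error needs a root cause and a work-around")
        self.root_cause = root_cause
        self.workaround = workaround
        self.status = Status.KNOWN_ERROR


# Example data is invented for the sketch.
record = ProblemRecord("Mail delivery delays", incident_ids=["INC-1", "INC-2"])
record.promote_to_known_error("Full spool partition", "Purge the spool directory")
print(record.status.value)  # prints "known error"
```

In this sketch a record cannot become a known error unless both the root cause and the work-around are supplied, which mirrors the two conditions in the definition above.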

Problem management is different from incident management. The principal purpose of problem management is to find and resolve the root cause of a problem and to prevent incidents; the purpose of incident management is to return the service to its normal level as soon as possible, with the smallest possible business impact.

The problem management process is intended to reduce the number and severity of incidents and problems affecting the business, and to document its findings so that they are available to the first-line and second-line help desk. The proactive process identifies and resolves problems before incidents occur.

The activities of the proactive process are:

  • Trend analysis;
  • Targeting support action;
  • Providing information to the organization.
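
Trend analysis, the first activity in the list above, can be illustrated with a very small sketch: count incidents per category and flag the categories that recur often enough to suggest an underlying problem. The function name `find_problem_candidates`, the `(incident_id, category)` pair layout, and the threshold of 3 are assumptions made for this example, not part of any defined process.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def find_problem_candidates(
    incidents: Iterable[Tuple[str, str]], threshold: int = 3
) -> Dict[str, int]:
    """Return categories whose incident count reaches the threshold.

    Each incident is an (incident_id, category) pair. Both the pair layout
    and the default threshold are assumptions made for this sketch.
    """
    counts = Counter(category for _, category in incidents)
    return {cat: n for cat, n in counts.items() if n >= threshold}


# Invented sample data.
incidents = [
    ("INC-1", "email"),
    ("INC-2", "email"),
    ("INC-3", "printing"),
    ("INC-4", "email"),
    ("INC-5", "vpn"),
]
candidates = find_problem_candidates(incidents)
print(candidates)  # {'email': 3}
```

A real trend analysis would usually also consider time windows and severity; the sketch only shows the basic idea of grouping incidents that exhibit common symptoms.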

The Error Control Process is an iterative process to diagnose known errors until they are eliminated by the successful implementation of a change under the control of the Change Management process.
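
The iterative nature of error control can be sketched as a loop that keeps a known error open until a change has been implemented successfully. This is a simplified illustration under stated assumptions: `KnownError`, `run_error_control`, and the `implement_change` callback (a stand-in for the Change Management process) are hypothetical names, not part of any real tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class KnownError:
    """Hypothetical known-error record for this sketch."""
    summary: str
    attempted_changes: List[str] = field(default_factory=list)
    eliminated: bool = False


def run_error_control(
    error: KnownError,
    proposed_changes: List[str],
    implement_change: Callable[[str], bool],
) -> KnownError:
    """Try proposed changes in order until one is implemented successfully.

    `implement_change` stands in for the Change Management process and
    returns True when the change was implemented successfully.
    """
    for change in proposed_changes:
        error.attempted_changes.append(change)
        if implement_change(change):
            error.eliminated = True
            break
    return error


# Stand-in for Change Management: only the second change succeeds.
def fake_change_management(change: str) -> bool:
    return change == "Enlarge spool partition"


error = KnownError("Mail delivery delays")
run_error_control(
    error,
    ["Restart mail service", "Enlarge spool partition", "Replace mail server"],
    fake_change_management,
)
print(error.eliminated, error.attempted_changes)
```

The known error is marked as eliminated only after a change succeeds; if every proposed change fails, it stays open for a further iteration.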

The Problem Control Process aims to handle problems in an efficient way. Problem control identifies the root cause of incidents and reports it to the service desk. Other activities are:

  • Problem identification and recording;
  • Problem classification;
  • Problem investigation and diagnosis.

A standard technique for identifying the root cause of a problem is the "fishbone diagram", also referred to as a cause-and-effect diagram or tree diagram. It is typically the result of a brainstorming session in which members of a group offer ideas to improve a product. For problem-solving, the goal is to find the cause and effect of the problem.

Fishbone diagrams can be defined in a meta-model.

First there is the main subject, which forms the backbone of the diagram: it is what we try to solve or improve. The main subject is derived from a cause. The relationship between a cause and an effect runs in both directions: an effect is the result of a cause, and a cause is the root of an effect. The relationship is also many-to-many: one effect can have several causes, and one cause can have several effects.
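
One way to picture such a meta-model is as a many-to-many relation between causes and effects that can be followed in either direction. The Python sketch below is one possible reading of the description above, not the meta-model itself: the class `CauseEffectModel` and its methods `link`, `causes_of`, `effects_of`, and `root_causes` are hypothetical names, and the sample data is invented.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class CauseEffectModel:
    """Many-to-many relation between causes and effects (illustrative names)."""

    def __init__(self) -> None:
        self._effects_of: DefaultDict[str, Set[str]] = defaultdict(set)
        self._causes_of: DefaultDict[str, Set[str]] = defaultdict(set)

    def link(self, cause: str, effect: str) -> None:
        # Record the relation in both directions.
        self._effects_of[cause].add(effect)
        self._causes_of[effect].add(cause)

    def causes_of(self, effect: str) -> Set[str]:
        return set(self._causes_of.get(effect, set()))

    def effects_of(self, cause: str) -> Set[str]:
        return set(self._effects_of.get(cause, set()))

    def root_causes(self, effect: str) -> Set[str]:
        """Follow cause links upward; return causes that have no cause themselves."""
        roots: Set[str] = set()
        seen: Set[str] = set()
        stack = [effect]
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # guards against cycles in the recorded links
            seen.add(node)
            parents = self._causes_of.get(node, set())
            if not parents and node != effect:
                roots.add(node)
            stack.extend(parents)
        return roots


model = CauseEffectModel()
model.link("Full spool partition", "Mail delivery delays")
model.link("Overloaded mail server", "Mail delivery delays")
model.link("Log rotation disabled", "Full spool partition")
model.link("Log rotation disabled", "Slow backups")

print(sorted(model.causes_of("Mail delivery delays")))
print(sorted(model.effects_of("Log rotation disabled")))
print(sorted(model.root_causes("Mail delivery delays")))
```

Here the main subject ("Mail delivery delays") has two direct causes, while one cause ("Log rotation disabled") contributes to two different effects, which is the many-to-many situation described above.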