Incident Management

From HORSE - Holistic Operational Readiness Security Evaluation.
Revision as of 16:40, 23 May 2007 by Mdpeters (talk | contribs) (→‎Investigation of incidents)

Incident Management, otherwise known as Information Security Incident Management, is a Service Level Management process area. The first goal of the incident management process is to restore normal service operation as quickly as possible and to minimize the impact on business operations, thus ensuring that the best possible levels of service quality and availability are maintained. 'Normal service operation' is defined here as service operation within Service Level Agreement (SLA) limits. Incident management is one process area within the broader ITIL environment.

Responsibilities and procedures

Management responsibilities and procedures should be established to ensure a quick, effective and orderly response to information security incidents.

Control includes:

  • Processes to ensure routine use of data from the ongoing monitoring of systems to detect events and incidents
  • Procedures specifically designed to respond to different types and severities of incident, including appropriate analysis and identification of causes, containment, communication with those actually or potentially affected by the incident, reporting of the incident to appropriate authorities, and planning and implementation of corrective action to prevent recurrence
  • Collection and use of audit trails and similar evidence as part of the incident management and investigation process, and appropriate management of this evidence for use in subsequent legal or disciplinary proceedings
  • Formal controls for recovery and remediation, including appropriate documentation of actions taken


Reporting information security events and weaknesses

This category aims to ensure that information security events and weaknesses associated with the organization's information and information system assets are communicated in a manner that allows appropriate corrective action to be taken.

Information security events should be reported through appropriate management channels as quickly as possible.

Reporting security weaknesses

All employees, contractors and third party users should be required to note and report any observed or suspected security weaknesses in systems or services as soon as possible.

Controls include:

  • Easy, accessible channels for reporting, the availability of which is clearly communicated to employees, contractors and third parties
  • Reasonable awareness on the part of employees, contractors and third parties of common signs and symptoms of security events
  • Extension of the reporting requirement to malfunctions or other anomalous events that might indicate a security weakness
  • Awareness on the part of employees, contractors and third parties that they should report a suspected security vulnerability but not attempt to test it, unless they have appropriate technical skills and an immediate response is required, since testing might be interpreted as potential misuse
  • Establishment of formal event reporting process(es) and procedure(s), setting out actions to be taken and points of contact
  • Awareness on the part of all employees, contractors and third-party users of the event-reporting process(es), including the requirement to report security events and weaknesses
  • Awareness of the requirement to report as quickly as possible, with sufficient detail to allow a timely response
  • Awareness of the prohibition on adverse action for reports made in good faith
  • Suitable feedback processes to ensure that those reporting events are appropriately notified of results


ITIL terminology defines an incident as:

"Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service". The stated ITIL objective is to "restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price".


Incidents may match existing 'Known Problems' (without a known root cause) or 'Known Errors' (with a known root cause) under the control of Problem Management and registered in the Known Error Database (KEDB). Where work-arounds have already been developed, the Service Desk can use them to provide a quick first-line fix. Where an incident is not the result of a Known Problem or Known Error, it may be an isolated occurrence, or it may (once the initial issue has been addressed) require Problem Management to become involved, possibly resulting in a new problem record being raised.
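
The matching step described above can be illustrated with a minimal Python sketch. It is not an ITIL-defined interface: the `KnownRecord` fields, the substring matching rule, the record identifiers and the sample KEDB entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KnownRecord:
    """A Problem Management record (all field names are hypothetical)."""
    record_id: str
    symptom: str                # symptom text used for matching
    root_cause: Optional[str]   # None for a Known Problem, set for a Known Error
    workaround: Optional[str]   # None if no work-around has been developed

    @property
    def kind(self) -> str:
        return "Known Error" if self.root_cause else "Known Problem"


def match_incident(description: str, kedb: List[KnownRecord]) -> Optional[KnownRecord]:
    """Return the first record whose symptom text occurs in the description."""
    text = description.lower()
    for record in kedb:
        if record.symptom.lower() in text:
            return record
    return None


def first_line_response(description: str, kedb: List[KnownRecord]) -> str:
    """Suggest what the Service Desk does next for a reported incident."""
    record = match_incident(description, kedb)
    if record is None:
        return "No match: investigate, and consider raising a problem record"
    if record.workaround:
        return f"{record.kind} {record.record_id}: apply work-around: {record.workaround}"
    return f"{record.kind} {record.record_id}: no work-around yet, escalate"


# Illustrative entries only; causes and work-arounds are invented for the sketch.
KEDB = [
    KnownRecord("KE-001", "printer not printing", "stale driver", "restart the print spooler"),
    KnownRecord("KP-002", "disk-usage threshold exceeded", None, "purge temporary files"),
]

print(first_line_response("User reports printer not printing on floor 3", KEDB))
```

A real Service Desk tool would match on structured fields rather than free text; the substring test here only stands in for that lookup.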

The main incident management processes are the following:

  • Incident detection and recording
  • Classification and initial support
  • Investigation and diagnosis
  • Resolution and recovery
  • Incident closure
  • Incident ownership, monitoring, tracking and communication
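
One way to picture how these activities fit together is a small state machine, sketched below in Python. The stage names follow the list above; the allowed transitions (including reopening a resolved incident) and the `Incident` class are assumptions, not something ITIL prescribes. Ownership and tracking are modeled as an `owner` field and a `history` log that persist across all stages.

```python
from enum import Enum


class Stage(Enum):
    DETECTED_AND_RECORDED = 1
    CLASSIFIED = 2             # classification and initial support
    UNDER_INVESTIGATION = 3    # investigation and diagnosis
    RESOLVED = 4               # resolution and recovery
    CLOSED = 5


# Assumed transition table; adjust to local procedure.
ALLOWED = {
    Stage.DETECTED_AND_RECORDED: {Stage.CLASSIFIED},
    Stage.CLASSIFIED: {Stage.UNDER_INVESTIGATION, Stage.RESOLVED},  # initial support may resolve directly
    Stage.UNDER_INVESTIGATION: {Stage.RESOLVED},
    Stage.RESOLVED: {Stage.CLOSED, Stage.UNDER_INVESTIGATION},      # reopen if the fix did not hold
    Stage.CLOSED: set(),
}


class Incident:
    def __init__(self, incident_id: str, owner: str):
        self.incident_id = incident_id
        self.owner = owner                      # ownership stays assigned throughout
        self.stage = Stage.DETECTED_AND_RECORDED
        self.history = [(self.stage, owner)]    # tracking log of (stage, actor)

    def advance(self, new_stage: Stage, actor: str) -> None:
        """Move to new_stage, rejecting transitions the table does not allow."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.name} -> {new_stage.name} is not allowed")
        self.stage = new_stage
        self.history.append((new_stage, actor))


incident = Incident("INC-1", owner="service desk")
incident.advance(Stage.CLASSIFIED, "service desk")
incident.advance(Stage.UNDER_INVESTIGATION, "technical support")
incident.advance(Stage.RESOLVED, "technical support")
incident.advance(Stage.CLOSED, "service desk")
print([stage.name for stage, _ in incident.history])
```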


Incidents should be classified as they are recorded.

Investigation of incidents

Where disciplinary or legal action may be part of the follow-up to an information security incident, any investigation should be initiated in a manner that follows documented procedures and conforms to accepted practices.

Control includes:

  • Specifying what persons or classes of person may request an investigation, and on what basis
  • Specifying what persons or classes of person may initiate an investigation process, including collection of evidence
  • Specifying the necessary documentation to initiate an investigation, and the documentation required as the investigation proceeds
  • Procedures for securing and maintaining the integrity of investigatory records
  • Observing appropriate procedures to assure "chain of custody" for any information collected


Collection of evidence

Where an investigation has been initiated as part of possible disciplinary or legal action, evidence should be collected, retained and presented in a manner that follows documented procedures and conforms to accepted practices.

Control includes:

  • Specifying who may initiate an investigation, and on what basis
  • Specifying the necessary documentation to initiate an investigation, and the documentation required as the investigation proceeds
  • Securing and maintaining the integrity of copies of paper records, including "originals" if such exist
  • Securing and maintaining the integrity of copies of electronic records or other data on computer media relevant to the incident
  • Observing appropriate procedures to assure "chain of custody" for any information collected
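
For electronic records, integrity and chain of custody are often supported by recording a cryptographic digest at collection time and logging every hand-over. The Python sketch below assumes SHA-256 as the digest and uses a hypothetical `EvidenceItem` structure; it illustrates the idea only and is not a forensic procedure or legal guidance.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of the evidence bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class EvidenceItem:
    item_id: str
    digest: str                                       # digest recorded at collection time
    custody_log: list = field(default_factory=list)   # append-only hand-over record

    def transfer(self, from_person: str, to_person: str, reason: str) -> None:
        """Record who handed the item to whom, when, and why."""
        self.custody_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "from": from_person,
            "to": to_person,
            "reason": reason,
        })

    def verify(self, data: bytes) -> bool:
        """True if the data still matches the digest taken at collection."""
        return sha256_of(data) == self.digest


original = b"2007-05-23 16:40 login failure for user alice"  # made-up sample record
item = EvidenceItem("EV-1", digest=sha256_of(original))
item.transfer("collector", "investigator", "analysis")

print(item.verify(original))                 # unchanged copy
print(item.verify(original + b" (edited)"))  # altered copy
```

In practice the digest and the custody log would themselves need to be stored where they cannot be silently altered.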


Learning from information security incidents

There should be mechanisms in place to enable the types, volumes and costs of information security incidents to be quantified and monitored.

Control includes:

  • Routine sharing of data on information security incidents among the parties responsible for receiving reports and managing investigations
  • Periodic reports summarizing the data derived from this sharing
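
A periodic summary of types, volumes and costs can be as simple as grouping incident records by type. The sketch below is one possible shape in Python; the record fields (`type`, `cost`) and the sample figures are invented for illustration.

```python
from collections import defaultdict


def summarize(incidents):
    """Return {incident type: {"count": n, "cost": total}} for a reporting period."""
    summary = defaultdict(lambda: {"count": 0, "cost": 0.0})
    for incident in incidents:
        entry = summary[incident["type"]]
        entry["count"] += 1
        entry["cost"] += incident["cost"]
    return dict(summary)


# Made-up sample data for one reporting period.
period = [
    {"type": "malware", "cost": 1200.0},
    {"type": "lost device", "cost": 800.0},
    {"type": "malware", "cost": 300.0},
]

for incident_type, totals in sorted(summarize(period).items()):
    print(f"{incident_type}: {totals['count']} incident(s), cost {totals['cost']:.2f}")
```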


Examples of Incidents by classification:

  • Application
    • service not available
    • application bug
    • disk-usage threshold exceeded
  • Hardware
    • system-down
    • automatic alert
    • printer not printing
  • Service requests
    • request for information/advice/documentation
    • forgotten password
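
Classifying an incident at the moment it is recorded could look like the following Python sketch, which reuses the example classification above as a lookup table. The keyword matching rule and the "Unclassified" fallback are assumptions; a real scheme would normally be driven by the organization's own category catalogue.

```python
# Categories and example phrases taken from the list above.
CLASSIFICATION = {
    "Application": [
        "service not available",
        "application bug",
        "disk-usage threshold exceeded",
    ],
    "Hardware": [
        "system-down",
        "automatic alert",
        "printer not printing",
    ],
    "Service requests": [
        "request for information/advice/documentation",
        "forgotten password",
    ],
}


def classify(description: str) -> str:
    """Return the first category with an example phrase found in the description."""
    text = description.lower()
    for category, examples in CLASSIFICATION.items():
        if any(example in text for example in examples):
            return category
    return "Unclassified"  # hypothetical fallback, to be triaged manually


def record_incident(description: str) -> dict:
    """Record an incident and classify it in the same step."""
    return {"description": description, "category": classify(description)}


print(record_incident("Forgotten password for the payroll system"))
```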


Incidents that cannot be resolved quickly by the help desk are assigned to specialist technical support groups. A resolution or work-around should be established as quickly as possible in order to restore the service.

Incidents are the result of failures or errors in the IT infrastructure. The cause of an incident may be apparent and may be addressed without further investigation, resulting in a repair, a work-around or a request for change (RFC) to remove the error.

Where an incident is considered serious, or multiple occurrences of similar incidents are observed, a problem record may be created as a result (the problem may not be recorded until several incidents have occurred). Managing a problem differs from managing an incident, is typically performed by different staff, and is therefore controlled by the Problem Management process. When a problem has been properly identified and a work-around is known, it becomes a 'known problem'; when its root cause has been identified, it becomes a 'known error'. Finally, a request for change (RFC) may be raised to modify the system and resolve the known error; this is covered by the Change Management process.
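
The escalation rule and the problem / known problem / known error progression described above can be sketched in Python as follows. The recurrence threshold of three, the `serious` flag and the record fields are hypothetical policy choices, not values taken from ITIL.

```python
from collections import Counter

RECURRENCE_THRESHOLD = 3  # hypothetical policy value


def symptoms_needing_problem_record(incidents, threshold=RECURRENCE_THRESHOLD):
    """Symptoms that are serious or have recurred at least `threshold` times."""
    counts = Counter(incident["symptom"] for incident in incidents)
    serious = {i["symptom"] for i in incidents if i.get("serious", False)}
    recurring = {symptom for symptom, n in counts.items() if n >= threshold}
    return sorted(serious | recurring)


def problem_status(workaround=None, root_cause=None):
    """Status of a problem record, following the progression in the text."""
    if root_cause:
        return "known error"     # root cause identified
    if workaround:
        return "known problem"   # identified, work-around known
    return "problem"


incidents = [
    {"symptom": "service not available", "serious": True},
    {"symptom": "printer not printing"},
    {"symptom": "printer not printing"},
    {"symptom": "printer not printing"},
    {"symptom": "forgotten password"},
]

print(symptoms_needing_problem_record(incidents))
print(problem_status(workaround="restart the print spooler"))
```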

A request for a new or additional service is not regarded as an incident but as a Request for Change (RFC).

References

ISO/IEC 27002:2005 13.1.1
ISO/IEC 27002:2005 13.1.2
ISO/IEC 27002:2005 13.2.1
ISO/IEC 27002:2005 13.2.2
ISO/IEC 27002:2005 13.2.3
HIPAA 164.308(a)(1)(ii)(D)
HIPAA 164.308(a)(6)

Resources

  • ISO 17799/27002 - Code of Practice for Information Security Management.
