MSP Security Services: The Importance of Incident Response
To get started in security services, there are two processes MSPs should develop before any others: protective monitoring and incident response. Both are coordinated through your security operations centre (SOC) and should be coupled to make them as effective as possible.
In this blog we’re going to explore how incident response can be integrated into your SOC monitoring service.
Incident Response 101 – understand the requirements
Figure 1 – Incident Management Process
If you are an MSP you will already provide incident management, even if it’s not specifically related to or labelled security incident management; it is a basic process in the IT Infrastructure Library (ITIL), defined as the process for detecting, logging, recording and resolving incidents. It doesn’t sound too different to managing a security incident. However, there are a few subtle differences you need to be aware of before launching a security incident management service.
First, the purpose of IT incident management is to restore services to an acceptable level (usually defined by the customer in a service level agreement – SLA) as quickly as possible. Sometimes, workarounds or temporary solutions are deployed until a more permanent solution can be found. ITIL’s official definition of an incident is, “An unplanned interruption to an IT service or reduction in the quality of an IT service.”
The problem with this definition, and with ITIL’s general approach, is that security incidents sometimes need different handling. Restoring a server or switching a gateway back on can make the incident worse (for example, reinfection from ransomware or loss of data to an external attacker), or it can destroy vital evidence needed to determine what happened.
Security incident management therefore requires a better definition to ensure the outcomes are understood by all customers. The security incident management process identifies, manages and records the presence of cyber threats, then analyses and responds to those threats with the intent of reducing harm to the victim.
Security incidents can be anything from an active attack happening right now on your customer’s network, to an intrusion attempt or a virus outbreak after an unsuspecting user opens a malicious email attachment. Corporate policy violations, unauthorised access to health records, financial data, social security numbers or intellectual property, and theft of personally identifiable information (PII) are all examples of security incidents that cause harm to the victims (be they in your customer’s organisation or that customer’s own customers outside it).
Security Incident Management – define the roles & responsibilities
Security incident management needs disciplined and repeatable procedures to ensure the way you handle incidents is consistent and aligned with the expectations of your customers. You should start by agreeing who is responsible for what aspects of the process, typically visualised in a RACI chart.
Figure 2 – The four elements of a RACI chart
These RACI parameters are attributed to each of the key roles in your stakeholder matrix:
- Responsible – the role that undertakes the activity;
- Accountable – the role that owns the activity and is accountable for it to the executive;
- Consulted – one of several roles that may have information relevant to the activity;
- Informed – one of several roles that should be informed of progress and outcomes.
Beyond the core roles described below, the remaining roles can be defined based on specific customer requirements and the structure of your SOC team.
The incident manager is the person in your team who oversees and prioritises the activities of the incident response team throughout the entire incident lifecycle: detection, analysis, containment and post-incident review. Some teams also have an incident coordinator who handles the project coordination of the response: arranging meetings, taking detailed notes and ensuring communications are timely, accurate and kept flowing.
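A RACI chart can also be kept as data, which makes it easy to check for gaps. The following is a minimal Python sketch, not a prescribed format: the activities, role names and the `validate_raci` helper are all hypothetical, and the rule that each activity has exactly one Accountable role is a common RACI convention assumed here rather than something your customers will necessarily require.

```python
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


# Hypothetical activities and roles, for illustration only.
RACI_CHART = {
    "Triage alert": {
        "Security analyst": Raci.RESPONSIBLE,
        "Incident manager": Raci.ACCOUNTABLE,
        "Subject matter expert": Raci.CONSULTED,
        "Customer legal counsel": Raci.INFORMED,
    },
    "Contain incident": {
        "Security analyst": Raci.RESPONSIBLE,
        "Incident manager": Raci.ACCOUNTABLE,
        "Subject matter expert": Raci.CONSULTED,
        "Customer executive": Raci.INFORMED,
    },
}


def validate_raci(chart):
    """Return a list of problems; an empty list means the chart passes."""
    problems = []
    for activity, assignments in chart.items():
        accountable = [r for r, v in assignments.items() if v is Raci.ACCOUNTABLE]
        responsible = [r for r, v in assignments.items() if v is Raci.RESPONSIBLE]
        # Assumed convention: exactly one owner per activity.
        if len(accountable) != 1:
            problems.append(
                f"{activity}: expected exactly 1 Accountable role, "
                f"found {len(accountable)}"
            )
        # Someone has to actually do the work.
        if not responsible:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems
```

Running `validate_raci(RACI_CHART)` on the chart above returns an empty list. Removing the Accountable entry from an activity would produce a problem message for that activity.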
Subject Matter Expertise and Security Engineering
Every incident response team also has a group of subject matter experts and security analysts who undertake activities under guidance from the incident manager. Subject matter experts are specialists in specific technologies or business areas, such as database administrators or healthcare systems specialists, while security analysts are the SOC staff who run the tools, such as your Security Information and Event Management (SIEM) system and vulnerability management systems.
Legal and Business
Organisations will usually require that their internal legal counsel is informed during the incident management process, especially when the incident involves stolen data such as PII or intellectual property. Additional escalations may be necessary, and certain business executives need to be informed when the incident is deemed serious enough.
An escalation matrix (how you decide on the severity of the incident) determines when you escalate to each level of management, all the way up to the CEO and the board of directors. It also determines when you must inform third parties, such as the privacy commissioner, law enforcement, the victims and the media.
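One way to picture an escalation matrix is as a lookup from severity to the parties who must be told. The sketch below is illustrative only: the four severity levels, the thresholds attached to each party and the `parties_to_notify` function are assumptions, and real thresholds come from each customer’s contract and the regulations that apply to them.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical thresholds: a party is notified when the incident
# severity is at or above the level listed beside it.
ESCALATION_MATRIX = [
    (Severity.LOW, "Incident manager"),
    (Severity.MEDIUM, "Customer IT management"),
    (Severity.HIGH, "Customer legal counsel"),
    (Severity.HIGH, "Customer executives"),
    (Severity.CRITICAL, "CEO and board of directors"),
]


def parties_to_notify(severity, personal_data_involved=False):
    """List who should be notified for an incident of this severity."""
    parties = [
        party for threshold, party in ESCALATION_MATRIX if severity >= threshold
    ]
    # Assumption: stolen personal data brings in legal counsel and the
    # privacy commissioner regardless of the severity score.
    if personal_data_involved:
        for party in ("Customer legal counsel", "Privacy commissioner"):
            if party not in parties:
                parties.append(party)
    return parties
```

For example, `parties_to_notify(Severity.MEDIUM, personal_data_involved=True)` returns the incident manager, customer IT management, customer legal counsel and the privacy commissioner. Keeping the matrix as data rather than as nested conditions means it can be reviewed and agreed with the customer line by line.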
How to Detect and Respond to an Incident
MSPs often ask how they can detect cyber incidents when threat actors use sophisticated evasion techniques to bypass normal monitoring. In fact, this is what many MSPs see as the primary barrier to entry when it comes to providing incident response services. However, the choice of tools used to establish the SOC makes that job much easier.
A security information and event management (SIEM) system provides the fundamental functionality every SOC needs. SIEMs ingest security event logs from the computers, network devices and applications running within your customer’s enterprise and allow your SOC to detect unusual activity that may be a threat.
Traditionally, SIEM systems were data stores for events, with search capabilities and a correlation engine that allowed you to piece together disparate events and make sense of them. However, modern SIEM systems have evolved to become the nexus for all aspects of the incident management lifecycle, from detection to triage and notification, through to analysis and even supporting the forensics process and post-incident review.
Modern, next-generation SIEMs ingest all the security information your customer’s systems produce and automatically profile normal activity. Using artificial intelligence algorithms, they build patterns of normal behaviour over minutes, hours, days and even weeks, which can then be used to detect changes in activity that are indicative of an attack.
For example, if a sudden increase in access requests to a critical server happens over a weekend, when the trend shows this is a quiet period, it may be that an attacker has entered the network and is exfiltrating customer data or intellectual property.
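To make the idea of a baseline concrete, here is a deliberately simple Python sketch. It is not how any particular SIEM works: it swaps the AI-driven profiling described above for a basic z-score test, and the function name `is_anomalous`, the threshold of three standard deviations and the request counts are all made up for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it is more than `threshold` standard
    deviations above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to build a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase stands out.
        return observed > baseline
    return (observed - baseline) / spread > threshold


# Hypothetical counts of access requests to a critical server during
# the same Saturday hour over the previous six weeks.
saturday_history = [12, 9, 15, 11, 10, 13]

quiet_weekend = is_anomalous(saturday_history, 14)    # False
busy_weekend = is_anomalous(saturday_history, 480)    # True
```

Comparing like with like matters here: the history holds only the same quiet weekend period, so a count that would be normal on a Monday morning still stands out.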
This kind of alert triggers the incident response process. It provides the security analyst with a variety of information artefacts, such as the affected service, its business criticality, and the source and destination IP addresses of all the HTTP requests to an external web server. It may even provide access to the change management system, so you can see whether a maintenance window is active and some kind of patching or update is underway on that system.
Armed with information provided by the SIEM, the analyst can escalate the incident to the incident manager and the process then proceeds according to the playbook for that category of incident.
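The triage step described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the `Alert` and `MaintenanceWindow` structures, their field names and the decision rules in `triage` are assumptions chosen for illustration, and a real playbook would define its own criteria for each category of incident.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alert:
    service: str
    business_criticality: str  # assumed values: "high", "medium", "low"
    source_ip: str
    destination_ip: str
    raised_at: datetime


@dataclass
class MaintenanceWindow:
    service: str
    start: datetime
    end: datetime


def in_maintenance(alert, windows):
    """True if the alert was raised during a window for the same service."""
    return any(
        w.service == alert.service and w.start <= alert.raised_at <= w.end
        for w in windows
    )


def triage(alert, windows):
    """Return the analyst's next step for this alert (illustrative rules)."""
    if in_maintenance(alert, windows):
        # Unusual activity may be explained by patching or an update.
        return "review with change owner"
    if alert.business_criticality == "high":
        return "escalate to incident manager"
    return "continue analysis"


# Example data using documentation-only IP address ranges.
windows = [
    MaintenanceWindow("crm-db", datetime(2024, 6, 1, 22), datetime(2024, 6, 2, 2)),
]
alert = Alert("crm-db", "high", "192.0.2.10", "203.0.113.5", datetime(2024, 6, 8, 23))
next_step = triage(alert, windows)  # "escalate to incident manager"
```

The alert in the example falls a week after the maintenance window closed, so the change record does not explain it and the high business criticality sends it to the incident manager.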
Rebooting a computer system destroys temporary files, memory-resident programs and open connections. These can all be vital evidence of, and clues to, what went on during a cyber-attack.