How to manage major incidents and security breaches
Most incidents that we deal with in IT are fairly minor. They may be important to any user who happens to be affected, but they don’t usually pose an existential threat to the business. The service desks that I work with are generally very effective at dealing with these incidents. They can identify what needs to be done quite quickly, they communicate well with affected users, and, since the volume of these incidents is quite high, they are able to learn from experience. They recognise that incidents offer many opportunities to improve and are proactive in prioritising and managing improvements. The major constraint on continual improvement in dealing with routine incidents isn’t a lack of expertise or willingness to improve; it’s the availability of time and funding to analyse trends and make the improvements.
Major incidents and security breaches are different. They don’t occur often: many organizations have never had to deal with one, and many of those that do face a major issue are doing so for the first time. Learning from experience can turn out to be hugely expensive, or even result in the organization concerned going out of business.
So how can you make sure that you handle these incidents correctly first time?
The BBC recently published a fictional account of how not to deal with a security breach. The article, “Cyber-attack! Would your firm handle it better than this?”, is worth reading and sharing for the insight it offers into the importance of planning how to respond to a security breach. There’s lots of good advice at the end of that article, and I think that the same ideas can be applied to any major incident, not just to security incidents.
Planning for an incident
The most important things you need to do to prepare for major incidents, including security breaches, are:
- Identify risks. Think about what might go wrong. This could involve telling stories, identifying threats and possible scenarios, keeping up with news of major IT incidents that have affected other organizations, etc. Don’t just think about security risks. What else might go wrong? What would be the impact on your business?
- Act to avoid the risk. The best way to manage any major incident is to take action so it doesn’t happen in the first place, and once you’ve identified what might go wrong, you are ideally placed to think about how you can stop it from happening. This might involve creating a process to keep security patches up to date, providing training to help staff avoid mistakes, etc.
- Know when a risk has materialized. Most serious problems get worse over time unless we identify them and take remedial action. Take, for example, a security breach. Many breaches have been made worse because they weren’t detected for many months, resulting in huge numbers of records being exposed. If they had been detected quickly, their impact could have been massively reduced. You may need to install and configure suitable tools to help you identify breaches, and it’s equally important to train staff to report things that don’t look right. But don’t just think about breaches. What else could go wrong in a big way? How quickly could you detect it? Remember that the quicker you detect something going wrong, the sooner you can bring it under control.
- Plan your response. If you want to take the best possible steps after a major incident, you need to plan how you will respond before it happens; decisions made in the heat of the moment rarely work as well as those that have been thought through in advance. Think about each of the risks you have identified and devise a plan for responding to it. Your plan should include:
  - The immediate steps you will take to contain the issue
  - What evidence about the incident you will need to collect, and how you will secure this evidence
  - Which stakeholders you need to keep informed and how you will communicate with them
  - Recovery steps
  - Roles and responsibilities for decision making, technical actions, communication, etc.
- Rehearse. Hopefully, you will have put in place plans to identify and eliminate many risks before they cause an incident, so you’re not going to get many opportunities to try out your response plans and learn from experience. This means that you need to rehearse your response plans instead. Start with simple desktop rehearsals, where everyone involved sits round a table together and talks through what they would do. You can then move on to more sophisticated rehearsals, but be careful to ensure that the rehearsal doesn’t cause more disruption than the incident it is preparing you for. Use your rehearsals as an opportunity to learn and improve, as well as to educate everyone so that they know what they are supposed to do in an emergency.
- Update and improve your plans. However good your plans are, they need to be maintained and continually improved. Even if your IT solution is stable, the business environment changes, and so does the threat environment. Keep reviewing and revising your plans to ensure that they remain fit for purpose.
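As a rough illustration of the planning steps above, a response plan can be kept as a simple structured record so that gaps are easy to spot. The following is only a sketch: the `ResponsePlan` class, its field names, and the 365-day rehearsal interval are assumptions made for this example, not part of any standard or tool.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class ResponsePlan:
    """Hypothetical record of the response plan for one identified risk."""
    risk: str
    containment_steps: list = field(default_factory=list)
    evidence_to_collect: list = field(default_factory=list)
    stakeholders: list = field(default_factory=list)
    recovery_steps: list = field(default_factory=list)
    roles: dict = field(default_factory=dict)  # responsibility -> person or team
    last_rehearsed: Optional[date] = None

    def missing_sections(self):
        """Return the names of plan sections that are still empty."""
        sections = {
            "containment steps": self.containment_steps,
            "evidence to collect": self.evidence_to_collect,
            "stakeholders": self.stakeholders,
            "recovery steps": self.recovery_steps,
            "roles": self.roles,
        }
        return [name for name, content in sections.items() if not content]

    def rehearsal_overdue(self, today, max_age_days=365):
        """True if the plan was never rehearsed or the last rehearsal is too old.

        The 365-day default is an arbitrary assumption; choose your own interval.
        """
        if self.last_rehearsed is None:
            return True
        return today - self.last_rehearsed > timedelta(days=max_age_days)


# Example: a partly written plan for one risk
plan = ResponsePlan(
    risk="Customer database breach",
    containment_steps=["Isolate affected servers"],
    stakeholders=["Customers", "Regulator"],
    roles={"decision making": "Incident manager"},
)
print(plan.missing_sections())
print(plan.rehearsal_overdue(date(2024, 1, 1)))
```

Running a check like this across every identified risk gives a quick view of which plans are incomplete or have not been rehearsed recently. It does not replace the rehearsals themselves.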
Many organizations have IT service continuity plans that are designed to deal with major disruption to the business. Ideally these plans are integrated with overall business continuity plans to ensure that all relevant areas are involved.
Some organizations include management of major incidents in their IT service continuity planning, but others reserve continuity planning for only the most severe business disruptions and rely on IT staff to manage slightly less serious incidents. In either case it is important that you plan for all the different things that might happen, and that everyone knows what they are supposed to do when things go wrong.
Incident management isn’t just about restoring service for users when they call the service desk. The IT organization needs to be prepared to deal with all sorts of events, ranging from minor user incidents to major business disruption. If you don’t plan how to manage major events and security breaches, then the first one that you encounter could have catastrophic consequences for you, your business, and your customers.
Image credit: sv1ambo