As malicious threat actors learn from the recent outage to craft far more serious attacks to unleash on the world, take action NOW!
On 19 July 2024, the world suffered a chaotic weekend of disruptions in almost all major areas of banking, airport services, IT operations, hospital/healthcare activities and even broadcast schedules.
For a week thereafter, many affected organizations were still trying to undo the damage, especially those using BitLocker encryption on their systems. Many more incidents were probably never reported or publicized.
The crisis has been dubbed by many in the cybersecurity industry “the largest IT outage in history”, yet it is prudent to remember that it was not even caused by state-sponsored actors or cybercriminals. Now, imagine thousands of malicious threat actors furiously learning from the incident to engineer a much larger cyber crisis soon…
Take proactive measures NOW
Drawing on the mistakes and mitigations seen in the incident, organizations can minimize the impact that another major disruption like this could have on the world, according to Chris Drumgoole, Managing Director (Cloud & Infrastructure and Security Services), DXC Technology.
- Brush up on contingency planning: Prioritization is key, focusing on what is most critical for the business and repairing that first. Organizations should re-evaluate accepted practices for deploying software and granting update rights. The incident has underscored the need for robust testing, risk assessment and defined communication channels to prevent widespread disruptions and minimize the damage. This also means factoring your entire supply chain into contingency planning exercises, as third-party risk could affect many organizations during an outage or cyber threat.
- Around-the-clock commitment needed: IT outages do not necessarily occur on convenient, nine-to-five schedules. The global incident has reinforced the importance of maintaining a vigilant, 24/7 response capability to manage unforeseen emergencies. A commitment to continuous network monitoring, rapid incident response and resource management ensures timely restoration for affected customers.
- The human touch is essential: While technical solutions are imperative, particularly as the industry embraces an AI-led technology world, the human element still plays a pivotal role. The outage has highlighted how the IT industry is struggling to incorporate best practices for cloud-based IT infrastructure while keeping humans in the loop for testing the technology.
- Vendor relationships matter: Collaborating closely with service providers allows an organization to address the issue swiftly. Regular engagement with vendors outside of a crisis, understanding their update processes, and having direct lines of communication are also critical for effective incident response.
- Effective communication channels are key: Clear communication is essential during a crisis. From the fallout of the CrowdStrike incident, we have witnessed the importance of promptly informing customers about the situation, providing updates and managing expectations. Establishing reliable communication channels helps ensure transparency and minimizes confusion. Hearing from customers directly about their experience during the incident is especially useful for refining one’s response strategies for the future.
Further, other experts have stated that “relying on a single security solution is a recipe for disaster.”
Gartner has noted that “organizations should avoid vendor lock-in and instead adopt a best-of-breed approach to cybersecurity.”
While integrated EDR solutions can offer convenience and comprehensive coverage, they also introduce significant risks, as we have witnessed. Diversifying security measures and adopting a multi-layered approach can help mitigate these risks and reduce the blast radius of a single major point of failure.
Patch promptly, but be circumspect
Finally, just when the world was slowly acclimatizing to frequent warnings by cybersecurity experts to “take software updates seriously” and “patch your systems as soon as possible upon the release of updates,” look what happened! Here are some takeaways about prompt software patching currently circulating on the Internet:
- Diversification of Security Solutions: Relying on a single point of protection can be risky. Using a combination of security solutions and practices provides a more resilient defense against potential threats. The goal is to avoid vendor sprawl while maintaining a diversified cyber posture.
- Test patches thoroughly and in stages before deployment: This helps identify potential issues that could cause widespread disruptions cascading through supply chains worldwide.
- Assume controlled ‘demolition’ is better: Implementing updates in stages, starting with a small subset of users, can help detect and address problems before they affect the entire user base. This approach allows for quick rollback if issues arise.
- Tighten backup and disaster recovery protocols: Organizations should have robust backup and recovery plans in place. Regularly backing up critical data in immutable form ensures that systems can be restored quickly if an update causes problems.
- Train all staff for surviving large-scale outages: Educating users about the importance of updates and how to handle potential issues can mitigate risks more quickly and calmly. Users should know how to report problems and seek support if any update causes disruptions.
- Seek tighter, binding vendor liability communication: Clear and transparent communication from vendors about the nature of updates, potential risks, and steps to take in case of issues is essential. This helps build trust and ensures users are prepared.
- Beef up monitoring and incident response protocols: Continuous monitoring of systems, establishing strong observability, and having a well-practiced incident response plan can help detect and address patching issues quickly.
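To make the staged-deployment and rollback takeaways above more concrete, here is a minimal Python sketch of a rollout that updates a small subset of machines first, checks their health, and reverts everything if a stage fails. It is an illustration only, not how any particular vendor ships updates. The names `staged_rollout`, `RolloutResult`, `apply_update`, `is_healthy` and `rollback`, as well as the stage percentages and the failure threshold, are all hypothetical and would need to be mapped onto an organization's own deployment tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    completed: bool
    updated: List[str] = field(default_factory=list)      # hosts left on the new version
    rolled_back: List[str] = field(default_factory=list)  # hosts reverted after a failed stage


def staged_rollout(
    hosts: Sequence[str],
    stage_percents: Sequence[int],          # cumulative, e.g. (1, 10, 50, 100)
    apply_update: Callable[[str], None],    # hypothetical hook: install the update on one host
    is_healthy: Callable[[str], bool],      # hypothetical hook: post-update health check
    rollback: Callable[[str], None],        # hypothetical hook: revert one host
    max_failure_rate: float = 0.0,          # assumed threshold: any failure stops the rollout
) -> RolloutResult:
    """Update hosts in growing stages; revert every touched host if a stage fails."""
    result = RolloutResult(completed=False)
    total = len(hosts)
    done = 0
    for percent in stage_percents:
        if done >= total:
            break
        # Ceiling of total * percent / 100, always advancing by at least one host.
        target = min(total, max(done + 1, (total * percent + 99) // 100))
        batch = list(hosts[done:target])

        for host in batch:
            apply_update(host)

        failures = [host for host in batch if not is_healthy(host)]
        if len(failures) / len(batch) > max_failure_rate:
            # Stop here: revert this batch and all earlier batches.
            touched = result.updated + batch
            for host in touched:
                rollback(host)
            result.rolled_back = touched
            result.updated = []
            return result

        result.updated.extend(batch)
        done = target

    result.completed = done >= total
    return result
```

With 100 hosts and stages of 1%, 10%, 50% and 100%, a fault that only shows up on one machine in the second stage would halt the rollout after ten machines have been touched, leaving the other ninety untouched. The trade-off is a slower rollout, which is why the "patch promptly" advice still applies: the stages can be short, as long as each one includes a real health check.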